Unless I’m misunderstanding something, this is true for the Brier score, too: http://en.wikipedia.org/wiki/Scoring_rule#Proper_score_functions
You’re correct. The previous post implicitly assumed that the score for a wrong answer was 0. In that case, the only proper score function is the log.
If you have a score function f1(q) for the right answer and f0(q) for each wrong answer, and there are n possible choices, the true probabilities are a critical point of the expected score only if, for some constant k,
f0′(x) = (k - x·f1′(x))/(1-x)
If we set f1(x) = 1 - (1-x)^p, then (taking k = 0) we can set f0(x) = -(1-x)^p + (1-x)^(p-1) * p/(p-1)
For p = 2, we find f0(x) = -(1-x)^2 + 2(1-x) = 1 - x^2; this is the Brier score.
For p = 3, we find f0(x) = -(1-x)^3 + (1-x)^2 * 3/2 = x^3 - 3x^2/2 + 1/2, where the constant 1/2 can be dropped.
1-(1-x)^3 and x^3-3*x^2/2 shall be known as ArthurB’s score
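A quick numerical sanity check of this family might look like the sketch below. It is not part of the original argument: the helper names (`expected_score`, `random_report`, `honest_beats_random`), the example belief, and the random-sampling approach are all assumptions made for illustration. It only tests the honest report against randomly drawn reports, which is evidence of propriety rather than a proof.

```python
import random

def f1(x, p):
    # Score for the probability x placed on the right answer (p is the exponent).
    return 1 - (1 - x) ** p

def f0(x, p):
    # Score for the probability x placed on each wrong answer.
    return -(1 - x) ** p + (1 - x) ** (p - 1) * p / (p - 1)

def expected_score(belief, report, p):
    # Option i is right with probability belief[i]; it then earns f1(report[i]),
    # otherwise it earns f0(report[i]).
    return sum(b * f1(q, p) + (1 - b) * f0(q, p) for b, q in zip(belief, report))

def random_report(n, rng):
    # Hypothetical helper: a random probability distribution over n options.
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def honest_beats_random(belief, p, trials=2000, seed=0):
    # True if reporting the belief itself scores at least as well as every sampled report.
    rng = random.Random(seed)
    honest = expected_score(belief, belief, p)
    return all(
        honest >= expected_score(belief, random_report(len(belief), rng), p)
        for _ in range(trials)
    )

belief = [0.5, 0.3, 0.2]  # illustrative belief, not from the thread
for p in (2, 3):
    print(p, honest_beats_random(belief, p))
```

If the family is proper, as the derivation suggests, both lines should report that no sampled report beats the honest one.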
I’m not following your calculations exactly, so please correct me if I’m misunderstanding, but it seems that you are assuming that the student chooses an option and a confidence for that option? My understanding was that the student chooses a probability distribution over all options and is scored on that. As for how to extend the Brier score to more than two options, I’m not sure whether there’s a standard way to do that, but one could always limit oneself to true/false questions… (in the log case you simply score log q_i, where q_i is the probability the student put on the correct answer, of course)
No.
I am assuming the student has a distribution in mind and we want to design a scoring rule where the best strategy to maximize the expected score is to write in the distribution you have in mind.
If there are n options, the right answer is i*, and you give the student log(n p_i*) / log(n) points, then his incentive is to write in his exact distribution. On the other hand, if you give him, say, p_i* points, his incentive would be to write in “1” for the most likely answer and 0 for the others.
Another way to score is not to give points based only on p_i*, but also to take away points based on p_i for i != i*, by using a function f1 for p_i* and f0 otherwise. I gave a necessary condition on f1 and f0 for the student's belief to be a local maximum of the expected score. The technique is simply Lagrange multipliers.
The number of options drops out of the equation, which is beautiful, so you can extend to any number of answers or even a continuous question. (When asked what the population of Zimbabwe is, the student could describe any parametric distribution and be scored on that: histograms, Gaussians... there are many ways a student could write in his answer.)
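The difference in incentives can be illustrated with a small sketch. The function names, the example belief, and the "overconfident" report are assumptions chosen for illustration; the overconfident report stops just short of putting everything on one option, to avoid log(0).

```python
import math

def log_score(report, correct):
    # Normalized log score: 0 for a uniform report, 1 for certainty on the right answer.
    n = len(report)
    return math.log(n * report[correct]) / math.log(n)

def linear_score(report, correct):
    # Score equal to the probability placed on the right answer.
    return report[correct]

def expected(score, belief, report):
    # Expected score when option i is right with probability belief[i].
    return sum(b * score(report, i) for i, b in enumerate(belief))

belief = [0.6, 0.3, 0.1]            # illustrative belief
overconfident = [0.98, 0.01, 0.01]  # nearly "all in" on the most likely answer

# Linear score: exaggerating pays off.
print(expected(linear_score, belief, belief) < expected(linear_score, belief, overconfident))
# Log score: the honest report does better.
print(expected(log_score, belief, belief) > expected(log_score, belief, overconfident))
```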
Ok, so you’re saying the total score the student gets is
f1(q_i*) + Sum_(i /= i*) f0(q_i)?
I didn’t understand that from your original post, sorry.
So does “(if) the score for a wrong answer was 0 (...) the only proper score function is the log” mean that if there are more than two options, log is the only proper score function that depends only on the probability assigned to the correct outcome, not on the way the rest of the probability mass is distributed among the other options? Or am I still misunderstanding?
Yes, if there are more than two options and the score function depends only on the probability assigned to the correct outcome, then the only proper function is log. (With exactly two options, the probability on the wrong answer is determined by the probability on the right one, so rules such as Brier also qualify.) You can see that with the equation I gave
f0′(x) = (k - x·f1′(x))/(1-x)
For f0 = 0, it means
x·f1′(x) = k
thus f1(x) = k ln(x) + c (necessary condition).
Then you have to check that k ln(x) + c indeed works for some k and c; that is left as an exercise for the reader ^^
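For readers who would rather check numerically than do the exercise, here is a rough sketch under stated assumptions: three options, f0 = 0, a grid search over reports in steps of 1/100, and an illustrative belief of (0.6, 0.3, 0.1). The function `best_report` is hypothetical and returns the best report in units of 1/steps. It compares a logarithmic f1 (plain ln) with the quadratic f1(x) = 1 - (1-x)^2 used on its own.

```python
import math

def best_report(belief, f1, steps=100):
    # Grid-search the report (multiples of 1/steps, three options only)
    # that maximizes the expected score sum_i belief[i] * f1(report[i]).
    best, best_val = None, -math.inf
    for i in range(1, steps - 1):
        for j in range(1, steps - i):
            k = steps - i - j  # remaining mass, always at least 1
            report = (i / steps, j / steps, k / steps)
            val = sum(b * f1(x) for b, x in zip(belief, report))
            if val > best_val:
                best, best_val = (i, j, k), val
    return best

belief = (0.6, 0.3, 0.1)  # illustrative belief

# Logarithmic f1: the best grid report should coincide with the belief.
print(best_report(belief, math.log))
# Quadratic f1 with f0 = 0: the best report should move away from the belief.
print(best_report(belief, lambda x: 1 - (1 - x) ** 2))
```

The grid search is crude, but it is enough to show that the quadratic f1 on its own rewards shifting mass toward the likelier options, while the logarithm does not.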