I am assuming the student has a distribution in mind, and we want to design a scoring rule under which the best strategy for maximizing the expected score is to write in exactly that distribution.
If there are n options, the right answer is i*, and you give the student log(n p_i*) / log(n) points, then his incentive is to write in his exact distribution. On the other hand, if you give him, say, p_i* points, his incentive is to write in “1” for the most likely answer and 0 for all the others.
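To make the contrast concrete, here is a minimal numerical sketch. The belief vector, the reports, and the function names are made up for illustration; only the two scoring formulas come from the comment above.

```python
import math

def log_score(report, correct):
    """log(n * p_correct) / log(n): 0 for a uniform report, 1 for certainty in the right answer."""
    n = len(report)
    p = report[correct]
    if p == 0:
        return float("-inf")  # zero probability on the right answer is infinitely penalized
    return math.log(n * p) / math.log(n)

def linear_score(report, correct):
    """Score equal to the probability written in for the right answer."""
    return report[correct]

def expected_score(score, belief, report):
    """Expected score of `report`, averaged over the student's own belief."""
    return sum(b * score(report, i) for i, b in enumerate(belief) if b > 0)

# Hypothetical student belief over three options, and three candidate reports.
belief = [0.6, 0.3, 0.1]
honest = belief
overconfident = [0.8, 0.15, 0.05]
all_in = [1.0, 0.0, 0.0]

for name, rule in [("log", log_score), ("linear", linear_score)]:
    for label, report in [("honest", honest), ("overconfident", overconfident), ("all-in", all_in)]:
        print(name, label, expected_score(rule, belief, report))
```

For this particular belief, the honest report does best among these three under the log rule, while the all-in report does best under the linear rule.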
Another way to score is to give points not only on p_i* but also to take away points on each p_i with i != i*, using a function f1 for p_i* and a function f0 for the others. I gave a necessary condition on f1 and f0 for the student's belief to be a local maximum of the expected score. The technique is simply Lagrange multipliers.
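A sketch of how that Lagrange-multiplier argument can go, under my own notation: p is the student's belief, q the distribution he writes in, and k the multiplier for the constraint that q sums to 1, treated here as a constant.

```latex
% Expected score when the student believes p and writes in q:
\mathbb{E}[S(q)]
  = \sum_i p_i \Big[ f_1(q_i) + \sum_{j \neq i} f_0(q_j) \Big]
  = \sum_i \Big[ p_i\, f_1(q_i) + (1 - p_i)\, f_0(q_i) \Big]

% Stationarity of  E[S(q)] - k (\sum_i q_i - 1)  with respect to each q_i:
p_i\, f_1'(q_i) + (1 - p_i)\, f_0'(q_i) = k

% Requiring the stationary point to sit at q = p, for any value x = p_i:
x\, f_1'(x) + (1 - x)\, f_0'(x) = k
\quad\Longleftrightarrow\quad
f_0'(x) = \frac{k - x\, f_1'(x)}{1 - x}
```

This is only a first-order (necessary) condition; it does not by itself show that q = p is a maximum.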
The number of options drops out of the equation, which is beautiful, so you can extend this to any number of answers or even to a continuous question. (When asked what the population of Zimbabwe is, the student could describe any parametric distribution and be scored on that: histograms, Gaussians… there are many ways a student could write in his answer.)
Ok, so you’re saying the total score the student gets is f1(q_i*) + Sum_(i /= i*) f0(q_i)? I didn’t understand that from your original post, sorry.
So does “(if) the score for a wrong answer was 0 (...) the only proper score function is the log” mean that if there are more than two options, log is the only proper score function that depends only on the probability assigned to the correct outcome, and not on the way the rest of the probability mass is distributed among the other options? Or am I still misunderstanding?
Yes, if there are two or more options and the score function depends only on the probability assigned to the correct outcome, then the only proper function is log. You can see that with the equation I gave
f0'(x) = (k - x f1'(x)) / (1 - x)
For f0 = 0, it means
x f1'(x) = k
thus f1(x) = k ln(x) + c (necessary condition)
Then you have to check that k ln(x) + c indeed works for some k and c; that is left as an exercise for the reader ^^
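One way the exercise can be done, as a sketch: write the log score as a ln(x) + c, where a is just my name for the coefficient in front of the log, and compare the expected score of the honest report p with that of any other report q.

```latex
% Expected score of report q under belief p, with f_1(x) = a \ln x + c and f_0 = 0:
\mathbb{E}[S(q)] = a \sum_i p_i \ln q_i + c

% Gap between the honest report and any other report:
\mathbb{E}[S(p)] - \mathbb{E}[S(q)]
  = a \sum_i p_i \ln \frac{p_i}{q_i}
  = a \, D_{\mathrm{KL}}(p \,\|\, q)

% Gibbs' inequality: D_KL(p || q) >= 0, with equality only when q = p.
% So the honest report is the unique maximizer whenever a > 0, for any c.
```

So the log rule works for any positive coefficient and any additive constant; a negative coefficient would reward the opposite behaviour.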