For zero initial wealth and log-wealth utility, answering “honestly” is optimal even for the many-round generalization. I wrote a script and realized this through experiments, but it is obvious in retrospect.
Yep. Notice that if you have external wealth, the number of questions is relevant to deciding how much to overweight your best guess.
Yes, I knew about the properness of the logarithmic scoring rule. But I think my pretty little result is not covered by your Wikipedia link. Here it is, for completeness: If I play the multi-turn version of the game, with log utility and zero initial wealth, then my best strategy is honesty in each turn, even if they tell me the questions one by one. (Actually, they can reveal the questions to me in any order and on any schedule; the statement still holds.) I think this is a nontrivial generalization, nontrivial in the sense that the (completely trivial) proof crucially depends on the utility being logarithmic.
I think it results from the scale independence of log (and as soon as you add external wealth, the scale independence goes away). That lets you treat every question separately, since the previous questions determine only the scale of the current one, and the scale doesn’t affect the maximization. It is a pretty result, but since I don’t think people often talk about this sort of problem, I don’t know whether I would call it well known.
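To make the scale-independence point concrete, here is a minimal sketch in Python. It assumes a single binary question, a stake that is split between the two answers, and an optional amount of external wealth that is never at risk. The function names and the brute-force grid search are hypothetical, chosen only for illustration; this is not the script mentioned at the top of the thread.

```python
import math

def safe_log(x):
    # Treat log(0) as minus infinity so that all-in bets can still be compared.
    return math.log(x) if x > 0 else float("-inf")

def expected_log_wealth(f, p, stake, external):
    """Expected log of final wealth when a fraction f of the stake goes on
    the answer believed correct with probability p (assumed setup)."""
    win = external + stake * f            # wealth if that answer is right
    lose = external + stake * (1.0 - f)   # wealth if the other answer is right
    return p * safe_log(win) + (1.0 - p) * safe_log(lose)

def best_fraction(p, stake, external, steps=1000):
    # Brute-force search over a grid of fractions in [0, 1].
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda f: expected_log_wealth(f, p, stake, external))

print(best_fraction(0.7, stake=1.0, external=0.0))  # no external wealth
print(best_fraction(0.7, stake=1.0, external=0.5))  # some external wealth
```

With no external wealth the search should land on the honest fraction 0.7, whatever the size of the stake. With external wealth equal to half the stake it should land on 0.9, i.e. the best guess gets overweighted. That agrees with the first-order condition for this toy setup, f = p + (2p − 1)·w/S, clipped to [0, 1], where w is the external wealth and S is the stake.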
It’s also cool to work back from the last question and see how conditional probabilities connect the full knowledge case and the discovery case.
And here is the prettiest part: the whole thing seems to work even if we don’t assume that the answers to the quiz questions are independent random variables. If we count utility as the log of the fraction of the stake we keep, the best achievable expected utility is always minus the entropy of the joint distribution of the variables, and the best strategy is always honesty in every turn. Note the statement about entropy; it’s a new motif. Before generalizing to non-independent variables it did not add much, but in the current version it is quite a powerful statement.
For example, let the first question be a 50%-50% A-B, but let the second question depend on the correct answer to the first in the following way: if it was A, then the second question is a 50%-50% C-D, but if it was B, then the second question is a 100% C. At first it seemed to me that we could gain something here by not being honest in the first round, but actually we can’t. The entropy of the joint distribution is 1.5 bits, so the best achievable expected utility is −1.5 bits (in log₂ of the fraction of the stake kept), and the only way to achieve it is by betting 50%-50% in the first round.
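The numbers in this example can be checked with a short Python sketch. It assumes the second round is always bet honestly (on the conditional probabilities) and searches only over the first-round bet; utility is measured as log₂ of the fraction of the stake kept. The names `entropy_bits` and `expected_log2_kept` are hypothetical, for illustration only.

```python
import math

# Joint distribution of (first answer, second answer) from the example.
joint = {("A", "C"): 0.25, ("A", "D"): 0.25, ("B", "C"): 0.5}

def entropy_bits(dist):
    # Shannon entropy in bits of a dict mapping outcomes to probabilities.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_log2_kept(bet_on_a):
    """Expected log2 of the fraction of the stake kept when a fraction
    bet_on_a goes on A in round one and round two is bet honestly."""
    keep_after_a = bet_on_a * 0.5          # then a 50/50 C-D split
    keep_after_b = (1.0 - bet_on_a) * 1.0  # then everything on C
    return 0.5 * math.log2(keep_after_a) + 0.5 * math.log2(keep_after_b)

# Grid search over first-round bets, excluding the all-in endpoints.
grid = [i / 1000 for i in range(1, 1000)]
best_first_bet = max(grid, key=expected_log2_kept)

print(entropy_bits(joint))
print(best_first_bet, expected_log2_kept(best_first_bet))
```

The search should pick 0.5 for the first-round bet, with an expected log₂ fraction of −1.5, which is minus the joint entropy. Shading the first bet towards B, to protect the "sure thing" in round two, only lowers the expectation.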
I don’t see how this can be true in conjunction with Vaniver’s post. In particular, suppose you start with 0 wealth, and get a question right by being honest. Now you have e.g. £10,000 in winnings.
Then wouldn’t the problem of answering the second question be isomorphic to the problem of answering the first question when you don’t start with zero wealth, which as we’ve seen involves being dishonest?
I have two hypotheses explaining this. One is that I don’t understand something about the game show rules. Another is that you’re combining winnings improperly: winning £X then £Y should give you log(X+Y) utility, not log(X) + log(Y) (and if you mistakenly do the latter, then I think honesty is the best option in all cases).
You start with zero wealth. The quiz show host “gives” you a million pounds to play with, but you can only take home the money left after the last round. The intermediate levels of imaginary wealth do not affect the calculation.
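Written out, the reason the intermediate levels drop out is that each round multiplies the pot rather than adding to it. A sketch, with symbols introduced here only for illustration: W_0 is the stake handed over by the host, and f_i is the fraction of the current pot placed on the answer to question i that turns out to be correct.

```latex
% W_0: initial stake; f_i: fraction of the current pot placed on the
% correct answer to question i (assumed notation, not from the thread).
W_n = W_0 \prod_{i=1}^{n} f_i
\quad\Longrightarrow\quad
\log W_n = \log W_0 + \sum_{i=1}^{n} \log f_i ,
\qquad
\mathbb{E}[\log W_n] = \log W_0 + \sum_{i=1}^{n} \mathbb{E}[\log f_i] .
```

So the utility is still the log of the single take-home amount; the sum of logs comes from the product, not from adding pounds won in separate rounds. Each term in the sum can then be maximized on its own, and W_0 only shifts the total by a constant.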
Yes.
Oh, I see, I was completely misunderstanding the problem.