For zero initial wealth and log-wealth utility, answering “honestly” is optimal even for the many-round generalization. I wrote a script and realized this through experiments, but it is obvious in retrospect. A very nice fact anyway. I guess it can be turned into some parable about maximum log-likelihoods.
EDIT: Second paragraph about nonzero initial wealth retracted because I found a bug in my script. Zero initial wealth case unaffected.
EDIT 2: Wow, this is beautiful. Is this well-known? I just had two realizations. The first was that my original analysis only covered the case where I get to hear all the questions at the beginning of the game. The second was that, despite my sloppy original analysis, the statement is true even for the more realistic case where I only hear the next question after I have answered the previous one. It’s always worth being honest.
Yes, it is well known. Log utility is used not because it is particularly realistic, but because it makes this calculation easy.
Looking at some much more advanced related papers, I am now sure that it is well known. But I’d still love to see some reference, be it a paper or a textbook. Could you please help me with this?
Maybe Bernoulli’s 1738 paper on the St Petersburg paradox, where he suggested that utility should be the log of wealth.
Really? That’s cool. How about the slightly more general version that I stated down-thread? I hope at least that one would have been news to Bernoulli, since entropy hadn’t been invented yet.
Yep. Notice that if you have external wealth, the number of questions is relevant to deciding how much to overweight your best guess.
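To make the overweighting concrete, here is a minimal single-round sketch. The numbers (a 70% belief, a £1,000,000 stake, £500,000 of external wealth) and the names `expected_log_wealth` and `best_fraction` are illustrative assumptions, not anything from the thread, and the sketch does not attempt the multi-round case.

```python
import math

def expected_log_wealth(f, p, stake, external):
    """Expected log of total wealth after one binary question.

    f        -- fraction of the stake bet on the answer believed with probability p
    stake    -- money the host lends you for the round
    external -- wealth you keep whatever happens (hypothetical parameter)
    """
    win = external + stake * f
    lose = external + stake * (1.0 - f)
    # With zero external wealth, an all-in bet risks log(0).
    if win <= 0.0 or lose <= 0.0:
        return -math.inf
    return p * math.log(win) + (1.0 - p) * math.log(lose)

def best_fraction(p, stake, external, steps=10_000):
    """Grid-search the fraction f in [0, 1] that maximizes expected log wealth."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda f: expected_log_wealth(f, p, stake, external))

p = 0.7
honest = best_fraction(p, stake=1_000_000, external=0)
overweighted = best_fraction(p, stake=1_000_000, external=500_000)
print(honest, overweighted)
```

Under these assumed numbers the search lands at about 0.7 with no external wealth (honesty) and at about 0.9 with £500,000 of external wealth, so the best guess gets overweighted once the scale independence is gone.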
Yes.
Yes, I knew about the properness of the logarithmic scoring rule. But I think my pretty little result is not covered by your Wikipedia link. Here it is, for completeness: If I play the multi-turn version of the game, with log utility and zero initial wealth, then my best strategy is honesty in each turn, even if they tell me the questions one by one. (Actually, they can uncover the questions to me in any order and by any schedule, the statement still holds.) I think this is a nontrivial generalization. Nontrivial in the sense that the (completely trivial) proof crucially depends on the utility being logarithmic.
I think it results from the scale independence of log (and as soon as you add external wealth, the scale independence goes away). That lets you treat every question separately, since the previous questions only determine the scale of the current one, and the scale doesn’t affect the maximization. It is a pretty result, but since I don’t think people often talk about this sort of problem, I don’t know whether I would call it well known.
It’s also cool to work back from the last question and see how conditional probabilities connect the full knowledge case and the discovery case.
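The separation argument can be written out as a short derivation. The notation is an assumption introduced here for illustration: M is the stake lent by the host, x_i is the correct answer to question i, f_i is the fraction of the current stake bet on it, and P is the contestant's own belief.

```latex
% Assumed notation (not from the thread): M = stake, x_i = correct answer to
% question i, x_{<i} = earlier correct answers, f_i = fraction bet, P = belief.
\begin{align*}
W_n &= M \prod_{i=1}^{n} f_i(x_i \mid x_{<i}), \\
\mathbb{E}[\log W_n]
  &= \log M + \sum_{i=1}^{n} \mathbb{E}\bigl[\log f_i(x_i \mid x_{<i})\bigr] \\
  &\le \log M + \sum_{i=1}^{n} \mathbb{E}\bigl[\log P(x_i \mid x_{<i})\bigr].
\end{align*}
% The bound is Gibbs' inequality applied round by round, with equality
% exactly when f_i(\cdot \mid x_{<i}) = P(\cdot \mid x_{<i}), i.e. honesty.
% With external wealth w > 0 the utility is \log\bigl(w + M \prod_i f_i\bigr),
% which no longer splits into a sum, so the rounds stop being separable.
```

Because each round's term only involves the conditional probabilities given what has already been revealed, the same bound applies whether the questions are all known up front or revealed one at a time.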
And here is the prettiest part: The whole thing seems to work even if we don’t assume that the answers to the quiz questions are independent random variables. Under honest play, the expected utility is always the log of the starting stake minus the entropy of the joint distribution of the variables, and the best strategy is always honesty in every turn. Note the statement about entropy; it’s a new motif. Before generalizing to non-independent variables it did not add much value, but in the current version it is quite a powerful statement.
For example, let the first question be a 50%-50% A-B, but let the second question depend on the first correct answer in the following way: if it was A, then the second question is a 50%-50% C-D, but if it was B, then the second question is a 100% C. At first it seemed to me that we could gain something here by not being honest in the first round, but actually we can’t. The entropy of the joint distribution is 1.5 bits, and the only way to reach the corresponding maximum expected utility (−1.5 bits relative to the log of the starting stake) is by betting 50%-50% in the first round.
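The example can be checked numerically. This is a small sketch, assuming utility is measured as log2 of the fraction of the stake that survives and that the second-round bets are honest; the names `joint`, `entropy_bits` and `expected_log2_kept` are hypothetical.

```python
import math

# Joint distribution of (first answer, second answer) from the example.
joint = {("A", "C"): 0.25, ("A", "D"): 0.25, ("B", "C"): 0.5}

def entropy_bits(dist):
    """Shannon entropy of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_log2_kept(f_a, c_given_a=0.5, c_given_b=1.0):
    """Expected log2 of the fraction of the stake kept after both rounds.

    f_a       -- fraction bet on A in round one
    c_given_a -- fraction bet on C in round two if A was correct
    c_given_b -- fraction bet on C in round two if B was correct
    The defaults for round two are the honest bets (an assumption of this sketch).
    """
    total = 0.0
    for (first, second), prob in joint.items():
        f1 = f_a if first == "A" else 1.0 - f_a
        c = c_given_a if first == "A" else c_given_b
        f2 = c if second == "C" else 1.0 - c
        kept = f1 * f2
        if kept <= 0.0:
            return -math.inf  # losing everything on a possible outcome
        total += prob * math.log2(kept)
    return total

# Grid-search the first-round bet, excluding the all-in endpoints.
grid = [i / 1000 for i in range(1, 1000)]
best_f = max(grid, key=expected_log2_kept)
best_value = expected_log2_kept(best_f)
print(best_f, best_value, entropy_bits(joint))
```

The search picks a first-round bet of 0.5 with an expected value of −1.5 bits, which is the negative of the 1.5-bit joint entropy; skewing the first bet toward B, where the second question is free, only lowers the expectation.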
I don’t see how this can be true in conjunction with Vaniver’s post. In particular, suppose you start with 0 wealth, and get a question right by being honest. Now you have e.g. £10,000 in winnings.
Then wouldn’t the problem of answering the second question be isomorphic to the problem of answering the first question when you don’t start with zero wealth, which as we’ve seen involves being dishonest?
I have two hypotheses explaining this. One is that I don’t understand something about the game show rules. Another is that you’re combining winnings improperly: winning £X then £Y should give you log(X+Y) utility, not log(X) + log(Y) (and if you mistakenly do the latter, then I think honesty is the best option in all cases).
You start with zero wealth. The quiz show host “gives” you a million pounds to play with, but you can only take home the money left after the last round. The intermediate levels of imaginary wealth do not affect the calculation.
Oh, I see, I was completely misunderstanding the problem.