In the original problem, suppose my prior is 1:1 but I pick 2d6 because I think that they’re more aesthetically pleasing. If I see the first number off the sheet, it will convince me to switch from 2d6 to 1d12 if the number is 3 or less or 11 or more. I go from winning half of the time to winning 62.5% of the time; that’s worth £125, like you suggest.
Also, note that the VoI of the second number off the sheet depends on the first! If I saw a 1 first, I don’t need any more numbers. If I saw a 4 or a 10, then a second number is still worth £125, because I’m in the same position as I was before. If I saw a 7 the first time around, then a second number is worth only about £46, because it raises my expected confidence from .667 to .713.
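For anyone who wants to check the one-step arithmetic, here is a minimal Python sketch. It assumes a £1000 prize for a correct guess (inferred from 12.5 percentage points being worth £125, not stated in the thread), and the names `PRIZE` and `one_step_voi` are mine.

```python
from fractions import Fraction

PRIZE = 1000  # assumed prize in pounds for a correct guess

# Probability of each value 1..12 under each hypothesis.
P_D12 = {n: Fraction(1, 12) for n in range(1, 13)}
P_2D6 = {n: Fraction(max(0, 6 - abs(n - 7)), 36) for n in range(1, 13)}


def one_step_voi(prior_d12):
    """Expected gain in pounds from seeing one number before guessing."""
    prior_d12 = Fraction(prior_d12)
    win_now = max(prior_d12, 1 - prior_d12)
    # For each possible number, guess the hypothesis with more joint mass.
    win_after = sum(
        max(prior_d12 * P_D12[n], (1 - prior_d12) * P_2D6[n])
        for n in range(1, 13)
    )
    return PRIZE * (win_after - win_now)
```

Calling `one_step_voi(Fraction(1, 2))` prices the first number at even odds; passing the posterior you hold after a first number prices the second one.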
Vaniver, looks like you were thinking about the problem in the same way that I was, getting repeated chances to buy new numbers. So at some point, you might have bought enough information to move your expected confidence into a place where the calculation that gave you £125 now gives you £0.
What do you do then? The conclusion ‘I literally won’t lift a finger to know more numbers’ doesn’t seem right unless you’re certain of the answer already.
Vaniver, looks like you were thinking about the problem in the same way that I was, getting repeated chances to buy new numbers.
Sort of. The calculations that I ran are all one-step-ahead calculations, starting with different priors. Consider three different cases:
You pay X now, and he reads you the first number, and then you guess.
You pay Y now, then he reads you the first number, then you have the option to buy a second number, and then you guess.
You pay Z now, then he reads you the first two numbers, and then you guess.
Pricing X is easy; it’s £125. Pricing Z is a bit tougher, but still okay. Pricing Y involves coming up with 13 different prices: one for each of the twelve possibilities after the first roll, and then Y itself (which depends on each of those possibilities!). Doing that with arbitrary n is doable but tough! (It’s somewhat easier if you have a set price for each successive number, so you can swiftly terminate trees once you’ve hit the point that it’s no longer worth the price.)
And so, even at 9:1 odds, there is some number of numbers he can read off that will have positive VoI. It will be very low, because it’s very unlikely you will get that many informative numbers, but it is true that if you aren’t perfectly certain, a test that gives you perfect certainty will have positive VoI.
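Case Z (and its generalisation to k numbers read up front) can be brute-forced. This is a rough Python sketch under the same assumed £1000 prize; `batch_voi` is a name I made up, and it simply enumerates all 12^k sequences, so keep k small.

```python
from fractions import Fraction
from itertools import product

PRIZE = 1000  # assumed prize in pounds for a correct guess
FACES = range(1, 13)
P_D12 = {n: Fraction(1, 12) for n in FACES}
P_2D6 = {n: Fraction(max(0, 6 - abs(n - 7)), 36) for n in FACES}


def batch_voi(prior_d12, k):
    """Expected gain in pounds from hearing k numbers, then guessing once."""
    prior_d12 = Fraction(prior_d12)
    win_now = max(prior_d12, 1 - prior_d12)
    win_after = Fraction(0)
    for seq in product(FACES, repeat=k):
        # Joint probability of (hypothesis, sequence) for each hypothesis.
        joint_d12 = prior_d12
        joint_2d6 = 1 - prior_d12
        for n in seq:
            joint_d12 *= P_D12[n]
            joint_2d6 *= P_2D6[n]
        # Guess whichever hypothesis is more probable given the sequence.
        win_after += max(joint_d12, joint_2d6)
    return PRIZE * (win_after - win_now)
```

If I have the arithmetic right, at 9:1 in favour of 1d12 the value stays at exactly zero for one, two, or three numbers, since a single number favours 2d6 by at most 2:1 and 2³ = 8 is still short of 9. It turns positive at four numbers.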
What do you do then? The conclusion ‘I literally won’t lift a finger to know more numbers’ doesn’t seem right unless you’re certain of the answer already.
The thing to focus on here is both the amount of additional certainty and the effect of additional certainty. The number you get when you’re at 9:1 tells you a lot less than the number you get when you’re at 1:1. Imagine the next number being a 1: at 9:1, it feels like you just got £100, but at 1:1 it feels like you just got £500. Similarly, when I’m at 1:1, telling me one additional number is expected to change my guess in some cases. When I’m at 9:1, regardless of what he tells me, I still make the same call.
There is such a thing as certain enough when there are tests that aren’t informative enough.
(Interestingly, note that you can never reach perfect certainty that it’s 2d6, and there will always be a positive VoI for another number because there will always be a positive chance that it’s a 1.)
Great post by the way. Thank you. It sounds like your job is to think about this sort of thing!
I think I now believe that the answer to the original question can’t be £125, unless you already know what happens next.
Suppose the question is something like: “Every time you give me a penny, I’ll give you the next number. At any time you can stop and make your one guess.” It seems to me that there has to be a computer program that is best at playing this game. Do you have any idea what its stopping criterion would be? Or what the price would have to be for it to refuse to take any numbers at all?
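One way to get a feel for the "best program" is backward induction over the posterior. The Python sketch below is only a truncated version of the game: it assumes a £1000 prize, a fixed price per number, and a cap `hints_left` on how many numbers can still be bought. All of those names and the cap are my assumptions, not part of the question as posed.

```python
from fractions import Fraction
from functools import lru_cache

PRIZE = 1000  # assumed prize in pounds for a correct guess
P_D12 = {n: Fraction(1, 12) for n in range(1, 13)}
P_2D6 = {n: Fraction(max(0, 6 - abs(n - 7)), 36) for n in range(1, 13)}


@lru_cache(maxsize=None)
def game_value(post_d12, hints_left, price):
    """Expected payoff of optimal play from posterior P(1d12) = post_d12."""
    stop = PRIZE * max(post_d12, 1 - post_d12)  # guess right now
    if hints_left == 0:
        return stop
    cont = -price  # pay for one more number, then play on optimally
    for n in range(1, 13):
        p_n = post_d12 * P_D12[n] + (1 - post_d12) * P_2D6[n]
        if p_n == 0:
            continue
        new_post = post_d12 * P_D12[n] / p_n  # Bayes update on seeing n
        cont += p_n * game_value(new_post, hints_left - 1, price)
    return max(stop, cont)
```

In this truncated game the stopping rule is just "stop whenever `stop >= cont`". For example, `game_value(Fraction(1, 2), 1, Fraction(125))` is the break-even case where a single number costs exactly its one-step value. Pass the posterior and price as `Fraction` values so the cache keys stay exact.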
It strikes me that this is actually a very dodgy problem indeed, and that if someone asks you these sorts of questions you should be very careful.
On the other hand it also strikes me that even in the absence of information about future offers, you should be prepared to pay something for the first number. You do, after all, expect to be £125 better off as a result of knowing it!
I have a queasy feeling of paradox and I notice that I am confused.
I put some time into solving this problem, and have reached a point where the amount of algebra necessary to continue is beyond what I’m willing to do. (The problem is that the transition probabilities are piecewise functions of the odds, and that makes everything unfun.) I have thought of an analogous problem that’s mathematically simpler (basically, it’ll be the unfair coin, and the reward will be based on guessing the degree of unfairness, not which of two it is), and I’ll write up a longer explanation of how to solve it sometime over the weekend.
I’ll look forward to it. Don’t put time into this unless you’re enjoying it. I haven’t seen Oswald in ages, and my current commitment is a mental note to either think about the biased coin version or write some computer simulations next time I’m bored.
So, not quite an explanation, more of an exercise:
Oswald brings his laptop to a bar, loads up Matlab, and types:
p=rand(); c=0;
p is now a double between 0 and 1, which we can treat as continuously and uniformly distributed across that range. c is the number of times you’ve gotten a hint.
Now, Oswald types in another line:
c=c+1; [c, rand()<p]
This will both increase the number of hints you’ve received and give you a 1 if a new, uniformly selected random number is smaller than p, or a 0 otherwise. (Basically, this is flipping a biased coin which gives ‘heads’ with probability p and tails with probability 1-p.) You can repeat this line as many times as you like.
Now, this bar is called The Improper Prior, and as such is filled with Bayesians. It’s readily obvious to the patrons that their posterior on p should be a beta distribution, with α equal to one plus the number of 1s and β equal to one plus the number of 0s.
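For readers without Matlab, the same setup can be sketched in Python. The function name `play_hints` and the `seed` argument are my own additions; the posterior bookkeeping is the uniform-prior beta update described above.

```python
import random


def play_hints(n_hints, seed=None):
    """Draw a hidden bias p, take n_hints hints, and track the posterior."""
    rng = random.Random(seed)
    p = rng.random()      # hidden bias, uniform on [0, 1)
    c = 0                 # number of hints received so far
    alpha, beta = 1, 1    # Beta(1, 1) is the uniform prior on p
    for _ in range(n_hints):
        c += 1
        if rng.random() < p:   # a 1 ('heads') with probability p
            alpha += 1
        else:                  # a 0 ('tails') with probability 1 - p
            beta += 1
    return p, c, alpha, beta
```

After `p, c, alpha, beta = play_hints(20)`, the posterior mean `alpha / (alpha + beta)` is the natural first estimate of `p`.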
But now is when things get interesting: your chance of guessing p exactly is basically zero. So Oswald might instead reward you for guessing within .05 of the actual p. More hints should be penalized, either by decreasing the acceptable range or by decreasing the reward for guessing correctly. Alternatively, Oswald might reward you based on the precision of your posterior, or some other function.
Unfortunately, the beta distribution’s cdf is not pleasant to play with. Matlab can deal with it easily; just type:
We could determine the chance that your guess is within .05 of the correct value by typing:
Unfortunately (again!), this isn’t maximized by centering your estimate at the mean, unless a=b. You can test this with a=3, b=2; we have:
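The a=3, b=2 check can also be done without Matlab. This Python sketch relies on the fact that for whole-number α and β (which is all this game produces) the beta cdf equals a binomial tail sum. The names `beta_cdf` and `window_prob` are mine, and this is not the Matlab code from the comment.

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integers a and b."""
    x = min(max(x, 0.0), 1.0)  # clamp so windows near the edges behave
    n = a + b - 1
    # Identity: P(Beta(a, b) <= x) = P(Binomial(n, x) >= a).
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def window_prob(x, a, b, half_width=0.05):
    """Posterior probability that p lies within half_width of the guess x."""
    return beta_cdf(x + half_width, a, b) - beta_cdf(x - half_width, a, b)
```

Comparing `window_prob(0.6, 3, 2)` (the posterior mean) with `window_prob(0.65, 3, 2)` shows the shifted guess capturing more probability.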
And so if Oswald uses this reward system, we’ll have to solve an optimization problem to determine what our guess is at each stage, which isn’t going to be fun. (The dumb way to do it throws
into some nonlinear optimization algorithm which shifts around x until it finds a local maximum, starting with a/(a+b) as the guess. What’s the smart way to do it?)
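A sketch of the dumb way, in Python rather than Matlab: a plain grid search over candidate guesses instead of a real nonlinear optimizer. It assumes whole-number α and β, and the helper names (`beta_cdf`, `window_prob`, `best_guess`) and the grid size are my choices.

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integers a and b."""
    x = min(max(x, 0.0), 1.0)
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def window_prob(x, a, b, half_width=0.05):
    """Posterior probability that p lies within half_width of the guess x."""
    return beta_cdf(x + half_width, a, b) - beta_cdf(x - half_width, a, b)


def best_guess(a, b, half_width=0.05, steps=10_000):
    """Grid-search the guess x that maximises window_prob."""
    # Guesses closer than half_width to 0 or 1 waste part of the window.
    lo, hi = half_width, 1 - half_width
    candidates = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(candidates, key=lambda x: window_prob(x, a, b, half_width))
```

For a=3, b=2 the result lands between the mean (0.6) and 0.7, which matches the point that the mean is not the best guess unless a=b.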
Oswald might also be reluctant to reward us based on precision, because that can grow enormously high as α and β increase. So instead let’s suppose he offers a flat reward, minus some constant times the variance minus some constant times the number of hints we bought, and he wants to know how to price entry into the game, so he can set the expected profit where he wants it to be.
Now we’re in an interesting situation, because the variance can increase or decrease based on what we’ve seen. If you get two heads in a row, the posterior is Beta(3,1) and the variance is .0375; a tails will increase it to .04, and a third heads will decrease it to .027. On average, you expect the variance after you see another flip to be .03. On average, the variance should always decrease after we get another hint. We also know that the amount each hint is expected to lower our variance will be a decreasing function of α and β for large enough values. (Really? Why would you believe those two statements?)
We can now easily calculate the actual variance and the expected variance after another hint for any (α,β) pair. If the costs are fixed we can determine when it wouldn’t be worthwhile to buy one more. If α and β are large enough, that’ll be enough for us to stop because we know future hints will be less valuable than the current hint and the current hint is a bad idea.
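To experiment with these quantities, here is a small Python sketch using the standard beta variance formula αβ/((α+β)²(α+β+1)). The function names are mine, and the next hint is taken to be heads with probability equal to the posterior mean α/(α+β).

```python
from fractions import Fraction


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    n = a + b
    return Fraction(a * b, n * n * (n + 1))


def expected_variance_after_hint(a, b):
    """Expected posterior variance after one more hint, from state (a, b)."""
    p_heads = Fraction(a, a + b)  # predictive probability of a 1
    return (p_heads * beta_variance(a + 1, b)
            + (1 - p_heads) * beta_variance(a, b + 1))
```

Looping over a range of (α,β) pairs and comparing `expected_variance_after_hint(a, b)` with `beta_variance(a, b)` is a quick way to probe the two statements in the parenthetical.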
We can then propagate backwards from the terminal states to determine the total value of playing the game optimally. We also can be certain this game valuation procedure will terminate in reasonable time for reasonable choices of the penalty parameters. (Again, why?)
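A rough Python sketch of that backward propagation follows. The names `reward`, `k_var`, and `k_hint` are hypothetical stand-ins for the flat reward and the two penalty constants. The horizon rests on my own bound, not anything Oswald said: the expected one-hint drop in variance is Var/(α+β+1), which is at most 1/(4(α+β+1)²), so once that times `k_var` falls below `k_hint`, no later hint pays for itself.

```python
from fractions import Fraction
from functools import lru_cache


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    n = a + b
    return Fraction(a * b, n * n * (n + 1))


def game_value(reward, k_var, k_hint):
    """Expected payoff of optimal play, starting from the uniform prior."""
    if k_hint <= 0:
        raise ValueError("k_hint must be positive for the horizon to exist")
    # Smallest n = a + b beyond which a hint can never repay its cost
    # (assumed bound: expected variance drop <= 1 / (4 * (n + 1)**2)).
    n_max = 2
    while Fraction(k_var) / (4 * (n_max + 1) ** 2) > k_hint:
        n_max += 1

    @lru_cache(maxsize=None)
    def v(a, b):
        stop = -k_var * beta_variance(a, b)  # variance penalty if we stop now
        if a + b >= n_max:
            return stop
        p_heads = Fraction(a, a + b)
        cont = -k_hint + p_heads * v(a + 1, b) + (1 - p_heads) * v(a, b + 1)
        return max(stop, cont)

    return reward + v(1, 1)
```

The return value is what a player expects to walk away with, so it is also the break-even entry price. For instance, `game_value(10, 12, Fraction(1, 100))` prices a game with cheap hints and a stiff variance penalty.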
With a shameless reference to my own post: if your prior is 9:1 1d12:2d6, then one number is worthless because it cannot change your decision, as faul_sname’s comment details.