Imagine that you and I are sitting at a table. Hidden in my lap, I have a jar of beans. We are going to play the traditional game wherein you try to guess the number of beans in the jar. However, you don’t get to see the jar. The rule is that I remove beans from the jar and place them on the table for as long as I like, and then at an arbitrary point ask you how many beans there are in total. That’s all you get to see.
One by one, I remove a dozen beans. As I place the twelfth bean on the table in front of you, I ask: “So how many beans are there total, including those left in the jar?”
“I have no idea,” you reasonably reply.
“Alright, well let’s try to narrow it down,” I say helpfully. “What is the greatest number of beans I could possibly have in total?”
You reason thusly: “Well, given the Copernican principle, this twelfth bean is equally likely to fall anywhere in the full sequence of beans. Thus, for example, all else held equal, there is a 50% chance that it falls within the last 50% of beans removed from the jar – or the first 50%, for that matter.
“But, obviously, it further follows that there is a 70% chance it will be in the final 70% of beans, a 95% chance it will be within the last 95% of beans you might remove, and so on. In the worst case consistent with that 95% – the 11 previous beans being exactly the first 5% of the total – there would be 11 × 20 = 220 beans in all. Thus, I can be 95% confident that there are no more than 220 beans. Of course, by this reasoning the upper bound grows without limit as I demand more confidence (say I wanted to be 99% confident?), but 95% confidence is good enough for me! So I’ll take 220 as my upper bound…”
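For concreteness, here is the rule you have just applied, sketched in a few lines of Python (nothing beyond the reasoning’s own arithmetic):

```python
def copernican_upper_bound(previous_beans, confidence):
    # With probability `confidence`, the current bean lies in the last
    # `confidence` fraction of the whole sequence, so the beans before it
    # make up at most a (1 - confidence) share of the total.
    return previous_beans / (1 - confidence)

print(copernican_upper_bound(11, 0.95))   # 220.0 -- the bound above
print(copernican_upper_bound(11, 0.99))   # 1100.0
print(copernican_upper_bound(11, 0.999))  # 11000.0 -- grows without limit
```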
You are wrong. I have over 1,000 beans left in the jar.
Or: you are (technically) right. There is only one bean left in the jar.
Or: any other possibility.
Whichever it is, your reasoning seems completely disconnected from the actual number of beans left in the jar. Given the evidence you’ve actually seen, intuitively the total could just as well be any number (12 or greater).
Where did you go wrong?
The proper Bayesian response to evidence is to pick a particular hypothesis – say, “there are fewer than 220 beans,” which is the hypothesis you just pegged at 95% confidence – and then see whether the given evidence (“he stopped at the 12th bean”) updates you towards or away from it.[1]
It seems clear that this kind of update is not what you did in reasoning about the beans. Rather, you picked a hypothesis that was merely compatible with the evidence – “there are fewer than 220 beans” – and computed an odd quantity: the fraction of possible worlds wherein the evidence could possibly appear[2], out of the possible worlds where the hypothesis is true (i.e. worlds with at least 12 beans, out of worlds with fewer than 220). You then conflated this with the actual posterior probability of the hypothesis.
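Footnote [1] below tries to set up that proper update. Here is a minimal sketch of what it looks like, assuming a uniform prior capped at an arbitrary N_MAX (exactly the implicit upper bound the footnote worries about) and pretending that the bean I asked on was chosen uniformly at random – which the actual “I stop whenever I like” rule does not license:

```python
N_MAX = 10_000   # assumed prior cap on the total -- the arbitrary bound footnote [1] worries about
STOP = 12        # the evidence: I asked at the 12th bean

# With a uniform prior over totals 1..N_MAX, the posterior is proportional to
# the likelihood alone. Take the likelihood of "asked at bean 12 out of N" to
# be 1/N, i.e. pretend the asking-bean was uniformly random; under the actual
# stopping rule the true likelihood is simply unknown.
unnormalized = {n: 1.0 / n for n in range(STOP, N_MAX + 1)}
z = sum(unnormalized.values())

p_under_220 = sum(p for n, p in unnormalized.items() if n < 220) / z
print(f"P(total < 220 | asked at bean 12) = {p_under_220:.2f}")  # ~0.44
```

Nowhere near 95% – and the posterior shrinks further as N_MAX grows, which is exactly the prior-dependence footnote [1] gestures at.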
It seems to me that the Doomsday Argument is exactly analogous to this situation, except that it’s a sentient 12th bean itself (i.e. a human somewhere in the timeline) that happens to be making the guess.
I am not at all confident that I haven’t failed to address some obvious feature of the original argument. Please rebut.
[1] I’ve just tried to do this, but I’m rubbish at math, especially when it includes tricky (to me) things like ranges and summations. (Doesn’t the result depend entirely on your prior probability that there are (0, 220] beans, which would depend on your implicit upper bound for beans to begin with, if you assume there can’t be infinite beans in my jar?)
[2] Could appear, not does appear – remember, I could have stopped on any bean. This chunk of possible worlds includes, e.g., the world where I went all the way to bean 219.
His reasoning would be entirely correct if you had determined the number of beans you draw by picking uniformly at random between 1 and the total. His priors were all wrong, and so he failed.
Could we take all possible distributions, assign each some prior weight (probably wrong), and then treat those distributions as theories for which the number of beans drawn serves as evidence?
Good point; you’re right that his reasoning would be correct if he knew that, e.g., I had used a random number generator to pick a number between 1 and (total # of beans), and had resolved to ask him for his upper bound only on that numbered bean.
Perhaps to make the bean-game more similar to the original problem, I ought to ask for a guess on the total number after every bean placed, since every bean represents an observer who could be fretting about the Doomsday Argument.
Analogously, it would be misleading to imagine that You the Observer were placed in the human timeline at a single randomly-chosen point by, say, Omega, since every bean (or human) is in fact an observer.
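If I did ask at every bean, here is what would happen (a quick sketch, with an arbitrary jar size): the 20× rule turns out right for about 95% of the bean-observers in any given jar – the later ones – and wrong for the early ones, whatever the total is.

```python
# Ask-at-every-bean variant: in one jar of N beans, what fraction of the
# per-bean "observers" end up with a correct 20x upper bound?
N = 1000  # arbitrary jar size, purely for illustration
correct = sum(1 for stop in range(1, N + 1) if N <= 20 * (stop - 1))
print(correct / N)  # 0.95 -- the last 95% of observers are right, the first 5% wrong
```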
Unfortunately I’m getting muddled and am not clear what consequences this has. Thoughts?
His reasoning would be entirely correct if you had determined the number of beans you draw by picking uniformly at random between 1 and the total.
Let’s put this a bit more technically: the reasoning would have been correct if the number of beans were a random value drawn from a known (and sufficiently well-behaved) distribution.
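A quick Monte Carlo sketch of that version of the game (distribution chosen arbitrarily, purely for illustration): when the total comes from a known distribution and the asking-bean really is uniform on 1..total, the guesser’s 95% bound is actually calibrated.

```python
import random

TRIALS = 100_000
hits = 0
for _ in range(TRIALS):
    total = random.randint(1, 10_000)     # total drawn from a known distribution
    stop = random.randint(1, total)       # asking-bean genuinely uniform at random
    bound = 20 * (stop - 1)               # the 20x rule (11 * 20 = 220 at bean 12)
    hits += total <= bound
print(hits / TRIALS)  # ~0.95: in this game, 95% confidence means what it says
```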