You can, but then you’d need to justify why “choose a bag, then choose from within the bag” is analogous to the correct way to do anthropic reasoning, despite failing in an obvious way in non-anthropic scenarios (e.g. you don’t have an equal probability of choosing any human on earth by following the procedure “pick a random country, then pick a random person within that country”, so at least to me it would be surprising if that was the correct procedure across universes).
despite failing in an obvious way in non-anthropic scenarios
On the contrary. This is exactly the way we reason about probability every time; the idea of “mixing bags” suddenly comes up only in anthropics. That can, I suppose, be somewhat justified by the fact that this case is isomorphic to picking from a bag with a possibility of not finding anything, but the point is that it’s the mixing that requires extra justification, not the other way around.
When we evaluate alternative theories based on our available evidence, we do not combine their possible outcomes into one set and assume an equiprobable distribution over the elementary events for no reason. We assign an equiprobable prior to the theories and then evaluate their likelihood conditional on the available evidence via Bayes’ theorem.
you don’t have an equal probability of choosing any human on earth by following the procedure “pick a random country, then pick a random person within that country”
We are not supposed to have equal probability. I really don’t see where you are going with this example.
We assign an equiprobable prior to the theories and then evaluate their likelihood conditional on the available evidence via Bayes’ theorem.
I’m not entirely sure what you mean by this. Can you give a concrete, non-anthropic example?
Alternatively, in your “one bag with numbers 1-10 and one bag with numbers 1-10000” example, would you change your answer if there were instead 100,000 identical small bags with numbers 1-10 and 100 identical large bags with numbers 1-10000? If that would change your answer, then we don’t have to do any mixing at all: the anthropic argument as I understand it says that observing a 6 gives you information about the ratio of small bags to large bags, rather than information about your bag.
Well, the paper picking is exactly this kind of uncontroversial non-anthropic example. I specifically started from it to make the subject less confusing, so it’s a bit ironic.
You have two alternative hypotheses about the bag: either it contains 10,000 pieces of paper or just 10. You don’t have any additional information about which of the hypotheses is more likely, so you assign equal prior probabilities to the two. Then, after you receive the paper with 6 written on it, you update your credence according to Bayes’ theorem. Now the theory that there are only 10 pieces of paper in the bag seems much more likely in light of this new evidence.
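To make the update concrete, here is a minimal sketch of that calculation. The function name `posterior_small_bag` and its parameters are made up for illustration, and it assumes the paper is drawn uniformly at random from whichever bag is actually on the table.

```python
from fractions import Fraction

def posterior_small_bag(observed, prior_small=Fraction(1, 2),
                        small_size=10, large_size=10_000):
    """Posterior probability of the small-bag hypothesis after drawing `observed`."""
    # Likelihood of the observed number under each hypothesis
    # (assumption: a uniform draw within the bag).
    like_small = Fraction(1, small_size) if 1 <= observed <= small_size else Fraction(0)
    like_large = Fraction(1, large_size) if 1 <= observed <= large_size else Fraction(0)
    prior_large = 1 - prior_small
    # Bayes' theorem: P(small | obs) = P(obs | small) P(small) / P(obs)
    evidence = like_small * prior_small + like_large * prior_large
    return like_small * prior_small / evidence

print(posterior_small_bag(6))     # 1000/1001
print(posterior_small_bag(7194))  # 0
```

With equal priors, seeing a 6 moves the small-bag hypothesis from 1/2 to 1000/1001, while a number above 10 rules it out entirely.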
would you change your answer if there were instead 100,000 identical small bags with numbers 1-10 and 100 identical large bags with numbers 1-10000?
It would mean that the prior probabilities of the two hypotheses are not equal but 1000:1 instead. Naturally this would affect the resulting credence, but not the logic of the update.
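In odds form the same update is a single multiplication. A small sketch, assuming the bag is picked uniformly from the 100,000 small and 100 large bags and the paper uniformly from within it (the variable names are illustrative only):

```python
from fractions import Fraction

# Prior odds, small : large, from the bag counts.
prior_odds = Fraction(100_000, 100)                       # 1000 : 1
# Likelihood ratio for seeing a 6: P(6 | small) / P(6 | large).
likelihood_ratio = Fraction(1, 10) / Fraction(1, 10_000)  # 1000
# Posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * likelihood_ratio
posterior_small = posterior_odds / (1 + posterior_odds)

print(posterior_odds)   # 1000000
print(posterior_small)  # 1000000/1000001
```

Only the prior changed compared with the 1:1 case; the likelihood ratio contributed by the observation is the same.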
Now let’s say we put all of the small bags in one sack, and all of the large bags in a different sack. Now we have a sack containing 100,000 small bags with numbers 1-10, and a sack containing 100 large bags with numbers 1-10,000. Does this now change your prior when choosing a random piece of paper?
What if, instead of physically putting the small and large bags inside of sacks, we just put them in the categories “small” and “large” in our head?
Basically what I’m saying here is that “we choose at random” or “we have a uniform prior” is underspecified. Which is why I’m asking for a concrete example where the “choose at random” procedure is more clearly defined, and a justification for why we should expect that specific “choose at random” procedure to map onto the question of whether we should expect to find ourselves in a world where we are probably not among the very first or very last observers asking that kind of anthropic question.
As a note, this should all add up to normality. If everyone who evaluates the DA guesses “I am not in the chronological first 1% of people in my world to ask this question, and I am also not in the chronological last 1% of people in my world to ask this question”, we should expect 98% of those people to be correct in their guess.
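The 98% figure can be checked by simple counting. This is only a sketch: `fraction_correct` is a hypothetical helper, and it assumes every observer in a world of `n_observers` makes the same guess.

```python
def fraction_correct(n_observers, tail=0.01):
    """Share of observers for whom 'I am in neither the first nor the last
    `tail` fraction' turns out to be true."""
    cutoff = round(n_observers * tail)  # size of each excluded tail
    # Ranks run 1..n; the guess is right for ranks strictly between the tails.
    correct = sum(1 for rank in range(1, n_observers + 1)
                  if cutoff < rank <= n_observers - cutoff)
    return correct / n_observers

print(fraction_correct(10_000))  # 0.98
```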
Edited to add: One more intuition pump for the original scenario with the small bag of 1-10 and the large bag of 1-10,000: Instead of saying “one piece of paper is drawn and handed to you, and it happens to be a 6” you say “every piece of paper is drawn and handed to a person—each paper is handed to exactly one person, and every person has exactly one paper—and you are one of those people. None of the people can communicate with each other in any way. What probability do you assign that the paper came from the small bag? Now you observe that the number on your piece of paper is a 6. What probability do you assign that the paper came from the small bag?”. I think that intuition pump maps better to the anthropic arguments I’ve seen.
Now let’s say we put all of the small bags in one sack, and all of the large bags in a different sack. Now we have a sack containing 100,000 small bags with numbers 1-10, and a sack containing 100 large bags with numbers 1-10,000. Does this now change your prior when choosing a random piece of paper?
Here we are back to the 1:1 prior between theories. It doesn’t matter how many equal bags you put together. 1⁄10 = 100000/1000000.
Basically what I’m saying here is that “we choose at random” or “we have a uniform prior” is underspecified.
Unless we explicitly specify it, as I did in the previous comment, where I said that the uniform prior is over two alternative hypotheses that we have no evidence for or against.
I feel that there is some confusion about the fundamentals of probability theory going on, and that’s why we are talking past each other. Let’s take a step back. Forget about anthropics. Where does the idea of a uniform prior even come from? What is its justification?
Suppose you have a bag about which you know that it contains either 9 red marbles and 1 blue marble or 5 blue marbles and 5 red marbles. You have no idea which is more likely. You blindly pick one marble from the bag and it’s blue. How should you reason about this scenario?
Should you assume that the two “possible bags” are mixed together and there is a uniform prior over the marbles from the united bag? Or should you assume that there is a uniform prior over the two alternative theories? Or should you assume that there are some M and N such that M bags with mostly red marbles and N bags with equal numbers of red and blue marbles are mixed together? Or should you assume that there is a uniform prior over the M+N bags? What would have to be different about the setup of the experiment, or your knowledge of it, for the answer to be different?
Instead of saying “one piece of paper is drawn and handed to you, and it happens to be a 6” you say “every piece of paper is drawn and handed to a person—each paper is handed to exactly one person, and every person has exactly one paper—and you are one of those people. None of the people can communicate with each other in any way. What probability do you assign that the paper came from the small bag? Now you observe that the number on your piece of paper is a 6. What probability do you assign that the paper came from the small bag?”
Oh, you believe there is a difference between these two scenarios, don’t you? I suppose I’ll have to make a post about it as well. For now consider this problem:
There are 100 students and 100 tickets. Every student blindly picks a ticket, which is then removed from the pool of available tickets, so every student gets a unique ticket. The students can decide among themselves the order in which they pick. You’ve learned only half of the tickets. What is your optimal strategy to maximize your chances of passing the exam? Should you go first? Should you go last? Should you wait till there are only 50 tickets left? What are your probabilities of passing the exam under each of these strategies?
For clarity, what I think in the case of “scenario starts with one small bag containing 1-10 and one large bag containing 1-10,000, each piece of paper goes to exactly one person, each person gets exactly one piece of paper, you are one of the people” is that, prior to seeing what number is on your piece of paper, you should think there’s a 10,000/(10,000+10) ≈ 0.999 chance that the paper you received came from the large bag, i.e. not a uniform prior of “50/50 small bag or large bag”.
Once you observe that your piece of paper has a 6 on it, you now know you either are the person who got the single 6 from the small bag or the person who got the single 6 from the large bag, so your posterior is 1/(1+1) = 0.5 that the paper you received came from the large bag. Had you instead gotten a piece of paper that had e.g. 7194 on it, you would now be sure that your paper did not come from the small bag.
This is why I am skeptical of your choice to use the randomization procedure of “first choose the bag, then choose within the bag”—that choice of randomization procedure only works the first time you run it, but if you try to distribute the papers one per person that randomization procedure breaks down.
As a note, you could rescue your 50⁄50 prior on small bag vs large bag by saying that the small bag has 1,000 pieces of paper with a 1 on them, 1,000 pieces of paper with a 2 on them, [...], 1,000 pieces of paper with a 10 on them. But in that case, if you observe that your paper has a 6 on it, you know that there are 1,000 people who got a 6 from the small bag and 1 person who got a 6 from the large bag, and so your posterior is that there’s a 1/1001 ≈ 0.001 chance that your paper came from the large bag.
In both cases, observing that your paper had 6 written on it causes you to increase your expectation that the paper you received came from the small bag.
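Both variants of this “every paper goes to one person” scenario come down to counting paper holders. A rough sketch, where `share_from_large` is a made-up helper and the lists simply enumerate the papers in each bag:

```python
from fractions import Fraction

def share_from_large(small_bag, large_bag, observed=None):
    """Among all paper holders (or only those holding `observed`),
    the share whose paper came from the large bag."""
    if observed is None:
        n_small, n_large = len(small_bag), len(large_bag)
    else:
        n_small, n_large = small_bag.count(observed), large_bag.count(observed)
    return Fraction(n_large, n_small + n_large)

large = list(range(1, 10_001))   # one paper for each of 1..10,000
small = list(range(1, 11))       # one paper for each of 1..10
small_padded = small * 1000      # 1,000 copies of each of 1..10

print(share_from_large(small, large))            # 1000/1001
print(share_from_large(small, large, 6))         # 1/2
print(share_from_large(small, large, 7194))      # 1
print(share_from_large(small_padded, large))     # 1/2
print(share_from_large(small_padded, large, 6))  # 1/1001
```

In both setups the share attributed to the large bag drops once you condition on holding a 6.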
Forget about anthropics. Where does the idea of a uniform prior even come from? What is its justification?
You need to plug some number in for the prior if you want to get a posterior out, and usually you have enough evidence that the number you put in for the prior doesn’t really matter. “A uniform prior” seems like a pretty good one in the complete absence of any other information (note, though, that “there are bags containing different numbers of pieces of paper” is nonzero information). There’s some theoretical justification in the form of e.g. Laplace’s Rule of Succession but honestly the main justification I use is “probabilities are facts about my model of the world, not facts about the world itself, and so the mechanics of my model dictate the probability, and the prior is no exception”.
There are 100 students and 100 tickets. Every student blindly picks a ticket, which is then removed from the pool of available tickets, so every student gets a unique ticket. The students can decide among themselves the order in which they pick. You’ve learned only half of the tickets. What is your optimal strategy to maximize your chances of passing the exam? Should you go first? Should you go last? Should you wait till there are only 50 tickets left? What are your probabilities of passing the exam under each of these strategies?
Absent any further information, there is no benefit of going earlier or later—you have a 50⁄100 chance of passing the class regardless of your strategy, because none of the choices given give you any information about the distribution of remaining tickets.
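The symmetry claim can be verified exactly on a scaled-down version by brute force. This sketch assumes 6 tickets of which you know 3 (instead of 100 and 50, which would be far too many orderings to enumerate) and that you pass exactly when you draw a ticket you know. `pass_probability_by_position` is a hypothetical name.

```python
from fractions import Fraction
from itertools import permutations

def pass_probability_by_position(n_tickets=6, n_known=3):
    """Exact chance of drawing a known ticket at each position in the queue,
    averaged over every equally likely draw order."""
    known = set(range(n_known))
    hits = [0] * n_tickets
    total = 0
    for order in permutations(range(n_tickets)):
        total += 1
        for position, ticket in enumerate(order):
            if ticket in known:
                hits[position] += 1
    return [Fraction(h, total) for h in hits]

print(pass_probability_by_position())  # every position comes out as 1/2
```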
Counter-scenario for you:
There are 100 students and 100 tickets. The tickets are distributed between two identical bowls which are shuffled around prior to each student’s turn. The first bowl contains the numbers 1-10, and the second bowl contains the numbers 11-100. Any student who picks a ticket numbered 51 or higher will pass the class. Each student will pick a bowl. If that bowl contains any tickets, they will blindly pick a ticket from within that bowl. If the bowl they choose contains no tickets, they will blindly pick a ticket from the other bowl.
What is your optimal strategy to maximize your chances of passing the exam? Should you go first? Should you go last? Should you wait till there are only 50 tickets left? What are your probabilities of passing the exam under each of these strategies?
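For anyone who wants to play with the two-bowl version, here is a rough Monte Carlo sketch. It assumes each student picks a bowl with probability 1/2 and draws uniformly within it; `simulate_bowls` and its parameters are illustrative names, not part of the scenario. Under those assumptions the first student passes with probability 1/2 × 50/90 = 5/18, since only the second bowl contains passing tickets.

```python
import random

def simulate_bowls(trials=10_000, seed=0):
    """Estimate the pass rate for each position in the queue (0 = first)."""
    rng = random.Random(seed)
    pass_counts = [0] * 100
    for _ in range(trials):
        bowl_a = list(range(1, 11))     # tickets 1-10, none of them pass
        bowl_b = list(range(11, 101))   # tickets 11-100, 50 of them pass
        rng.shuffle(bowl_a)
        rng.shuffle(bowl_b)
        for position in range(100):
            # Bowls are indistinguishable, so the choice is a coin flip.
            first, second = (bowl_a, bowl_b) if rng.random() < 0.5 else (bowl_b, bowl_a)
            bowl = first if first else second  # fall back if the chosen bowl is empty
            ticket = bowl.pop()  # popping from a shuffled list is a blind draw
            if ticket >= 51:
                pass_counts[position] += 1
    return [count / trials for count in pass_counts]

rates = simulate_bowls()
print(f"first: {rates[0]:.3f}, last: {rates[-1]:.3f}")
```

Because exactly 50 of the 100 students pass in every run, the per-position rates must average 0.5, so whatever the early positions lose the later ones gain.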
For clarity, what I think in the case of “scenario starts with one small bag containing 1-10 and one large bag containing 1-10,000, each piece of paper goes to exactly one person, each person gets exactly one piece of paper, you are one of the people” is that, prior to seeing what number is on your piece of paper, you should think there’s a 10,000/(10,000+10) ≈ 0.999 chance that the paper you received came from the large bag, i.e. not a uniform prior of “50/50 small bag or large bag”.
Do you mean a situation where every piece of paper from each of the two bags is handed out, so that 10,010 people got a piece of paper? If so, this is very much not what I’ve been talking about. We are dealing with an either/or case, where only one bag is used to give out papers, but you have no idea which one. This should be obvious in the context of the Doomsday Argument, because humanity doesn’t simultaneously have both a long and a short history.
Anyway, as I said, let’s forget about anthropics for now and deal with the marble picking example from the previous comment.
the main justification I use is “probabilities are facts about my model of the world, not facts about the world itself, and so the mechanics of my model dictate the probability, and the prior is no exception”.
Oh sure, probabilities are about the model. But we want our models of the world to correspond to the way the world actually is, so that our models are useful. You want a model that systematically produces correct answers in reality, not one that is merely self-consistent.
“A uniform prior” seems like a pretty good one in the complete absence of any other information
It sure does. I encourage you to think about why. Anyway, let’s, for now, simply accept that if we do not have any information about some alternatives, we assume an equiprobable prior. Now consider these scenarios. I think they highlight the crux of the disagreement very well:
1. There is a bag on your table. You know that it’s either bag 1 or bag 2 and you know nothing else. What should be your credence that it’s bag 1?
Here we have uniform prior between bags. P(Bag 1) = P(Bag 2) = 1⁄2
2. A marble was picked. You know that there are 20 possible marbles that could be picked: 14 red and 6 blue. What is your credence that the marble is blue?
Here we have uniform prior between marbles: P(Blue) = 3⁄10
3. There is a bag on your table. You know that it’s either bag 1, which contains 9 red and 1 blue marbles, or bag 2, which contains 5 red and 5 blue marbles. You see a person blindly pick a marble from the bag. What’s the probability that the marble the person picked is blue?
Are we supposed to be using a uniform prior over bags or a uniform prior over marbles here? This is an important question. In this case both produce the same answer:
Over bags:
P(Bag 1) = P(Bag 2) = 1⁄2
P(Blue) = P(Blue|Bag 1)P(Bag 1) + P(Blue| Bag 2)P(Bag 2) = 1⁄20 + 1⁄4 = 6⁄20 = 3⁄10
Over marbles:
P(Blue) = 3⁄10
If the arrangement of the marbles were different, say bag 1 contained 10 red and 2 blue while bag 2 contained 4 red and 4 blue, the situation changes:
Over bags:
P(Blue) = P(Blue|Bag 1)P(Bag 1) + P(Blue|Bag 2)P(Bag 2) = 1⁄12 + 1⁄4 = 4⁄12 = 1⁄3
Over marbles:
P(Blue) = 3⁄10
So how do we decide what to do? What is the principled position here? Notice that this case has absolutely nothing to do with anthropics and is basic probability theory.
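For reference, both candidate priors can be computed directly from the marble counts stated above. A minimal sketch, with made-up function names and each bag written as a `(red, blue)` pair:

```python
from fractions import Fraction

def p_blue_over_bags(bags):
    """Uniform prior over bags, then a uniform draw within the chosen bag."""
    prior = Fraction(1, len(bags))
    return sum(prior * Fraction(blue, red + blue) for red, blue in bags)

def p_blue_over_marbles(bags):
    """Uniform prior over every marble, as if all the bags were pooled."""
    total = sum(red + blue for red, blue in bags)
    return Fraction(sum(blue for _, blue in bags), total)

first_arrangement = [(9, 1), (5, 5)]
second_arrangement = [(10, 2), (4, 4)]

print(p_blue_over_bags(first_arrangement), p_blue_over_marbles(first_arrangement))    # 3/10 3/10
print(p_blue_over_bags(second_arrangement), p_blue_over_marbles(second_arrangement))  # 1/3 3/10
```

The two priors agree whenever the bags hold the same number of marbles and come apart as soon as the bag sizes differ.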