Instead of saying “one piece of paper is drawn and handed to you, and it happens to be a 6” you say “every piece of paper is drawn and handed to a person—each paper is handed to exactly one person, and every person has exactly one paper—and you are one of those people. None of the people can communicate with each other in any way. What probability do you assign that the paper came from the small bag? Now you observe that the number on your piece of paper is a 6. What probability do you assign that the paper came from the small bag?”
Oh, you believe there is a difference between these two scenarios, don’t you? I suppose I’ll have to make a post about it as well. For now consider this problem:
For clarity, what I think in the case of “scenario starts with one small bag containing 1-10 and one large bag containing 1-10,000, each piece of paper goes to exactly one person, each person gets exactly one piece of paper, you are one of the people” is that, prior to seeing what number is on your piece of paper, you should think there’s a 10,000⁄(10,000+10) ≈ 0.999 chance that the paper you received came from the large bag, i.e. not a uniform prior of “50/50 small bag or large bag”.
Once you observe that your piece of paper has a 6 on it, you now know you are either the person who got the single 6 from the small bag or the person who got the single 6 from the large bag, so your posterior is 1⁄(1+1) = 0.5 that the paper you received came from the large bag. Had you instead gotten a piece of paper that had e.g. 7194 on it, you would now be sure that your paper did not come from the small bag.
This is why I am skeptical of your choice to use the randomization procedure of “first choose the bag, then choose within the bag”—that choice of randomization procedure only works the first time you run it, but if you try to distribute the papers one per person that randomization procedure breaks down.
As a note, you could rescue your 50⁄50 prior on small bag vs large bag by saying that the small bag has 1,000 pieces of paper with a 1 on them, 1,000 pieces of paper with a 2 on them, [...], 1,000 pieces of paper with a 10 on them. But in that case, if you observe that your paper has a 6 on it, you know that there are 1,000 people who got a 6 from the small bag and 1 person who got a 6 from the large bag, and so your posterior is that there’s a 1⁄1001 ≈ 0.001 chance that your paper came from the large bag.
In both cases, observing that your paper had 6 written on it causes you to increase your expectation that the paper you received came from the small bag.
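The counting argument above can be checked with a minimal sketch. It assumes every paper is handed to exactly one person and that you are equally likely to be any of those people; the function name and the `small_copies` parameter are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def prior_and_posterior_large(small_copies, observed=6,
                              small_max=10, large_max=10_000):
    """Return (prior, posterior) that your paper came from the large bag.

    Assumption: the small bag holds `small_copies` papers for each number
    1..small_max, the large bag holds one paper for each number 1..large_max,
    and you are a uniformly random paper holder.
    """
    small_total = small_copies * small_max
    large_total = large_max
    prior_large = Fraction(large_total, small_total + large_total)

    # Count the people who hold a paper showing the observed number.
    small_holders = small_copies if 1 <= observed <= small_max else 0
    large_holders = 1 if 1 <= observed <= large_max else 0
    posterior_large = Fraction(large_holders, small_holders + large_holders)
    return prior_large, posterior_large


# One copy of each number in the small bag.
print(prior_and_posterior_large(1))
# 1,000 copies of each number in the small bag (the 50/50 rescue).
print(prior_and_posterior_large(1000))
# A number that only the large bag contains.
print(prior_and_posterior_large(1, observed=7194))
```

In both variants the probability of "large bag" drops after seeing a 6: from 1000⁄1001 to 1⁄2 in the first, and from 1⁄2 to 1⁄1001 in the second.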
Forget about anthropics. Where does the idea of a uniform prior even come from? What is its justification?
You need to plug some number in for the prior if you want to get a posterior out, and usually you have enough evidence that the number you put in for the prior doesn’t really matter. “A uniform prior” seems like a pretty good one in the complete absence of any other information (note, though, that “there are bags containing different numbers of pieces of paper” is nonzero information). There’s some theoretical justification in the form of e.g. Laplace’s Rule of Succession but honestly the main justification I use is “probabilities are facts about my model of the world, not facts about the world itself, and so the mechanics of my model dictate the probability, and the prior is no exception”.
There are 100 students and 100 tickets. Every student blindly picks a ticket, which is then removed from the pool of available tickets, so every student gets a unique ticket. Students can decide among themselves the order in which they pick tickets. You’ve learned only half of the tickets. What is your optimal strategy to maximize your chances of passing the exam? Should you go first? Should you go last? Should you wait until there are only 50 tickets left? What are your probabilities of passing the exam when you follow these strategies?
Absent any further information, there is no benefit to going earlier or later: you have a 50⁄100 chance of passing the class regardless of your strategy, because none of the available choices gives you any information about the distribution of the remaining tickets.
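As a rough check of the 50⁄100 claim, here is a small exact calculation. It assumes you learn nothing about which tickets earlier students drew; the function name is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def p_known_ticket(students_before, known, unknown):
    """Exact P(you draw a ticket you studied) after `students_before` other
    students have each blindly removed one ticket from the pool.

    Assumption: you do not observe which tickets the earlier students drew.
    """
    total = known + unknown
    if students_before == 0:
        return Fraction(known, total)
    p = Fraction(0)
    if known:  # the next student ahead of you removes a ticket you studied
        p += Fraction(known, total) * p_known_ticket(
            students_before - 1, known - 1, unknown)
    if unknown:  # the next student ahead of you removes a ticket you did not study
        p += Fraction(unknown, total) * p_known_ticket(
            students_before - 1, known, unknown - 1)
    return p


# Going first, going 51st (50 tickets left), and going last.
for position in (0, 50, 99):
    print(position, p_known_ticket(position, 50, 50))
```

Every position gives exactly 1⁄2 under these assumptions, which is the symmetry the answer above relies on.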
Counter-scenario for you:
There are 100 students and 100 tickets. The tickets are distributed between two identical bowls which are shuffled around prior to each student’s turn. The first bowl contains the numbers 1-10, and the second bowl contains the numbers 11-100. Any student who picks a ticket numbered 51 or higher will pass the class. Each student will pick a bowl. If that bowl contains any tickets, they will blindly pick a ticket from within that bowl. If the bowl they choose contains no tickets, they will blindly pick a ticket from the other bowl.
What is your optimal strategy to maximize your chances of passing the exam? Should you go first? Should you go last? Should you wait until there are only 50 tickets left? What are your probabilities of passing the exam when you follow these strategies?
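One way to explore the two-bowl scenario is an exact calculation of the pass probability for each turn. This is only a sketch under stated assumptions: each student picks either bowl with probability 1⁄2 (the bowls are indistinguishable), nobody observes earlier draws, and the function and parameter names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def pass_probability_by_turn(small=10, large_fail=40, large_pass=50):
    """Return a list whose entry k is P(the student at turn k + 1 passes).

    State (a, f, p): a tickets left in the first bowl (numbers 1-10, all
    failing), f failing and p passing tickets left in the second bowl.
    """
    dist = {(small, large_fail, large_pass): Fraction(1)}
    result = []
    for _ in range(small + large_fail + large_pass):
        nxt = defaultdict(Fraction)
        p_pass = Fraction(0)
        for (a, f, p), w in dist.items():
            large = f + p
            if a and large:      # both bowls non-empty: 50/50 bowl choice
                w_small = w_large = w / 2
            elif a:              # second bowl empty: forced into the first
                w_small, w_large = w, Fraction(0)
            else:                # first bowl empty: forced into the second
                w_small, w_large = Fraction(0), w
            if w_small:
                nxt[(a - 1, f, p)] += w_small
            if w_large and f:
                nxt[(a, f - 1, p)] += w_large * Fraction(f, large)
            if w_large and p:
                share = w_large * Fraction(p, large)
                nxt[(a, f, p - 1)] += share
                p_pass += share
        result.append(p_pass)
        dist = nxt
    return result


probs = pass_probability_by_turn()
print("first turn:", probs[0], "last turn:", float(probs[-1]))
```

Under these assumptions the first student passes with probability 5⁄18 ≈ 0.28. Because exactly 50 students pass in total, the per-turn probabilities sum to 50, so turn order does carry information here, unlike in the single-pool version.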
For clarity, what I think in the case of “scenario starts with one small bag containing 1-10 and one large bag containing 1-10,000, each piece of paper goes to exactly one person, each person gets exactly one piece of paper, you are one of the people” is that, prior to seeing what number is on your piece of paper, you should think there’s a 10,000⁄(10,000+10) ≈ 0.999 chance that the paper you received came from the large bag, i.e. not a uniform prior of “50/50 small bag or large bag”.
Do you mean a situation where every piece of paper from each of the two bags is handed out, so that 10,010 people got a piece of paper? If so, this is very much not what I’ve been talking about. We are dealing with an either/or case, where only one bag is used to give out papers, but you have no idea which one. This should be obvious in the context of the doomsday argument, because humanity doesn’t simultaneously have both a long and a short history.
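For contrast, the either/or reading can be sketched as a plain Bayes update. The assumptions here are mine and are not stated in the thread: a 50⁄50 prior over which bag is in use, and your paper being equally likely to be any paper in that bag. The function name is hypothetical.

```python
from fractions import Fraction


def posterior_small_either_or(observed=6, small_max=10, large_max=10_000,
                              prior_small=Fraction(1, 2)):
    """P(small bag is the one in use | your paper shows `observed`).

    Assumptions: only one bag is used, `prior_small` is your prior that it is
    the small one, and your paper is uniform over the numbers in that bag.
    """
    like_small = Fraction(1, small_max) if 1 <= observed <= small_max else Fraction(0)
    like_large = Fraction(1, large_max) if 1 <= observed <= large_max else Fraction(0)
    joint_small = prior_small * like_small
    joint_large = (1 - prior_small) * like_large
    return joint_small / (joint_small + joint_large)


print(posterior_small_either_or(6))     # 1000/1001
print(posterior_small_either_or(7194))  # 0
```

Under these assumptions a 6 moves the credence in the small bag from 1⁄2 to 1000⁄1001, which differs from the 1⁄2 posterior obtained when both bags are handed out in full.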
Anyway, as I said, let’s forget about anthropics for now and deal with the marble picking example from the previous comment.
the main justification I use is “probabilities are facts about my model of the world, not facts about the world itself, and so the mechanics of my model dictate the probability, and the prior is no exception”.
Oh sure, probabilities are about the model. But we want our models of the world to correspond to the way the world actually is, so that our models are useful. You want a model that systematically produces correct answers in reality, not one that is merely self-consistent.
“A uniform prior” seems like a pretty good one in the complete absence of any other information
It sure does. I encourage you to think about why. Anyway, let’s, for now, simply accept that if we do not have any information about some alternatives, we assume an equiprobable prior. Now consider these scenarios. I think they highlight the crux of the disagreement very well:
1. There is a bag on your table. You know that it’s either bag 1 or bag 2 and you know nothing else. What should be your credence that it’s bag 1?
Here we have a uniform prior between bags: P(Bag 1) = P(Bag 2) = 1⁄2
2. A marble was picked. You know that there are 20 possible marbles that could be picked: 14 red and 6 blue. What is your credence that the marble is blue?
Here we have a uniform prior between marbles: P(Blue) = 6⁄20 = 3⁄10
3. There is a bag on your table. You know that it’s either bag 1, which contains 9 red marbles and 1 blue marble, or bag 2, which contains 5 red and 5 blue marbles. You see a person blindly pick a marble from the bag. What is the probability that the marble the person picked is blue?
Are we supposed to use a uniform prior over bags or a uniform prior over marbles here? This is an important question, even though in this case both produce the same answer:
Over bags:
P(Bag 1) = P(Bag 2) = 1⁄2
P(Blue) = P(Blue|Bag 1)P(Bag 1) + P(Blue| Bag 2)P(Bag 2) = 1⁄20 + 1⁄4 = 6⁄20 = 3⁄10
Over marbles:
P(Blue) = 3⁄10
If the arrangement of the marbles were different, say, bag 1 contained 10 red and 2 blue while bag 2 contained 4 red and 4 blue, the situation changes:
Over bags:
P(Blue) = P(Blue|Bag 1)P(Bag 1) + P(Blue|Bag 2)P(Bag 2) = 1⁄12 + 1⁄4 = 4⁄12 = 1⁄3
Over marbles:
P(Blue) = 3⁄10
So how do we decide what to do? What is the principled position here? Notice that this case has absolutely nothing to do with anthropics and is basic probability theory.
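The two candidate priors can be compared directly with a short sketch. The function names are hypothetical, and each bag is written as a `(red, blue)` pair.

```python
from fractions import Fraction


def p_blue_over_bags(bags):
    """Uniform prior over bags, then a uniformly random marble from that bag."""
    return sum(Fraction(1, len(bags)) * Fraction(blue, red + blue)
               for red, blue in bags)


def p_blue_over_marbles(bags):
    """Uniform prior over all marbles, ignoring which bag each one sits in."""
    total = sum(red + blue for red, blue in bags)
    return Fraction(sum(blue for _, blue in bags), total)


equal_bags = [(9, 1), (5, 5)]     # both bags hold 10 marbles
unequal_bags = [(10, 2), (4, 4)]  # 12 marbles vs 8 marbles

for bags in (equal_bags, unequal_bags):
    print(bags, p_blue_over_bags(bags), p_blue_over_marbles(bags))
```

With equal-sized bags both priors give 3⁄10. With 12 and 8 marbles the prior over bags gives 1⁄3 while the prior over marbles still gives 3⁄10, so the two only coincide when the bags hold the same number of marbles.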