The paradox arises because the second assumption is somewhat artificial, and when describing the problem in an actual setting things get a bit sticky. Just how do we know that “at least” one is a boy? One description of the problem states that we look into a window, see only one child and it is a boy. This sounds like the same assumption. However, this one is equivalent to “sampling” the distribution (i.e. removing one child from the urn, ascertaining that it is a boy, then replacing). Let’s call the statement “the sample is a boy” proposition “b”. Now we have:
P(BB|b) = P(b|BB) × P(BB) / P(b) = (1 × 1⁄4) / (1⁄2) = 1⁄2.
The difference here is the P(b), which is just the probability of drawing a boy from all possible cases (i.e. without the “at least”), which is clearly 0.5.
The Bayesian analysis generalizes easily to the case in which we relax the 50⁄50 population assumption. If we have no information about the populations then we assume a “flat prior”, i.e. P(GG) = P(BB) = P(G·B) = 1⁄3, where G·B stands for one child of each sex. In this case the “at least” assumption produces the result P(BB|B) = 1⁄2, and the sampling assumption produces P(BB|b) = 2⁄3, a result also derivable from the Rule of Succession.
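The four numbers in the quote can be checked with a few lines of exact arithmetic. This is only a sketch: the function name `posterior_bb` and the dictionary keys are made up for illustration, and "mixed" stands for one boy and one girl in either order.

```python
from fractions import Fraction as F

def posterior_bb(prior, likelihood):
    """Bayes' rule: P(BB | evidence) given a prior over family types
    and P(evidence | family type). Names here are illustrative only."""
    evidence = sum(prior[k] * likelihood[k] for k in prior)
    return prior["BB"] * likelihood["BB"] / evidence

# Priors over family types: 50/50 population vs. the "flat prior".
population_prior = {"BB": F(1, 4), "mixed": F(1, 2), "GG": F(1, 4)}
flat_prior = {"BB": F(1, 3), "mixed": F(1, 3), "GG": F(1, 3)}

# "At least one is a boy" is certain for BB and mixed families.
at_least_one = {"BB": F(1), "mixed": F(1), "GG": F(0)}
# A randomly sampled child is a boy with probability 1, 1/2, or 0.
sampled_boy = {"BB": F(1), "mixed": F(1, 2), "GG": F(0)}

results = {
    ("population", "at least"): posterior_bb(population_prior, at_least_one),
    ("population", "sampling"): posterior_bb(population_prior, sampled_boy),
    ("flat", "at least"): posterior_bb(flat_prior, at_least_one),
    ("flat", "sampling"): posterior_bb(flat_prior, sampled_boy),
}
for key, value in results.items():
    print(key, value)
```

The only thing that changes between the "at least" and "sampling" rows is the likelihood for mixed families (1 versus 1/2), which is the whole disagreement in miniature.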
We have no general population information here. We have one man with at least one boy.
I’m not at all sure you understand that quote. Let’s stick with the coin flips:
Do you understand why these two questions are different:
I tell you: “I flipped two coins, and at least one of them came out heads. What is the probability that I flipped two heads?” A: 1/3
AND
“I flipped two coins, you choose one at random and look at it, and it’s heads. What is the probability I flipped two heads?” A: 1⁄2
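For anyone who wants to check the two coin answers by brute force, here is a small enumeration. It assumes fair coins, and for the second question it assumes you are equally likely to look at either coin; the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes: HH, HT, TH, TT.
flips = list(product("HT", repeat=2))

# Question 1: the flipper reports "at least one came out heads".
at_least_one = [f for f in flips if "H" in f]
p_statement = Fraction(sum(f == ("H", "H") for f in at_least_one),
                       len(at_least_one))

# Question 2: you pick one coin at random and it shows heads.
# Enumerate (outcome, index of the coin you looked at): 8 equally likely cases.
looks = [(f, i) for f in flips for i in (0, 1)]
saw_heads = [(f, i) for f, i in looks if f[i] == "H"]
p_sampling = Fraction(sum(f == ("H", "H") for f, _ in saw_heads),
                      len(saw_heads))

print(p_statement, p_sampling)  # 1/3 1/2
```

In the second question HH shows up twice among the cases where you see heads (once per coin you might have looked at), which is why it carries half the weight instead of a third.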
For the record, I’m sure this is frustrating as all getout for you, but this whole argument has really clarified things for me, even though I still think I’m right about which question we are answering.
Many of my arguments in previous posts are wrong (or at least incomplete and a bit naive), and it didn’t click until the last post or two.
Like I said, I still think I’m right, but not because my prior analysis was any good. The 1⁄3 case was a major hole in my reasoning. I’m happily waiting to see if you’re going to destroy my latest analysis, but I think it is pretty solid.
Yes, and we are dealing with the second question here.
Is that not what I said before?
We don’t have 1000 families with two children, from which we’ve selected all families that have at least one boy (which gives 1⁄3 probability). We have one family with two children. Then we are told one of the children is a boy, and given zero other information. The probability that the second is a boy is 1⁄2, so the probability that both are boys is 1⁄2.
The possible options for the “Boy born on Tuesday” are not Boy/Girl, Girl/Boy, Boy/Boy. That would be the case in the selection of 1000 families above.
The possible options are Boy (Tu) / Girl, Girl / Boy (Tu), Boy (Tu) / Boy, Boy / Boy (Tu).
There are two Boy/Boy combinations, not one. You don’t have enough information to throw one of them out.
As long as you realize there is a difference between those two questions, fine. We can disagree about what assumptions the wording should lead us to; that’s irrelevant to the actual statistics and can be an agree-to-disagree situation. It’s just important to realize that what the question means, and how you get the information, matters.
We don’t have 1000 families with two children, from which we’ve selected all families that have at least one boy (which gives 1⁄3 probability). We have one family with two children. Then we are told one of the children is a boy, and given zero other information.
If we have one family with two children, of which one is a boy, they are (by definition) a member of the set “all families that have at least one boy.” So it matters how we got the information.
If we got that information by grabbing a kid at random and looking at it (so we have information about one specific child), that is sampling, and it leads to the 1⁄2 probability.
If we got that information by having someone check both kids and tell us “at least one is a boy,” we have different information (it’s information about the set of kids the parents have, not information about one specific kid).
This is NOT a case of sampling.
If it IS sampling (if I grab a kid at random and say “What’s your birthday?” and it happens to be Tuesday), then the probability is 1⁄2. (We have information about the specific kid’s birthday.)
If instead, I ask the parents to tell me the birthday of one of their children, and the parent says ‘I have at least one boy born on Tuesday’, then we get, instead, information about their set of kids, and the probability is the larger number.
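The two ways of getting the Tuesday information can be compared by exact enumeration. This sketch assumes boys and girls are equally likely, all seven days are equally likely, and (for the second case) that the parent truthfully confirms “at least one boy born on Tuesday” whenever it applies. The names `TUESDAY`, `is_tuesday_boy`, and `both_boys` are made up for the example.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
TUESDAY = 1  # arbitrary label; any fixed day gives the same numbers

children = list(product("BG", DAYS))          # 14 equally likely (sex, day) pairs
families = list(product(children, repeat=2))  # 196 equally likely two-child families

def is_tuesday_boy(child):
    return child == ("B", TUESDAY)

def both_boys(family):
    return all(sex == "B" for sex, _ in family)

# Sampling: one specific child, picked at random, turns out to be a Tuesday boy.
sampled = [(fam, i) for fam in families for i in (0, 1)
           if is_tuesday_boy(fam[i])]
p_sampling = Fraction(sum(both_boys(fam) for fam, _ in sampled), len(sampled))

# Set information: the parent confirms at least one child is a Tuesday boy.
stated = [fam for fam in families if any(is_tuesday_boy(c) for c in fam)]
p_statement = Fraction(sum(both_boys(fam) for fam in stated), len(stated))

print(p_sampling, p_statement)  # 1/2 13/27
```

Under these assumptions sampling gives exactly 1⁄2 that both are boys, while the statement about the set gives 13⁄27, a bit over 48%.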
Sampling is what leads to the answer you are supporting.
The answer I’m supporting is based on flat priors, not sampling. I’m saying there are two possible Boy/Boy combinations, not one, and therefore it takes up half the probability space, not 1⁄3.
Sampling to the “Boy on Tuesday” problem gives roughly 48% (as per the original article), not 50%.
We are simply told that the man has a boy who was born on Tuesday. We aren’t told how he chose that boy, whether he’s older or younger, etc. Therefore we have four possibilities, like I outlined above.
Is my analysis that the possibilities are Boy (Tu) /Girl, Girl / Boy (Tu), Boy (Tu)/Boy, Boy/Boy (Tu) correct?
If so, is not the probability for some combination of Boy/Boy 1/2? If not, why not? I don’t see it.
BTW, contrary to my previous posts, having the information about the boy born on Tuesday is critical because it allows us (and in fact requires us) to distinguish between the two boys.
That was in fact the point of the original article, which I now disagree with significantly less. In fact, I agree with the major premise that the Tuesday information pushes the odds of Boy/Boy closer to 50%; I just disagree that you can’t reason that it pushes it to exactly 50%.
Is my analysis that the possibilities are Boy (Tu) /Girl, Girl / Boy (Tu), Boy (Tu)/Boy, Boy/Boy (Tu) correct?
No. For any day of the week EXCEPT Tuesday, boy and girl are equivalent. For the case of both children born on Tuesday you have, for girls: Boy(tu)/Girl(tu), Girl(tu)/Boy(tu), and for boys: Boy(tu)/Boy(tu), which is one combination rather than two.
That was in fact the point of the original article, which I now disagree with significantly less. In fact, I agree with the major premise that the Tuesday information pushes the odds of Boy/Boy closer to 50%; I just disagree that you can’t reason that it pushes it to exactly 50%.
This statement leads me to believe you are still confused. Do you agree that if I know a family has two kids, I knock on the door and a boy answers and says “I was born on a Tuesday,” that the probability of the second kid being a girl is 1/2? And that in this case, Tuesday is irrelevant? (This is what Wikipedia calls “sampling.”)
Do you agree that if, instead, the parents give you the information “one of my two kids is a boy born on a Tuesday”, that this is a different sort of information, information about the set of their children, and not about a specific child?
This statement leads me to believe you are still confused. Do you agree that if I know a family has two kids, I knock on the door and a boy answers and says “I was born on a Tuesday,” that the probability of the second kid being a girl is 1/2? And that in this case, Tuesday is irrelevant? (This is what Wikipedia calls “sampling.”)
I agree with this.
Do you agree that if, instead, the parents give you the information “one of my two kids is a boy born on a Tuesday”, that this is a different sort of information, information about the set of their children, and not about a specific child?
I agree with this if they said something along the lines of “One and only one of them was born on Tuesday”. If not, I don’t see how the Boy(tu)/Boy(tu) configuration has the same probability as the others, because it’s twice as likely as the other two configurations that that is the configuration they are talking about when they say “One was born on Tuesday”.
Here’s my breakdown with 1000 families, to try to make it clear what I mean:
1000 Families with two children, 750 have boys.
Of the 750, 500 have one boy and one girl. Of these 500, 1⁄7, or roughly 71 have a boy born on Tuesday.
Of the 750, 250 have two boys. Of these 250, 2⁄7, or roughly 71 have a boy born on Tuesday.
71 = 71, so it’s equally likely that there are two boys as there are a boy and a girl.
Having two boys doubles the probability that one boy was born on Tuesday compared to having just one boy.
And I don’t think I’m confused about the sampling, because I didn’t use the sampling reasoning to get my result*, but I’m not super confident about that so if I am just keep giving me numbers and hopefully it will click.
*I mean in the previous post, not specifically this post.
Of these 250, 2⁄7, or roughly 71 have a boy born on Tuesday.
This is wrong. With two boys, each with a probability of 1⁄7 of being born on Tuesday, the probability of at least one on a Tuesday isn’t 2⁄7, it’s 1-(6/7)^2.
How can that be? There is a 1⁄7 chance that one of the two is born on Tuesday, and there is a 1⁄7 chance that the other is born on Tuesday. 1⁄7 + 1⁄7 is 2⁄7.
There is also a 1⁄49 chance that both are born on Tuesday, but how does that subtract from the other two numbers? It doesn’t change the probability that either of them is born on Tuesday, and both of those probabilities add.
You overcount: the both-on-Tuesday case is counted twice there. Think of it this way: if I have 8 kids, do I have a better than 100% probability of having a kid born on Tuesday?
There is a (1/7)×(6/7) chance the first is born on Tuesday and the second is born on another day. There is a (1/7)×(6/7) chance the second is born on Tuesday and the first is born on another day. And there is a 1⁄49 chance that both are born on Tuesday.
All together that’s 13⁄49. Alternatively, there is a (6/7)^2 chance that both are born not-on-Tuesday, so 1-(6/7)^2 gives you the complementary probability.
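The overcounting argument above can be written out with exact fractions. This is a sketch that assumes each child is independently born on Tuesday with probability 1⁄7; the helper `at_least_one` and the variable names are made up for illustration. It also redoes the 1000-family breakdown with the corrected two-boy count.

```python
from fractions import Fraction

p = Fraction(1, 7)  # chance a given child was born on Tuesday

naive = 2 * p                        # 2/7: counts "both on Tuesday" twice
inclusion_exclusion = 2 * p - p * p  # subtract the double-counted overlap
complement = 1 - (1 - p) ** 2        # 1 - P(neither on Tuesday)

def at_least_one(n):
    """P(at least one of n children born on Tuesday)."""
    return 1 - (1 - p) ** n

# Redo the 1000-family breakdown with the corrected two-boy count.
mixed_with_tuesday_boy = 500 * p               # about 71 families
two_boys_with_tuesday_boy = 250 * complement   # about 66 families, not 71
share_two_boys = two_boys_with_tuesday_boy / (
    mixed_with_tuesday_boy + two_boys_with_tuesday_boy
)

print(naive, inclusion_exclusion, complement)  # 2/7 13/49 13/49
print(float(8 * p), float(at_least_one(8)))    # naive sum exceeds 1; true value does not
print(share_two_boys)                          # 13/27
```

So the breakdown comes out to roughly 66 two-boy families against roughly 71 mixed families, which is where the 13⁄27 figure comes from.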
I’ve seen that same explanation at least five times and it didn’t click until just now. You can’t distinguish between the two on Tuesday, so you can only count it once for the pair.
Which means the article I said was wrong was absolutely right, and if you were told that, say, one boy was born on January 17th, the chance that at least one of two boys was born on that day is 1-(364/365)^2 (ignoring leap years), which gives a final probability of roughly 49.97% that both are boys.
Thanks for your patience!
ETA: I also think I see where I’m going wrong with the terminology—sampling vs not sampling, but I’m not 100% there yet.
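The Tuesday and January 17th versions are the same calculation with a different number of equally likely “slots” (7 days versus 365 dates). Here is a sketch of the general count, assuming boys and girls are equally likely, all slots are equally likely, and the information is of the “at least one boy born in this slot” kind; the function name `p_both_boys` is made up for the example.

```python
from fractions import Fraction

def p_both_boys(n_slots):
    """P(two boys | at least one boy born in one particular slot),
    with n_slots equally likely birth slots (7 weekdays, 365 dates, ...)."""
    # Two-boy families with at least one boy in the slot: N^2 - (N-1)^2 = 2N - 1
    both_boys = n_slots**2 - (n_slots - 1) ** 2
    # All families with at least one such boy: (2N)^2 - (2N-1)^2 = 4N - 1
    any_family = (2 * n_slots) ** 2 - (2 * n_slots - 1) ** 2
    return Fraction(both_boys, any_family)

print(p_both_boys(1))    # 1/3, the plain "at least one boy" case
print(p_both_boys(7))    # 13/27, the Tuesday case
print(p_both_boys(365))  # 729/1459, the specific-date case
print(round(float(p_both_boys(365)), 4))
```

The more specific the extra detail, the closer the answer gets to 1⁄2, but under these assumptions it never quite reaches it.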
The relevant quote from the Wiki:
The problem is that you’re counting that 1/49th chance twice. Once for the first brother and once for the second.
I see that now, it took a LOT for me to get it for some reason.
Wow.