I don’t think you’re modeling your problem correctly, unless I misunderstood the question you’re trying to answer. You have the following random variables:
X_1 is Bernoulli, first child is a boy
X_2 is Bernoulli, second child is a boy
Y_1 is uniform, weekday of birth of the first child
Y_2 is uniform, weekday of birth of the second child
D is a random variable which corresponds to the weekday in the sentence “one of them is a boy, born on a (D)”. There are many ways to construct one like this, but we only require that if X_1=1 or X_2=1, then D=Y_1 or D=Y_2, and that D=Y_i implies X_i=1.
Then what you’re looking for is not P(X_1=1, X_2=1 | (X_1=1, Y_1=monday) or (X_2=1, Y_2=monday)) (which, indeed, is not 1⁄3), but P(X_1=1, X_2=1 | ((X_1=1, Y_1=D) or (X_2=1, Y_2=D)) and D=monday). This is still 1⁄3, as illustrated by this Python snippet (I’m too lazy to prove it formally): https://gist.github.com/sloonz/faf3565c3ddf059960807ac0e2223200
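For readers who don’t want to follow the link, here is a minimal sketch of the same idea (not the gist itself). It enumerates every case exactly instead of sampling. It assumes one particular construction of D: the weekday of a boy picked uniformly among the boys. The function name and the 0..6 weekday encoding are mine, chosen for illustration.

```python
from fractions import Fraction
from itertools import product

MONDAY = 0  # weekdays are encoded 0..6; which index is "Monday" is arbitrary


def two_boys_given_named_day(day=MONDAY):
    """P(X_1=1, X_2=1 | at least one boy, D=day), by exact enumeration."""
    both = Fraction(0)   # probability mass: two boys and D == day
    total = Fraction(0)  # probability mass: at least one boy and D == day
    for x1, x2, y1, y2 in product((0, 1), (0, 1), range(7), range(7)):
        p = Fraction(1, 4 * 49)  # sexes and weekdays are uniform and independent
        # Weekdays of birth of the boys in this family.
        boy_days = [y for x, y in ((x1, y1), (x2, y2)) if x == 1]
        if not boy_days:
            continue  # no boy, so the sentence cannot be said at all
        # Assumption: D is the weekday of a boy picked uniformly among the boys.
        p_d = Fraction(sum(y == day for y in boy_days), len(boy_days))
        total += p * p_d
        if x1 == 1 and x2 == 1:
            both += p * p_d
    return both / total


print(two_boys_given_named_day())  # 1/3
```

With this construction the two-boys families contribute 7⁄196 and the one-boy families 14⁄196, hence 7⁄21 = 1⁄3, whatever weekday is passed in.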
There was a similar paradox presented on old LessWrong. If someone can manage to find it (a quick Google search returned nothing, but I may have misremembered the exact terms of the problem…), the solution would be presented far better there:
Alice, Bob and Charlie are accused of treason. To make an example, one of them, chosen at random, will be executed tomorrow. Alice asks for a guard and gives him a letter with these instructions: “At least one of Bob and Charlie will not be executed. Please give him this letter. If I am the one to be executed and both live, give the letter to either of them”. The guard leaves, returns and tells Alice: “I gave the letter to Bob”.
Alice is unable to sleep the following night : “Before doing this, I had a 1⁄3 chance of being executed. Now that it’s either me or Charlie, I have a 1⁄2 chance of being executed. I shouldn’t have written that letter”.
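Alice’s reasoning can be checked with a few lines of Bayes. This is only a sketch, under the assumption that when both Bob and Charlie live, the guard picks the recipient uniformly at random (the letter says “either of them”, so this is my reading). The function name is made up for the example.

```python
from fractions import Fraction

PRISONERS = ("Alice", "Bob", "Charlie")


def posterior_given_letter_to(recipient="Bob"):
    """Return P(executed = who | the guard gave the letter to `recipient`)."""
    joint = {}
    for executed in PRISONERS:
        prior = Fraction(1, 3)  # the condemned prisoner is chosen uniformly
        # The letter can only go to a survivor among Bob and Charlie.
        survivors = [p for p in ("Bob", "Charlie") if p != executed]
        # Assumption: with two possible recipients, the guard picks uniformly.
        likelihood = Fraction(survivors.count(recipient), len(survivors))
        joint[executed] = prior * likelihood
    evidence = sum(joint.values())
    return {who: mass / evidence for who, mass in joint.items()}


posterior = posterior_given_letter_to("Bob")
for who, p in posterior.items():
    print(who, p)
```

Under that assumption Alice stays at 1⁄3 and Charlie goes up to 2⁄3: the guard’s answer depends on who is executed, so “it’s either me or Charlie” is not the same as conditioning on “Bob lives”.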
Yes, that’s kind of my point. There are two wildly different problems that look the same on the surface. One gives the answer of your post, the other gives 1⁄3. I suspect that your initial confusion comes from your brain trying to interpret the first problem as an instance of the second. My brain sure did, initially.
In the first one, you go and interview 1000 fathers who have two children. You ask them “Do you have at least one boy born on a Monday?”. If they answer yes, you then ask them “Do you have two boys?”. You want the probability that the second answer is yes, conditional on the first answer being yes. The answer is the one in your post.
In the second one, you send a survey to 1000 fathers who have two children. It reads something like this: “1. Do you have at least one boy? 2. Give the weekday of birth of the boy. If you have two, pick either one. 3. Do you have two boys?”. Now the question is: conditional on the first answer being yes, and on the value given in the second answer, what is the probability that the third answer is yes? The answer is 1⁄3.
My main point is that neither answer is counter-intuitive. In the first problem, conditioning on Monday is like selecting a specific child, like always picking the youngest one (as in the sentence “I have two children, and the youngest one is a boy”, which gives a probability of 1⁄2 for two boys). With low n, the specificity is low: you’re close to the problem without a specific child selected, and you get close to 1⁄3. With large n, the specificity is high: you’re close to the problem of selecting a specific child (e.g. the youngest one), and you get close to 1⁄2. In the second problem, the “born on a Monday” piece of information is indeed irrelevant and gets factored out.
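To make the dependence on n concrete for the first problem, here is a small sketch. It assumes n is the number of equally likely values the extra attribute can take (n=7 for weekdays), and the function name is just for illustration. It counts cases exactly and checks them against the closed form (2n-1)/(4n-1), which follows from the count: 2n-1 two-boy cases out of 4n-1 cases with at least one boy in the named category.

```python
from fractions import Fraction
from itertools import product


def p_two_boys(n):
    """P(two boys | at least one boy in category 0), n equally likely categories."""
    both = total = 0
    for x1, x2, c1, c2 in product((0, 1), (0, 1), range(n), range(n)):
        # Condition of the first problem: some child is a boy in category 0.
        if (x1 == 1 and c1 == 0) or (x2 == 1 and c2 == 0):
            total += 1
            both += (x1 == 1 and x2 == 1)
    return Fraction(both, total)


for n in (1, 2, 7, 365):
    # The brute-force count should agree with the closed form (2n - 1) / (4n - 1).
    assert p_two_boys(n) == Fraction(2 * n - 1, 4 * n - 1)
    print(n, p_two_boys(n), float(p_two_boys(n)))
```

n=1 gives exactly 1⁄3 (the attribute carries no information), n=7 gives 13⁄27, and the value climbs toward 1⁄2 as n grows.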