Note that a slightly differently worded problem gives the intuitive result:
A_k is the event “I roll a die k times, and the sequence ends with 66, with no earlier 66”.
B_k is the event “I roll a die k times, and the sequence ends with a 6, with exactly one 6 before that (not necessarily on the roll just before the end: 16236 works)”.
C_k is the event “I roll a die k times and get only even numbers”.
In this case we do get the intuitive result (which I think is the formulation most mathematicians intuitively translate this problem into):
Σ[k * P(A_k|C_k)] > Σ[k * P(B_k|C_k)]
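To make the inequality concrete, here is a minimal sketch that evaluates both sums numerically. It assumes that conditioning on C_k makes each of the k rolls independent and uniform on {2, 4, 6}, so a 6 has probability 1/3 on every roll. The function names and the truncation point `K_MAX` are my own choices for illustration, not anything from the original program.

```python
P6 = 1 / 3    # assumed: chance of a 6 once a roll is known to be even
K_MAX = 500   # assumed truncation point; the tails beyond it are negligible

def first_66_probs(k_max):
    """P(A_k | C_k) for k = 1..k_max: first 66 is completed exactly at roll k."""
    probs = {}
    # Probability mass of sequences with no 66 so far, split by the last roll.
    last_not_6, last_is_6 = 1.0, 0.0
    for k in range(1, k_max + 1):
        probs[k] = last_is_6 * P6  # previous roll was a 6 and so is this one
        last_not_6, last_is_6 = (last_not_6 + last_is_6) * (1 - P6), last_not_6 * P6
    return probs

def second_6_probs(k_max):
    """P(B_k | C_k) for k = 2..k_max: roll k is a 6, with exactly one 6 earlier."""
    return {k: (k - 1) * P6**2 * (1 - P6) ** (k - 2) for k in range(2, k_max + 1)}

mean_first_66 = sum(k * p for k, p in first_66_probs(K_MAX).items())
mean_second_6 = sum(k * p for k, p in second_6_probs(K_MAX).items())

print("sum of k * P(A_k|C_k):", mean_first_66)
print("sum of k * P(B_k|C_k):", mean_second_6)
```

Under that assumption the two sums come out at roughly 12 and 6, in line with the usual closed forms (1 + p)/p² and 2/p for p = 1/3.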
Now the question is: why are the two formulations not equivalent? How would you write “expected number of runs” more formally, in a way that would not yield the above formula and would reproduce the numbers of your Python program?
(This is what I hate about probability theory: slightly differently worded problems that seem equivalent yield completely different results for no obvious reason.)
Also, the difference between the two processes is not small:
Expected rolls until two 6s in a row (given all even): 2.725588
Expected rolls until second 6 (given all even): 2.999517
vs (n = 10 million)
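For comparison, here is a sketch of one candidate formalization that does seem to land on the simulated values. The assumption, which is mine and not something confirmed by the program, is that the condition "all even" is applied to the whole run up to the stopping time T, so that E[T | all rolls up to T are even] = Σ[k * P(A_k and C_k)] / Σ[P(A_k and C_k)], with P(A_k and C_k) = P(A_k|C_k) * (1/2)^k (and the same with B_k). The names `conditional_mean`, `P_EVEN` and `K_MAX` are illustrative only.

```python
P6 = 1 / 3      # chance of a 6 given that the roll is even
P_EVEN = 1 / 2  # chance that a single fair roll is even
K_MAX = 500     # assumed truncation point; later terms are negligible

def first_66_probs(k_max):
    """P(A_k | C_k): first 66 completed exactly at roll k, given k even rolls."""
    probs = {}
    last_not_6, last_is_6 = 1.0, 0.0  # mass with no 66 so far, by last roll
    for k in range(1, k_max + 1):
        probs[k] = last_is_6 * P6
        last_not_6, last_is_6 = (last_not_6 + last_is_6) * (1 - P6), last_not_6 * P6
    return probs

def second_6_probs(k_max):
    """P(B_k | C_k): second 6 arrives exactly at roll k, given k even rolls."""
    return {k: (k - 1) * P6**2 * (1 - P6) ** (k - 2) for k in range(2, k_max + 1)}

def conditional_mean(stop_probs):
    """E[T | every roll up to T is even], weighting each k by P(C_k) = P_EVEN**k."""
    joint = {k: p * P_EVEN**k for k, p in stop_probs.items()}  # P(stop at k and C_k)
    total = sum(joint.values())                                # P(run is all even)
    return sum(k * w for k, w in joint.items()) / total

cond_first_66 = conditional_mean(first_66_probs(K_MAX))
cond_second_6 = conditional_mean(second_6_probs(K_MAX))

print("two 6s in a row, given all even:", cond_first_66)
print("second 6, given all even:", cond_second_6)
```

Under this reading the values are about 30/11 ≈ 2.727 and 3, close to the simulated 2.725588 and 2.999517. The only difference from Σ[k * P(A_k|C_k)] is the extra factor (1/2)^k: long runs are much less likely to stay all-even, so they are weighted down, and that hits the "66 in a row" case harder.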