Using my simplest example, because it’s the easiest to calculate:
Prior:
0.8 fair coin, 0.1 heads-only coin, 0.1 tails-only coin
probability “next is head” = 0.5
probability “next 1000 flips are approximately 500:500” ~ 0.8
Posterior:
0.8 fair coin, 0.2 heads-only coin
probability “next is head” = 0.6 (increased)
probability “next 1000 flips are approximately 500:500” ~ 0.8 (didn’t change)
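Here is a minimal numerical check of that update, as a sketch in plain Python (the dictionary layout and variable names are my own, not anything from the thread):

```python
# Prior over coin types: (weight, P(heads on any toss | type)).
prior = {
    "fair":       (0.8, 0.5),
    "heads_only": (0.1, 1.0),
    "tails_only": (0.1, 0.0),
}

# P(next is heads) under the prior: sum of weight * P(heads | type).
p_heads_prior = sum(w * p for w, p in prior.values())  # 0.8*0.5 + 0.1*1 + 0.1*0 = 0.5

# Observe one head; Bayes: P(type | H) is proportional to weight * P(H | type).
unnorm = {t: w * p for t, (w, p) in prior.items()}
z = sum(unnorm.values())                               # 0.5
posterior = {t: (unnorm[t] / z, prior[t][1]) for t in prior}
# posterior weights: fair 0.8, heads_only 0.2, tails_only 0.0

p_heads_post = sum(w * p for w, p in posterior.values())  # 0.2*1 + 0.8*0.5 = 0.6
print(p_heads_prior, p_heads_post)  # 0.5 0.6
```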
Um.
Probability of a head = 0.5 necessarily means that the expected number of heads in 1000 tosses is 500.
Probability of a head = 0.6 necessarily means that the expected number of heads in 1000 tosses is 600.
Are you playing with two different meanings of the word “expected” here?
If I roll a 6-sided die, the expected value is 3½.
But I don’t really expect to see 3½ as an outcome of the roll. I expect to see either 1, or 2, or 3, or 4, or 5, or 6. But certainly not 3½.
If my model says the coin is heads-only with probability 0.2 and fair with probability 0.8, then in 1000 flips I expect to see either 1000 heads (probability 0.2) or approximately 500 heads (probability 0.8). But I don’t expect to see approximately 600 heads. Yet the expected value of the number of heads in 1000 flips is 600.
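To make the arithmetic explicit: E[heads] = 0.2 * 1000 + 0.8 * 500 = 600, even though almost every run lands near 500 or exactly at 1000. A quick simulation sketch in standard-library Python (the trial counts are arbitrary choices of mine) shows the bimodal shape:

```python
import random
from collections import Counter

def heads_in_1000_flips():
    """Draw a coin from the 20/80 posterior, then flip it 1000 times."""
    if random.random() < 0.2:                               # heads-only coin
        return 1000
    return sum(random.random() < 0.5 for _ in range(1000))  # fair coin

results = [heads_in_1000_flips() for _ in range(10_000)]
print("mean:", sum(results) / len(results))      # ~600, the expected value
print(Counter(r // 100 * 100 for r in results))  # mass near 500 and at 1000, none near 600
```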
No, I’m just using the word in the statistical-standard sense of “expected value”.
Lumifer was using the word “expected” correctly.
You can only multiply out P(next result is heads) * (number of tosses) to get the expected number of heads if you believe those tosses are independent trials. The case of a biased coin toss explicitly violates this assumption.
But the tosses are independent trials, even for the biased coin. I think you mean that P(heads) is not 0.6; it’s either 0.5 or 1, you just don’t know which one it is.
Which means that P(heads on toss after next|heads on next toss) != P(heads on toss after next|tails on next toss). Independence of A and B means that P(A|B) = P(A).
As long as you’re using the same coin, P(heads on toss after next|heads on next toss) == P(heads on toss after next|tails on next toss).
You’re confusing the probability of coin toss outcome with your knowledge about it.
Consider an RNG which generates independent samples from a normal distribution centered on some value mu that is unknown to you. As you see more samples, you get a better idea of what mu is, and your expectations about what numbers you are going to see next change. But these samples do not become dependent just because your knowledge of mu changes.
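A sketch of that RNG example (mu and the sample count are arbitrary choices of mine; the observer’s estimate here is just the running sample mean):

```python
import random

mu = 2.7                       # fixed by nature, unknown to the observer
samples = []
for i in range(1, 11):
    x = random.gauss(mu, 1.0)  # each draw is independent of every other draw
    samples.append(x)
    estimate = sum(samples) / len(samples)  # observer's current best guess at mu
    print(f"after {i} samples, estimated mu = {estimate:.3f}")
# The estimate drifts toward 2.7 as knowledge accumulates, but the draws
# themselves never depend on one another.
```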
Please actually do your math here.
We have a coin that is heads-only with probability 20%, and fair with probability 80%. We’ve already conducted exactly one flip of this coin, which came out heads (causing our update from the prior of 10/80/10 to 20/80/0), but no further flips yet.
For simplicity, event A will be “heads on next toss” (toss number 2), and B will be “heads on toss after next” (toss number 3).
P(A) = 0.2 * 1 + 0.8 * 0.5 = 0.6
P(B) = 0.2 * 1 + 0.8 * 0.5 = 0.6
P(A & B) = 0.2 * 1 * 1 + 0.8 * 0.5 * 0.5 = 0.4
Note that this is not the same as P(A) * P(B), which is 0.6 * 0.6 = 0.36.
The definition of independence is that A and B are independent iff P(A & B) = P(A) * P(B). These events are not independent.
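The same check by brute force, summing over the two remaining coin hypotheses (a sketch; the tuple layout is mine):

```python
# Posterior after the first head: 20% heads-only, 80% fair.
coins = [(0.2, 1.0), (0.8, 0.5)]  # (weight, P(heads on one toss | coin))

p_a  = sum(w * p     for w, p in coins)  # 0.2*1 + 0.8*0.5  = 0.6
p_b  = p_a                               # same marginal    = 0.6
p_ab = sum(w * p * p for w, p in coins)  # 0.2*1 + 0.8*0.25 = 0.4

print(p_ab, p_a * p_b)  # 0.4 vs 0.36 -> not independent
print(p_ab / p_a)       # P(B | A) ≈ 0.667, versus the marginal P(B) = 0.6
```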
Turning the math crank without understanding what you are doing is worse than useless.
Our issue is about how to understand probability, not which numbers come out of the chute.
It’s awful that you were downvoted in this thread when you were mostly right and the others were mostly wrong. I’m updating my estimate of LW’s average intelligence downward.
No it doesn’t! A coin biased towards heads can have p(H) = 0.6, p(T) = 0.4, and each flip can be an independent trial. The total results from many flips will then be binomially distributed.
I don’t think so. None of the available potential coin-states would generate an expected value of 600 heads.
p = 0.6 → 600 expected heads is the many-trials expected value (where each trial is 1000 flips), given the prior and the result of the first flip. But this is different from the distribution of this single trial, which is bimodal: 1000 heads with probability 0.2, and a central-limit peak around 500 heads with probability 0.8.