Substantial? No—it adds up to normality. Interesting? Yes.
I don’t understand what you mean by this.
Imagine two situations of equal improbability:
In one, Alice flips a coin N times in front of a crowd, and achieves some specific sequence M.
In the other, Alice flips a coin N / 2 times in front of a crowd, and achieves some specific sequence Q; she then opens an envelope, and reveals a prediction of exactly the sequence that she just flipped.
These two end results are equally improbable (both end results encode N bits of information—to see this, imagine that the envelope contained a different sequence than she flipped), but we attach significance to one result (appropriately) and not the other. What’s the difference between the two situations?
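Concretely, assuming a fair coin and an envelope whose contents were fixed independently of the flips:

$$P_1 = \left(\tfrac{1}{2}\right)^{N} = 2^{-N}, \qquad P_2 = \underbrace{\left(\tfrac{1}{2}\right)^{N/2}}_{\text{flips}} \cdot \underbrace{\left(\tfrac{1}{2}\right)^{N/2}}_{\text{envelope matches}} = 2^{-N}.$$

Either outcome carries a surprisal of $-\log_2 2^{-N} = N$ bits.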
To capture this problem fully, we must make explicit that the person observing the coin flips has not only a prior over sequences of coin flips, but also a prior over world-models that produce those sequences. It is implicit (and often explicitly assumed) in any coin-flip example that a normal human flipping a fair coin is something like our null hypothesis; most coins seem fair in our everyday experience. Alice correctly predicting the sequence she achieves is evidence that forces a substantial update on our distribution over world-models, even though the two sequences are assigned equal probability in our distribution over sequences under that null hypothesis.
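Here is a minimal sketch of that update in Python, assuming a toy two-hypothesis world-model space; the hypothesis names, the one-in-a-million prior on clairvoyance, and N = 40 are all illustrative choices, not part of the original scenario:

```python
N = 40  # total flips in scenario one; N // 2 flips plus a matching prediction in scenario two

# Prior over world-models: "ordinary person, fair coin" is a very strong null hypothesis.
priors = {"fair_coin": 1 - 1e-6, "clairvoyant_alice": 1e-6}

# Likelihood of scenario two's evidence under each model:
# a specific N/2-flip sequence AND an envelope prediction that matches it.
likelihoods = {
    "fair_coin": 0.5 ** (N // 2) * 0.5 ** (N // 2),  # the match is pure luck
    "clairvoyant_alice": 0.5 ** (N // 2),            # the match is guaranteed
}

# Bayes' rule: posterior is proportional to prior times likelihood.
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
z = sum(unnorm.values())
posteriors = {h: p / z for h, p in unnorm.items()}

print(posteriors)
# The likelihood ratio is 2**(N // 2), about a million, so clairvoyance climbs
# from one-in-a-million to roughly even odds, even though the full outcome had
# probability 2**-N under the null either way.
```

The toy numbers make the point: the sequence itself is exactly as improbable in both scenarios, but only the envelope version produces a likelihood ratio that favors some non-null world-model.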
You can also imagine it as the problem of finding an efficient encoding of sequences of coin flips. If you know that certain subsequences are more likely than others, then you should find a way to encode the more probable subsequences with fewer bits. Actually doing this is equivalent to forming beliefs about the world. (Like ‘The coin is biased in this particular way’, or ‘Alice is clairvoyant.’)
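As a sketch of that equivalence (the bias parameter and the sample sequence below are illustrative): the ideal code length of a sequence under a model is $-\log_2$ of the probability the model assigns it, so a model that better predicts the flips literally compresses them better:

```python
import math

def code_length(seq: str, p_heads: float) -> float:
    """Ideal bits to encode seq under an i.i.d. coin with bias p_heads: -log2 P(seq)."""
    return sum(-math.log2(p_heads if c == "H" else 1.0 - p_heads) for c in seq)

seq = "H" * 18 + "TT"  # a heads-heavy run of 20 flips

print(code_length(seq, 0.5))  # fair-coin model: exactly 20.0 bits, for any sequence
print(code_length(seq, 0.9))  # 'the coin is biased toward heads': about 9.4 bits
```

Under the fair-coin belief every 20-flip sequence costs 20 bits; the biased-coin belief buys a shorter code for exactly the sequences it considers probable, and pays for it on the ones it doesn't.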