I was pretty skeptical about your Bead Jar post, but now have changed my mind. This stuff is interesting, even disturbing. In the bead jar game it seems that one should assign tiny credence to “cerulean” and thus huge credence to “non-cerulean”, but not be surprised when “non-cerulean” fails to occur. Do you have some kind of general theory of when surprise occurs or should occur?
I’m working on it. Clearly, “none of the above” situations—or the latter “Bingo” case—can rightly yield surprise.
Perhaps surprise is warranted when a well-supported model, rather than a well-calibrated probability, is disconfirmed. That doesn’t explain why we should be surprised about a personal friend winning the lottery, though. That seems to be surprising solely because of astronomically low odds and the specialness of the outcome.
I think surprise might have to do with the difference between your expected and your actual Bayes score.
And a lucky outcome could be defined by the difference between your previous expectation and your updated expectation for the actual prize. But in both cases, I think you’d need to work with something like “knowledge about the prior reviewed in light of new evidence” (reviewed knowledge about the prior, not the updated prior, i.e. the posterior), compared with “knowledge about the prior before that”.
Then I don’t understand why we’d be surprised to see a fair coin fall heads ten times in a row.
Because you assign the all-heads sequence a probability significantly higher than 2^-10, so your Bayes score is higher than you expected. Surprise!
Edit: I didn’t notice that you said the coin is fair. Well, I’ll bite the bullet and claim that if you really assign a probability of 1 to the coin being fair, then you won’t feel surprised no matter how many times it comes up heads.
Agree. In practice, I’d bet that our pattern-seeking minds really do put more weight on simple fixed-coin hypotheses than we’re consciously aware of; after only three heads in a row, such a hypothesis would pop into my head (though I’d consciously dismiss it), and after three or four more heads, I’d start to consciously consider it.
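The point about unconsciously weighting a fixed-coin hypothesis can be made concrete with a toy mixture prior. The 99%/1% split below is an invented illustration, not anything from the thread: even a little prior weight on a two-headed coin makes the all-heads sequence far more probable than 2^-10.

```python
import math

# Toy mixture prior (illustrative numbers, not from the discussion):
# 99% the coin is fair, 1% it is a two-headed coin that always lands heads.
p_fair, p_fixed = 0.99, 0.01

def prob_all_heads(n):
    """Probability the mixture prior assigns to n heads in a row."""
    return p_fair * 0.5**n + p_fixed * 1.0

n = 10
mixture = prob_all_heads(n)
fair_only = 0.5**n

print(f"mixture: {mixture:.6f}, fair-only: {fair_only:.6f}")
print(f"log2 scores: {math.log2(mixture):.2f} vs {math.log2(fair_only):.2f}")
```

Even a 1% fixed-coin hypothesis makes the observed sequence roughly ten times as probable as the pure fair-coin model, so the realized log (Bayes) score comes out a few bits higher than the fair-coin model predicted.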
I wonder if the definition of “surprise” isn’t a problem here—we may need two distinct words. The amount of information added to your knowledge of the universe is large when a low-probability event is observed. This is “surprising” in the sense of number of bits it takes to encode.
It’s not “surprising” in the human emotional sense, because you’ve already aggregated a number of probabilities, and this is a common result.
Likewise, Omega drawing a cerulean bead is informative, but not surprising. Drawing the Battleship Potemkin is surprising, because you’ve (incorrectly) assigned it a zero probability.
I don’t think it necessarily takes a lot of bits to encode low-probability events. If I take out the ten of diamonds and the ace of diamonds and have you pick one of the two, the probability of ◊10 is 50%; if I leave all the cards in the deck, the probability of drawing ◊10 is 1⁄52, but it doesn’t take more space to write ◊10 depending on whence the card came.
Drawing the Battleship Potemkin out of Omega’s jar would be surprising because it messes with the definition of Omega, who said the jar contained solid-colored beads. A boat (or a film, I’m not sure which you meant), which is not a bead, disconfirms the model of Omega. (Or the model of oneself as an agent who can remember things Omega says.)
Let me explain what Dagon meant, using your example. The total information required to select 1 card out of 52, e.g. ◊10, is about 6 bits (think of it as 6 divisions in half). In the first case you receive 5 of those bits when you’re told what the two cards are, and 1 more bit when you actually draw the card. Only that last bit depends on the random event. In the second case you receive all 6 at once, so all 6 depend on the random event.
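The bit-accounting above can be sketched with the standard self-information formula, −log2(p). This is a minimal illustration using the card numbers from the example:

```python
import math

def surprisal_bits(p):
    """Self-information (in bits) of observing an event with probability p."""
    return -math.log2(p)

# Drawing the ten of diamonds:
print(surprisal_bits(1 / 52))  # ~5.70 bits when drawn from a full 52-card deck
print(surprisal_bits(1 / 2))   # 1 bit when choosing between two known cards
```

In the two-card case, the other ~4.7 bits were already delivered by being told which two cards are on the table; only the final bit depends on the draw itself.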
ETA: I didn’t downvote you.
You may want to read about minimum description length.
Thanks, I shall.
And the outcome is special only because the set of people categorized as “personal friend” is determined before the lottery winner is announced.
If you draw a card at random from a shuffled deck, whatever card you get had a 1⁄52 chance of being selected; this is only surprising if you predicted in advance that it would be that specific card.
Far trickier is how to determine “surprisingness” in cases where the space of possible outcomes is partially or completely unknown.
So clearly, if I write another post on this, I’ll have to call it “Was Your Card the Ten of Diamonds?”
Maybe it does explain why we’re surprised about a personal friend winning the lottery—if we identify the “well-supported model” we were relying on.
Note, it need not be a model that is well-supported in terms of epistemic rationality; it need only have been instrumentally useful, e.g. “my personal friend won’t become incredibly wealthy without warning.”
Alternatively, maybe it is worth considering different types of surprise which have some things in common but some differences.
IIRC, the conclusion was that “surprise” is when some low-probability complex hypothesis suddenly rises to prominence. So it doesn’t describe one of 100,000 equal-probability events happening, but it does describe someone winning a lottery 10 times in a row, or an old lame horse winning the race, cases in which you start suspecting that something is going on. If a low-probability event doesn’t hint that something unexpected is going on, there is no surprise.
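The “hypothesis rising to prominence” idea can be sketched as a toy Bayes update. Every number below is invented for illustration; the point is only the shape of the calculation:

```python
# Toy Bayes update for "something is going on" after ten lottery wins.
# H = the lottery is rigged in this player's favour. All numbers are assumed.
p_rigged = 1e-9        # prior: rigging is extremely rare
p_win_honest = 1e-7    # per-draw odds of an honest win
p_win_rigged = 0.9     # a rigged lottery almost always pays out

wins = 10
like_honest = p_win_honest ** wins
like_rigged = p_win_rigged ** wins

posterior = (p_rigged * like_rigged) / (
    p_rigged * like_rigged + (1 - p_rigged) * like_honest
)
print(f"posterior on 'rigged': {posterior:.10f}")
```

Ten wins are so astronomically more likely under the rigging hypothesis that its tiny prior is overwhelmed and the posterior lands essentially at 1, matching the intuition that this kind of surprise signals “something is going on.”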
The emotion of surprise itself is possibly an adaptation that tells the brain to pay attention, to try to figure out what that new unexpected phenomenon might be and what else that entails.