The amount of information added to your knowledge of the universe is large when a low-probability event is observed. This is “surprising” in the sense of the number of bits it takes to encode.
I don’t think it necessarily takes a lot of bits to encode low-probability events. If I take out the ten of diamonds and the ace of diamonds and have you pick one of the two, the probability of ◊10 is 50%; if I leave all the cards in the deck, the probability of drawing ◊10 is 1⁄52, but it doesn’t take more space to write ◊10 depending on whence the card came.
Drawing the Battleship Potemkin out of Omega’s jar would be surprising because it messes with the definition of Omega, who said the jar contained solid-colored beads. A boat (or a film, I’m not sure which you meant), which is not a bead, disconfirms the model of Omega. (Or the model of oneself as an agent who can remember things Omega says.)
Let me explain what Dagon meant, using your example. The total information required to select 1 card out of 52, e.g. ◊10, is about 6 bits (think of it as 6 divisions in half). In the first case you receive 5 of those bits when you’re told what the two cards are, and 1 more bit when you actually draw the card. Only that last bit depends on the random event. In the second case you receive all 6 at once, so all 6 depend on the random event.
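To put rough numbers on the card example, here is a minimal sketch that treats the information in an event as -log2 of its probability. The function name `surprisal_bits` and the variable names are my own, chosen for illustration. Under that definition the "about 6" and "5" bits above come out closer to 5.7 and 4.7, and the two-step case adds up to the same total as the single draw.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Information content, in bits, of an event with the given probability."""
    return -math.log2(probability)


# Second case: draw the ten of diamonds straight from the full 52-card deck.
full_deck = surprisal_bits(1 / 52)

# First case, step 1: being told the card is one of two specific cards
# narrows 52 possibilities down to 2.
narrowing = surprisal_bits(2 / 52)

# First case, step 2: the actual draw between the two remaining cards.
final_draw = surprisal_bits(1 / 2)

print(f"full deck draw:      {full_deck:.2f} bits")
print(f"told the two cards:  {narrowing:.2f} bits")
print(f"draw one of the two: {final_draw:.2f} bits")
print(f"two-step total:      {narrowing + final_draw:.2f} bits")
```

Only the `final_draw` bit comes from the random event in the first case; in the second case the whole `full_deck` amount does.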
ETA: I didn’t downvote you.
You may want to read about minimum description length.
Thanks, I shall.