Wait, why doesn’t the entropy of your posterior distribution capture this effect? In the basic example where we get to see samples from a Bernoulli process, the posterior is a Beta distribution that gets ever sharper around the truth. If you compute the entropy of the posterior, you might say something like “I’m unlikely to change my mind about this, my posterior only has 0.2 bits to go until zero entropy”. That’s already a quantity that estimates how much future evidence will influence your beliefs.
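To make the Bernoulli example concrete, here is a rough sketch of that calculation. It assumes a uniform Beta(1, 1) prior and a made-up sequence of coin data with roughly 70% heads; the function names are mine and purely illustrative. It uses only the standard library, so the digamma helper only handles the integer parameters that arise from a uniform prior plus counts.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{k=1}^{n-1} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def beta_entropy_nats(a: int, b: int) -> float:
    """Differential entropy of Beta(a, b) in nats, for positive integer a and b."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (log_beta
            - (a - 1) * digamma_int(a)
            - (b - 1) * digamma_int(b)
            + (a + b - 2) * digamma_int(a + b))


def posterior_summary(heads: int, flips: int):
    """Mean, standard deviation, and entropy (bits) of the posterior on the bias.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + heads, 1 + tails).
    """
    a, b = 1 + heads, 1 + flips - heads
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    entropy_bits = beta_entropy_nats(a, b) / math.log(2)
    return mean, std, entropy_bits


if __name__ == "__main__":
    # Hypothetical data: about 70% heads at every sample size.
    for flips in (0, 10, 100, 1000):
        heads = round(0.7 * flips)
        mean, std, bits = posterior_summary(heads, flips)
        print(f"flips={flips:5d}  mean={mean:.3f}  std={std:.4f}  entropy={bits:.3f} bits")
```

One caveat on the wording: because the Beta posterior is continuous, this is a differential entropy. It is zero for the uniform prior and goes negative as the posterior sharpens, so “bits to go until zero entropy” should be read loosely, as a relative measure of how concentrated the posterior already is.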
The thing that distinguishes the coin case from the wind case is how hard it is to gather additional information, not how much more information could be gathered in principle. In theory you could run all sorts of simulations that would give you informative data about an individual flip of the coin; it’s just that doing so would be really hard, and very few people are able to. I don’t think the entropy of the posterior captures this dynamic.