Suppose you and I have two different models, and my model is less wrong than yours. Suppose my model assigns a 40% probability to event X and yours assigns 60%; we disagree and bet, and event X happens. If I had an oracle over the true distribution of X, my write-up would consist of saying “this falls into the 40% of cases, as predicted by my model”, which doesn’t seem very useful. In the absence of an oracle, I would end up writing up praise for, and updating towards, your more wrong model, which is obviously not what we want.
This approach might lead to over-updating on single bets. You’d need to record your bets and the odds on them over time to see how calibrated you were; if your calibration over time is poor, then you should update your model. Perhaps we can weaken the suggestion in the post to writing a post-mortem on why you may be wrong. Then, reflecting over multiple bets, you could try to tease out common patterns and deficits in your model-making.
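To make the record-keeping concrete, here is a minimal sketch of what logging bets and checking calibration over time could look like (the record format and function are my own invention, not anything from the post):

```python
# Hypothetical log: (stated probability, outcome) pairs for resolved bets.
# Bucket the forecasts and compare each bucket's average stated probability
# with the frequency at which those events actually happened.
from collections import defaultdict

def calibration_table(records, n_buckets=10):
    buckets = defaultdict(list)
    for p, outcome in records:  # outcome: 1 if the event happened, else 0
        buckets[min(int(p * n_buckets), n_buckets - 1)].append((p, outcome))
    for b in sorted(buckets):
        pairs = buckets[b]
        stated = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(o for _, o in pairs) / len(pairs)
        print(f"stated ~{stated:.2f}: happened {observed:.2f} (n={len(pairs)})")

calibration_table([(0.4, 1), (0.6, 0), (0.7, 1), (0.8, 1), (0.3, 0)])
```

If the stated and observed columns diverge persistently across many bets, that is the signal to update the model, rather than reacting to any single outcome.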
“In the absence of an oracle, I would end up writing up praise for, and updating towards, your more wrong model, which is obviously not what we want.”
Perhaps I’m missing something, but I think that’s exactly what we want. It leads to eventual consistency / improved estimates of odds, which is all we can look for without oracles or in the presence of noise.
First, the strength of the priors will limit the size of the bettors’ updates. Let’s say we both used beta distributions and had weak beliefs: your prior was Beta(4,6), and mine was Beta(6,4). These get updated to Beta(5,6) and Beta(7,4). That sounds fine: you weren’t very sure initially, and you still won’t over-correct much. If the priors are stronger, say Beta(12,18) and Beta(18,12), the updates are smaller as well, as they should be, given our clearer world models and lesser willingness to abandon them on weak evidence.
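Spelling out the arithmetic behind those numbers (the posterior mean after adding one observed success to each Beta prior):

```python
# Conjugate Beta update after observing event X once: Beta(a, b) -> Beta(a + 1, b).
def beta_mean(a, b):
    return a / (a + b)

for label, (a, b) in [("weak, yours", (4, 6)), ("weak, mine", (6, 4)),
                      ("strong, yours", (12, 18)), ("strong, mine", (18, 12))]:
    print(f"{label}: Beta({a},{b}) mean {beta_mean(a, b):.2f} -> "
          f"Beta({a + 1},{b}) mean {beta_mean(a + 1, b):.2f}")

# weak:   0.40 -> 0.45 and 0.60 -> 0.64
# strong: 0.40 -> 0.42 and 0.60 -> 0.61  (smaller moves, as claimed)
```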
Second, we can look at the outside observer’s ability to update. If the expectations are 40% vs. 60%, then unless there are very strong priors, I would assume neither side is interested in making huge bets or giving large odds (if the bet happens at all, given transaction costs, etc.). This should implicitly limit the size of the update other people make from such bets.
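As a rough way to quantify “neither side wants a huge bet”: the Kelly criterion (my framing, not mentioned above) ties the stake worth making to how far your probability is from the implied odds. A sketch, assuming a hypothetical bet struck at even money:

```python
def kelly_fraction(p, b=1.0):
    """Kelly-optimal fraction of bankroll to stake, given win probability p
    and net odds b (profit per unit staked on a win)."""
    return max(0.0, (p * (b + 1) - 1) / b)

# 40% vs. 60% disagreement at even money: each side backs its own favourite
# and believes it wins with probability 0.6, so each stakes at most:
print(kelly_fraction(0.6))   # 0.2 (less with fractional Kelly and fees)

# 1% vs. 99% disagreement at even money: both would bet nearly everything.
print(kelly_fraction(0.99))  # 0.98
```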
Another idea on this: both sides could do pre-mortems (“if I lose, …”). They could look back at these when doing post-mortems. Obviously this increases the effort involved.
Seems similar to Murphyjitsu
Yeah, pre-mortem is another name for pre-hindsight, and murphyjitsu is just the idea of alternating between making pre-mortems and fixing your plans to prevent whatever problem you envisioned in the pre-mortem.
I really like the idea of doing a pre-mortem here.
Thinking about this makes me think people should record not just their bets but also the probabilities. If I think the probability is 1% and you think it’s 99%, then one of us is going to make a fairly big update. If you think it’s 60% and I think it’s 50%, yeah, not so much. As a rough rule of thumb, anyway. (Obviously I could be super confident in a 1% estimate in a similar way to how you describe being super confident in a 40% one.)
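One way to make that rule of thumb precise (my formalization, not the commenter’s): treat the two recorded probabilities as the two models’ likelihoods for X, so the outcome multiplies the odds between the models by a Bayes factor:

```python
def posterior_model_odds(p_a, p_b, x_happened, prior_odds=1.0):
    """Odds favoring model A over model B once the bet resolves.
    p_a, p_b: each model's stated probability that X happens."""
    likelihood_a = p_a if x_happened else 1 - p_a
    likelihood_b = p_b if x_happened else 1 - p_b
    return prior_odds * likelihood_a / likelihood_b

print(posterior_model_odds(0.99, 0.01, True))  # 99.0 -- a big update
print(posterior_model_odds(0.60, 0.50, True))  # 1.2  -- barely an update
```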
OTOH, I think in many cases, by the time the bet is resolved there will also be a lot of other relevant evidence bearing on the questions behind the bet. So the warranted update will actually be much larger than would be justified by the bet’s outcome alone. In other words, if two Bayesians with different world-models make a bet about something far in the future, then by the time the bet resolves they’ll often have seen much more decisive evidence deciding between the two models (not necessarily pointing in the same direction as the bet’s resolution).
Still, yeah, I agree with your concern.