Using a reasonable calibration method*, the set of predictions made in this thread will receive a better score than the set of those in the previous thread from 10 years ago (80%)
Nonetheless, lowering each stated confidence by a relative 10% (e.g. 70% becomes 63%) will yield better total calibration (60%)
* I don’t know the math for this, but I’m assuming there is a method that takes as input a set of predictions and their truth values and outputs a number that measures calibration and doesn’t predictably increase or decrease with more predictions.
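One standard candidate for such a score is the mean Brier score: lower is better, and because it’s a per-prediction average it doesn’t predictably drift as the number of predictions grows, though strictly it measures overall accuracy rather than calibration alone. A minimal sketch in Python, with made-up example data:

```python
# Minimal sketch of the mean Brier score (lower is better). Averaging over
# predictions keeps scores comparable between sets of different sizes.
def mean_brier_score(predictions):
    """predictions: list of (stated_probability, outcome) pairs, outcome 0 or 1."""
    return sum((p - outcome) ** 2 for p, outcome in predictions) / len(predictions)

# Made-up example: three resolved predictions.
example = [(0.80, 1), (0.60, 0), (0.35, 0)]
print(mean_brier_score(example))  # ~0.174
```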
I believe that the second one can technically lead to a paradox, but it’s highly unlikely for that to occur.
I don’t think you want to lower all predictions uniformly; some predictions here are stated with figures below 50%, for instance.
A better approach might be to reduce the log odds by some factor. Reducing them by 10% gives substantially smaller changes than your proposal does, so maybe reduce the log odds by 25%? If someone thinks X is 70% likely, that’s 7:3 odds; we’d reduce that to (7:3)^0.75, which is equivalent to a probability of about 65.4%. If they think X is 90% likely it becomes 83.9%; if they think X is 50% likely, it doesn’t change at all.
(Arguably simpler, but it seems less natural to me: reduce proportionally not the probability itself but its distance from 50%. Say we reduce that by 25%; then 70% becomes 50% + 0.75*20%, or 65%, quite similar to the fancy log-odds proposal in the previous paragraph. Things diverge more for more extreme probabilities: 90% turns into 50% + 0.75*40%, or 80%, and 100% turns into 87.5%, whereas the log-odds reduction leaves it unchanged.)
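A minimal sketch comparing the three adjustments mentioned so far (the relative 10% cut, the 25% log-odds shrinkage, and the 25% shrink-toward-50% variant); the function names are illustrative only:

```python
# Sketch of the three probability adjustments discussed above.
def shrink_relative(p, factor=0.10):
    """Original proposal: cut the stated probability by a relative factor."""
    return p * (1 - factor)

def shrink_log_odds(p, factor=0.25):
    """Reduce the log odds by `factor`, i.e. raise the odds to the power (1 - factor)."""
    if p in (0.0, 1.0):
        return p  # odds are 0 or infinite, so shrinking leaves them unchanged
    odds = (p / (1 - p)) ** (1 - factor)
    return odds / (1 + odds)

def shrink_toward_half(p, factor=0.25):
    """Shrink the distance from 50% by `factor`."""
    return 0.5 + (1 - factor) * (p - 0.5)

for p in (0.35, 0.50, 0.70, 0.90, 1.00):
    print(p, round(shrink_relative(p), 3),
          round(shrink_log_odds(p), 3),
          round(shrink_toward_half(p), 3))
# 0.70 -> 0.63 / 0.654 / 0.65; 0.90 -> 0.81 / 0.839 / 0.80; 1.00 -> 0.90 / 1.0 / 0.875
```

Note that both shrinkage methods pull sub-50% figures toward 50%, i.e. upward, which matters for the example below.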
That might or might not be a better proxy for the kind of overconfidence I’ve been meaning to predict.
The reason why it might not: my formulation relied on the idea that most people will state their predictions such that the positive statement corresponds to the smaller subset of possible futures. In that case, even if it’s a < 50% prediction, I would still suspect it’s overconfident. For example:
6) South Korea and Philippines change alliance from USA to China and support its 9-dash-line claims. Taiwan war with mainland China. 35%
Now I’ve no idea about the subject matter here, but across all such predictions, I predict that they’ll come true less often than the stated probability indicates. So if we use either of the methods you suggested here, the 35% figure moves upward rather than downward; however, I think it should go down.
Fair enough! I suspect some low-probability predictions will be of that sort and some of the other sort, in which case there’s no simple way to adjust for overconfidence.