Calibration Practice: Retrodictions on Metaculus
Edit: there is now a cool app for this over here https://www.quantifiedintuitions.org/pastcasting
I’ve been working on improving my calibration. I practiced on things like the Credence Game and OpenPhil’s Calibration Game, and both were fine. But I felt like they didn’t quite capture the felt senses from making real-world predictions about topics that I really cared about.
After exploring the options, I started making predictions on PredictionBook – it was simple, lightweight, and let me enter dates for prediction-resolution using intuitive English phrases like “a week from now” or “August 1st” instead of having to go hunting through the date picker.
I’ve found it generally useful to notice when I’ve formed an implicit prediction (for example: “My coworker isn’t going to get their project done today”, or “Covid cases will be lower-per-capita at this particular town that I’m planning to move to”), then actually enter it into PredictionBook.
If I notice an implied prediction while I’m not near my computer, I sometimes make the prediction as if I were at PredictionBook, wait to see it resolve, and then enter everything into PredictionBook after the fact to see how it affects my overall calibration. (One example, from when I had taken a walk and gotten lost: “At this next road I’m supposed to turn right”, which turned out to be correct.)
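For anyone who wants to check their calibration outside of PredictionBook, here is a rough sketch in Python of the kind of bookkeeping involved. It is only an illustration, not how PredictionBook computes its statistics: the function name `calibration_table`, the ten-bucket default, and the sample log are all made up for this example.

```python
from collections import defaultdict


def calibration_table(predictions, n_buckets=10):
    """Bucket (stated_probability, came_true) pairs by stated probability
    and compare the average stated confidence with the observed hit rate.

    Hypothetical helper for illustration; not PredictionBook's own method.
    """
    buckets = defaultdict(list)
    for prob, came_true in predictions:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
        # A probability of exactly 1.0 goes into the top bucket.
        index = min(int(prob * n_buckets), n_buckets - 1)
        buckets[index].append((prob, came_true))

    rows = []
    for index in sorted(buckets):
        entries = buckets[index]
        count = len(entries)
        rows.append({
            "bucket_low": index / n_buckets,
            "bucket_high": (index + 1) / n_buckets,
            "count": count,
            "mean_confidence": sum(p for p, _ in entries) / count,
            "hit_rate": sum(1 for _, hit in entries if hit) / count,
        })
    return rows


# Made-up log of predictions: (stated probability, did it come true?)
sample_log = [
    (0.65, True),   # e.g. "At this next road I'm supposed to turn right"
    (0.65, False),
    (0.95, True),
    (0.95, True),
]

for row in calibration_table(sample_log):
    print(row)
```

If you are well calibrated, `hit_rate` should sit close to `mean_confidence` in each bucket once the bucket holds a reasonable number of predictions. With only a handful of entries per bucket the comparison is very noisy.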
Retrodictions on Metaculus
My prediction habits still didn’t add up to that many real-world predictions. But I found that a neat way to practice calibration on a bunch of real-world questions in a row is to go to Metaculus, pick a category with a lot of questions, and then filter by resolved questions.
By default this will show you questions along with the resolution, but you can hide the resolutions by creating a custom style using the Stylus browser extension (Chrome / Firefox). The relevant CSS is:
.question-table__header > :nth-child(2), .question-table__row > :nth-child(2) {
  display: none;
}
.question-table__header > :nth-child(3), .question-table__row > :nth-child(3) {
  display: none;
}
Some questions were ones I already knew the answers to, but for many of them there was still some uncertainty about exactly how an event had played out, enough that practicing calibration on them was useful.
Overall, the number of questions turned out not to be as large as I’d originally thought, but still enough to make it a good exercise.
Happy Calibrating.
I’ve touched on this before, but it would be wise to take your meta-certainty into account when calibrating. It wouldn’t be hard for me to claim 99.9% accurate calibration by just making a bunch of very easy predictions (an extreme example would be buying a bunch of different dice and making predictions about how they’re going to roll). My post goes into more detail, but the TL;DR is that by trying to predict how accurate your prediction is going to be, you can start to distinguish between “harder” and “easier” phenomena. This makes it easier to compare different people’s calibration and lets you check how good you really are at making predictions.