Calibration Practice: Retrodictions on Metaculus

Edit: there is now a cool app for this: https://www.quantifiedintuitions.org/pastcasting

I’ve been working on improving my calibration. I practiced on things like the Credence Game and OpenPhil’s Calibration Game, and both were fine. But I felt like they didn’t quite capture the felt sense of making real-world predictions about topics that I really cared about.

After exploring the options, I started making predictions on PredictionBook – it was simple, lightweight, and let me enter dates for prediction-resolution using intuitive English phrases like “a week from now” or “August 1st” instead of having to go hunting through the date picker.

I’ve found it generally useful to notice when I’ve formed an implicit prediction (for example: “My coworker isn’t going to get their project done today”, or “Covid cases will be lower per capita in this particular town that I’m planning to move to”), then actually enter it into PredictionBook.

If I notice an implied prediction while I’m not near my computer, I sometimes make the prediction in my head as if I were at PredictionBook, watch it resolve, and then enter everything into PredictionBook after the fact to see how it affects my overall calibration. (One example, from a walk where I had gotten lost: “At this next road I’m supposed to turn right”, which turned out to be correct.)

Retrodictions on Metaculus

My prediction habits still didn’t add up to that many real-world predictions. But I found that a neat way to practice calibration on a bunch of real-world questions in a row is to go to Metaculus, pick a category with a lot of questions, and then filter by resolved questions.

By default this will show you the questions along with their resolutions, but you can hide the resolutions by creating a custom style using the Stylus browser extension (available for Chrome and Firefox). The relevant CSS is:

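/* Hide the second child (column) of the question table's header and rows */
.question-table__header > :nth-child(2), .question-table__row > :nth-child(2) {
    display: none;
}

/* Hide the third child (column) of the question table's header and rows */
.question-table__header > :nth-child(3), .question-table__row > :nth-child(3) {
    display: none;
}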

Some questions were ones I already knew the answers to, but for many of them there was still some uncertainty about exactly how an event had played out, enough that practicing calibration on them was useful.

Overall, the number of questions turned out not to be as large as I’d originally thought, but still enough to make it a good exercise.
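If you would rather score a batch of retrodictions by hand than enter each one into a site, a few lines of Python are enough. The following is only a minimal sketch under my own assumptions: the function names `brier_score` and `calibration_table` are made up for illustration, each prediction is assumed to be a (stated probability, did it happen) pair, and none of this is how PredictionBook or Metaculus compute their own statistics.

```python
from collections import defaultdict


def brier_score(predictions):
    """Mean squared gap between stated probability and outcome (0 is perfect)."""
    return sum((p - float(hit)) ** 2 for p, hit in predictions) / len(predictions)


def calibration_table(predictions):
    """Group predictions into nearest-10% buckets.

    Returns {bucket_percent: (observed_hit_rate, number_of_predictions)}.
    """
    buckets = defaultdict(list)
    for p, hit in predictions:
        # Illustrative bucketing choice: round to the nearest 10 percent.
        buckets[round(p * 10) * 10].append(hit)
    return {
        bucket: (sum(hits) / len(hits), len(hits))
        for bucket, hits in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Made-up example data: (stated probability, whether it came true)
    retrodictions = [(0.9, True), (0.9, True), (0.7, True), (0.7, False), (0.6, False)]
    print(brier_score(retrodictions))
    print(calibration_table(retrodictions))
```

In a well-calibrated record, the hit rate in each bucket sits close to the bucket's percentage, though with only a handful of questions per bucket the numbers will be noisy.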

Happy Calibrating.