I personally like this two-player calibration game, which I was introduced to by Paul Christiano at a meetup a couple of years ago:
Think of an unknown quantity (e.g. What year was the first woman elected to the US Congress?)
Player 1 comes up with a 50% confidence interval (I guess, technically, this is a credible interval...).
Player 2 chooses whether they want to take the “in” or the “out” side of the bet.
There’s no need to choose a minimum-width confidence interval (is there a technical term for that?). For example, “before 1920” would be an acceptable confidence interval for the question given above.
The big advantage of 50% confidence intervals over 90% confidence intervals (other than that they make a nice easy structure for the game) is that you get much faster feedback. 20 trials can meaningfully tell you that your 50% confidence intervals are off in one direction or the other. With 90% confidence intervals, 20 trials is enough to tell you if you’re overconfident, but it can’t tell you if you’re underconfident.
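To put rough numbers on the 20-trial claim, here is a minimal sketch of the binomial arithmetic. It assumes the trials are independent and that a perfectly calibrated player hits with probability exactly 50% or 90%. The cutoffs (5 and 15 hits out of 20) and the function names are just illustrative choices, not part of the game.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent trials with hit rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_most(k, n, p):
    """Probability of k or fewer hits."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def prob_at_least(k, n, p):
    """Probability of k or more hits."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n = 20

# 50% intervals: both tails are equally unlikely for a calibrated player,
# so a lopsided score in either direction is informative.
p_low_50 = prob_at_most(5, n, 0.5)     # too few hits (overconfident), about 0.02
p_high_50 = prob_at_least(15, n, 0.5)  # too many hits (underconfident), about 0.02

# 90% intervals: a low score is surprising, but a perfect score is not.
p_low_90 = prob_at_most(15, n, 0.9)    # too few hits (overconfident), about 0.04
p_all_90 = prob_at_least(20, n, 0.9)   # 20 out of 20, about 0.12

print(f"50%: P(<=5 hits) = {p_low_50:.3f}, P(>=15 hits) = {p_high_50:.3f}")
print(f"90%: P(<=15 hits) = {p_low_90:.3f}, P(20 hits) = {p_all_90:.3f}")
```

Under these assumptions, a calibrated player with 90% intervals gets all 20 right about one time in eight, so even the most extreme possible result in the underconfident direction is weak evidence.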
The big disadvantage is that 50% confidence intervals somehow don’t feel as useful as 90% confidence intervals. I’m not sure this is really true, as there’s nothing special about 90% (by my reckoning 50% is about as far away from 90% as 90% is from 98%), but it feels true. Of course, it’s pretty trivial to change the game so it works with intervals other than 50%, but you have to play longer, and it gets more complicated.
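One way to make the “about as far away” comparison concrete is to measure the distance between probabilities in log-odds. That is an assumption about what “reckoning” means here, and the `log_odds` helper below is only an illustrative sketch.

```python
from math import log

def log_odds(p):
    """Natural log of the odds p : (1 - p)."""
    return log(p / (1 - p))

# Distance between confidence levels on the log-odds scale.
gap_50_to_90 = log_odds(0.90) - log_odds(0.50)  # odds go from 1:1 to 9:1
gap_90_to_98 = log_odds(0.98) - log_odds(0.90)  # odds go from 9:1 to 49:1

print(f"50% -> 90%: {gap_50_to_90:.2f}")
print(f"90% -> 98%: {gap_90_to_98:.2f}")
```

On this scale the two gaps come out at roughly 2.2 and 1.7, so they are in the same ballpark rather than exactly equal.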