Calibration

Tag · Last edit: 2 Apr 2025 9:53 UTC by gustaf

Someone is well-calibrated if the things they predict with X% confidence in fact happen X% of the time. Importantly, calibration is not the same as accuracy. Calibration is about accurately assessing how reliable your predictions are, not about making good predictions. Person A, whose predictions are only marginally better than chance (60% of them come true when choosing between two options) and who is precisely 60% confident in each choice, is perfectly calibrated. In contrast, Person B, who is 99% confident in their predictions and right 90% of the time, is more accurate than Person A but less well-calibrated.
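The Person A / Person B comparison can be checked numerically. The following is a minimal sketch, not a standard scoring rule: the function names (`calibration_table`, `calibration_gap`, `accuracy`) and the 100-prediction records are assumptions made up for illustration. It groups predictions by stated confidence and compares each stated confidence with the observed hit rate.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Group (confidence, was_correct) pairs by stated confidence.

    Returns {confidence: (observed_hit_rate, count)}.
    """
    buckets = defaultdict(list)
    for confidence, was_correct in predictions:
        buckets[confidence].append(was_correct)
    return {
        conf: (sum(outcomes) / len(outcomes), len(outcomes))
        for conf, outcomes in buckets.items()
    }

def calibration_gap(predictions):
    """Count-weighted mean of |stated confidence - observed hit rate|."""
    table = calibration_table(predictions)
    total = sum(count for _, count in table.values())
    return sum(
        abs(conf - rate) * count for conf, (rate, count) in table.items()
    ) / total

def accuracy(predictions):
    """Fraction of predictions that came true."""
    return sum(correct for _, correct in predictions) / len(predictions)

# Hypothetical records: Person A states 60% and is right 60 times out of 100;
# Person B states 99% and is right 90 times out of 100.
person_a = [(0.60, True)] * 60 + [(0.60, False)] * 40
person_b = [(0.99, True)] * 90 + [(0.99, False)] * 10

for name, preds in [("A", person_a), ("B", person_b)]:
    print(name,
          "accuracy:", round(accuracy(preds), 2),
          "calibration gap:", round(calibration_gap(preds), 2))
```

Under these assumed records, Person B has the higher accuracy (0.9 against 0.6) but also the larger gap between stated and observed confidence (about 0.09 against 0), which is the distinction the paragraph draws.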

Being well-calibrated has value for rationalists separately from accuracy. Among other things, it lets you make good bets and decisions, lets you communicate information helpfully to others if they know you to be well-calibrated (see Group Rationality), and helps you prioritize which information is worth acquiring.

Note that any expression of quantified confidence in a belief can be well or poorly calibrated. For example, calibration applies to whether a person’s 95% confidence intervals capture the true outcome 95% of the time.
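As a rough illustration of the interval case, the sketch below counts how often stated intervals contain the true value. The helper name `interval_coverage` and the five intervals and outcomes are hypothetical; a real check would need many more intervals before the observed rate says much.

```python
def interval_coverage(intervals, true_values):
    """Fraction of (low, high) intervals that contain the matching true value."""
    if len(intervals) != len(true_values):
        raise ValueError("need one true value per interval")
    hits = sum(
        low <= truth <= high
        for (low, high), truth in zip(intervals, true_values)
    )
    return hits / len(intervals)

# Hypothetical data: five intervals stated at 95% confidence, and what happened.
intervals = [(10, 20), (0, 5), (100, 300), (1.5, 2.5), (40, 60)]
truths = [12, 7, 250, 2.0, 41]

coverage = interval_coverage(intervals, truths)
print(f"stated 95%, observed {coverage:.0%}")
```

If the observed coverage stays well below the stated level over many intervals, the intervals are too narrow (overconfident). If it stays well above, they are too wide (underconfident).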

List of Calibration Exercises
Based on this post. Todo: find more, sort them, and write a new post for visibility in search engines?

Exercises that are dead/unmaintained

https://www.metaculus.com/tutorials (dead link)
http://web.archive.org/web/20100529074053/http://www.acceleratingfuture.com/tom/?p=129
http://credencecalibration.com (dead link)
https://calibration.lazdini.lv (dead link)
http://web.archive.org/web/20161020032514/http://calibratedprobabilityassessment.org/
https://predictionbook.com/credence_games/try (deprecated; see also this GitHub issue)
https://calibration-training.netlify.app (dead link)

List of Probability Calibration Exercises
Isaac King · 23 Jan 2022 2:12 UTC · 76 points · 12 comments · 1 min read · LW link

Information Charts
Rafael Harth · 13 Nov 2020 16:12 UTC · 29 points · 6 comments · 13 min read · LW link

Anki with Uncertainty: Turn any flashcard deck into a calibration training tool
Sage Future · 22 Mar 2023 17:26 UTC · 14 points · 2 comments · 1 min read · LW link (www.quantifiedintuitions.org)

Use Normal Predictions
Jan Christian Refsgaard · 9 Jan 2022 15:01 UTC · 149 points · 67 comments · 6 min read · LW link

Concrete benefits of making predictions
Jonny Spicer and Sage Future · 17 Oct 2024 14:23 UTC · 36 points · 5 comments · 6 min read · LW link (fatebook.io)

Calibrate your self-assessments
Scott Alexander · 9 Oct 2011 23:26 UTC · 102 points · 122 comments · 6 min read · LW link

The Sin of Underconfidence
Eliezer Yudkowsky · 20 Apr 2009 6:30 UTC · 119 points · 187 comments · 6 min read · LW link

Calibration Trivia
Screwtape · 4 Aug 2022 22:31 UTC · 12 points · 9 comments · 4 min read · LW link

I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!
mako yass · 2 Aug 2024 22:35 UTC · 24 points · 2 comments · 5 min read · LW link

Hammertime Day 9: Time Calibration
alkjash · 7 Feb 2018 1:40 UTC · 20 points · 11 comments · 2 min read · LW link (radimentary.wordpress.com)

Giving calibrated time estimates can have social costs
Alex_Altair · 3 Apr 2022 21:23 UTC · 99 points · 16 comments · 5 min read · LW link

Procrastination Drill
silentbob · 28 Jul 2025 20:54 UTC · 48 points · 8 comments · 2 min read · LW link

Introducing Pastcasting: A tool for forecasting practice
Sage Future · 11 Aug 2022 17:38 UTC · 95 points · 10 comments · 2 min read · LW link · 2 reviews

Paper: Forecasting world events with neural nets
Owain_Evans, Dan H and Joe Kwon · 1 Jul 2022 19:40 UTC · 39 points · 3 comments · 4 min read · LW link

Aumann Agreement Game
abramdemski · 9 Oct 2015 17:14 UTC · 34 points · 17 comments · 1 min read · LW link

Credence Calibration Icebreaker Game
Ruby · 14 Aug 2014 21:01 UTC · 42 points · 1 comment · 2 min read · LW link

Cambridge Prediction Game
NoSignalNoNoise · 25 Jan 2020 3:57 UTC · 13 points · 3 comments · 2 min read · LW link

Simultaneous Overconfidence and Underconfidence
abramdemski · 3 Jun 2015 21:04 UTC · 37 points · 6 comments · 5 min read · LW link

Takeaways from calibration training
Olli Järviniemi · 29 Jan 2023 19:09 UTC · 45 points · 2 comments · 3 min read · LW link · 1 review

The Bayesian Tyrant
abramdemski · 20 Aug 2020 0:08 UTC · 144 points · 21 comments · 6 min read · LW link · 1 review

Paper: Teaching GPT3 to express uncertainty in words
Owain_Evans · 31 May 2022 13:27 UTC · 97 points · 7 comments · 4 min read · LW link

Suspiciously balanced evidence
gjm · 12 Feb 2020 17:04 UTC · 50 points · 24 comments · 4 min read · LW link

Qualitatively Confused
Eliezer Yudkowsky · 14 Mar 2008 17:01 UTC · 71 points · 85 comments · 4 min read · LW link

What is calibration?
AlexMennen · 13 Mar 2023 6:30 UTC · 27 points · 1 comment · 4 min read · LW link

Calibration Test with database of 150,000+ questions
Nanashi · 14 Mar 2015 11:22 UTC · 54 points · 32 comments · 1 min read · LW link

How the Equivalent Bet Test Actually Works
Erich_Grunewald · 18 Dec 2021 11:17 UTC · 4 points · 1 comment · 4 min read · LW link (www.erichgrunewald.com)

A Subtle Selection Effect in Overconfidence Studies
Kevin Dorst · 3 Jul 2023 14:43 UTC · 24 points · 0 comments · 6 min read · LW link (kevindorst.substack.com)

We Change Our Minds Less Often Than We Think
Eliezer Yudkowsky · 3 Oct 2007 18:14 UTC · 114 points · 120 comments · 1 min read · LW link

Do LLMs know what they’re capable of? Why this matters for AI safety, and initial findings
Casey Barkan, Sid Black and Oliver Sourbut · 13 Jul 2025 19:54 UTC · 50 points · 4 comments · 18 min read · LW link

The Case for Overconfidence is Overstated
Kevin Dorst · 28 Jun 2023 17:21 UTC · 50 points · 13 comments · 8 min read · LW link (kevindorst.substack.com)

Prediction Contest 2018
jbeshir · 30 Apr 2018 18:26 UTC · 9 points · 4 comments · 3 min read · LW link

Kurzweil’s predictions: good accuracy, poor self-calibration
Stuart_Armstrong · 11 Jul 2012 9:55 UTC · 50 points · 39 comments · 9 min read · LW link

Fair Collective Efficient Altruism
Jobst Heitzig · 25 Nov 2022 9:38 UTC · 2 points · 1 comment · 5 min read · LW link

Overconfident Pessimism
lukeprog · 24 Nov 2012 0:47 UTC · 37 points · 38 comments · 4 min read · LW link

Placing Yourself as an Instance of a Class
abramdemski · 3 Oct 2017 19:10 UTC · 36 points · 5 comments · 3 min read · LW link

Test Your Calibration!
alyssavance · 11 Nov 2009 22:03 UTC · 25 points · 34 comments · 2 min read · LW link

Advancing Certainty
komponisto · 18 Jan 2010 9:51 UTC · 44 points · 110 comments · 4 min read · LW link

ChatGPT challenges the case for human irrationality
Kevin Dorst · 22 Aug 2023 12:46 UTC · 3 points · 10 comments · 7 min read · LW link (kevindorst.substack.com)

Picking favourites is hard
dkl9 · 4 Dec 2024 20:46 UTC · 11 points · 3 comments · 1 min read · LW link (dkl9.net)

Prediction Contest 2018: Scores and Retrospective
jbeshir · 27 Jan 2019 17:20 UTC · 28 points · 5 comments · 1 min read · LW link

Markets Are Information—Beating the Sportsbooks at Their Own Game
JJXW · 7 Nov 2024 20:58 UTC · 9 points · 1 comment · 2 min read · LW link (thehobbyist.substack.com)

Horrible LHC Inconsistency
Eliezer Yudkowsky · 22 Sep 2008 3:12 UTC · 34 points · 33 comments · 1 min read · LW link

Proposal: Tune LLMs to Use Calibrated Language
OneManyNone · 7 Jun 2023 21:05 UTC · 9 points · 0 comments · 5 min read · LW link

Breaking Rank (Calibration Game)
jenn · 7 Mar 2023 15:40 UTC · 11 points · 0 comments · 2 min read · LW link

Climate-contingent Finance, and A Generalized Mechanism for X-Risk Reduction Financing
John Nay · 26 Sep 2022 13:23 UTC · 0 points · 2 comments · 26 min read · LW link

Lawful Uncertainty
Eliezer Yudkowsky · 10 Nov 2008 21:06 UTC · 143 points · 57 comments · 4 min read · LW link

Say It Loud
Eliezer Yudkowsky · 19 Sep 2008 17:34 UTC · 62 points · 20 comments · 2 min read · LW link

Behavior Cloning is Miscalibrated
leogao · 5 Dec 2021 1:36 UTC · 77 points · 3 comments · 3 min read · LW link

[Question] How to best measure if and to what degree you’re too pessimistic or too optimistic?
CstineSublime · 31 Mar 2024 0:57 UTC · 4 points · 3 comments · 1 min read · LW link

Bayes-Up: An App for Sharing Bayesian-MCQ
Louis Faucon · 6 Feb 2020 19:01 UTC · 53 points · 9 comments · 1 min read · LW link

[Question] Is there a.. more exact.. way of scoring a predictor’s calibration?
mako yass · 16 Jan 2019 8:19 UTC · 22 points · 6 comments · 1 min read · LW link

Illusion of Transparency: Why No One Understands You
Eliezer Yudkowsky · 20 Oct 2007 23:49 UTC · 181 points · 52 comments · 3 min read · LW link

RFC on an open problem: how to determine probabilities in the face of social distortion
ialdabaoth · 7 Oct 2017 22:04 UTC · 6 points · 3 comments · 2 min read · LW link

Social Calibration
SimulatedCrow · 20 May 2021 23:22 UTC · 3 points · 4 comments · 4 min read · LW link

[Question] Calibration training for ‘percentile rankings’?
david reinstein · 14 Sep 2024 21:51 UTC · 3 points · 0 comments · 2 min read · LW link

[Question] Are (Motor)sports like F1 a good thing to calibrate estimates against?
CstineSublime · 24 Mar 2024 9:07 UTC · 4 points · 2 comments · 1 min read · LW link

Introducing Fatebook: the fastest way to make and track predictions
Adam B and Sage Future · 11 Jul 2023 15:28 UTC · 132 points · 41 comments · 1 min read · LW link (fatebook.io) · 2 reviews

Calibrate—New Chrome Extension for hiding numbers so you can guess
chanamessinger · 7 Oct 2022 11:21 UTC · 59 points · 16 comments · 1 min read · LW link (chrome.google.com)

A Motorcycle (and Calibration?) Accident
boggler · 18 Mar 2018 22:21 UTC · 25 points · 11 comments · 2 min read · LW link

[Question] Has Someone Checked The Cold-Water-In-Left-Ear Thing?
Maloew · 28 Dec 2024 20:15 UTC · 11 points · 0 comments · 1 min read · LW link

Calibration for continuous quantities
Cyan · 21 Nov 2009 4:53 UTC · 30 points · 13 comments · 3 min read · LW link

Anthropically Blind: the anthropic shadow is reflectively inconsistent
Christopher King · 29 Jun 2023 2:36 UTC · 43 points · 40 comments · 10 min read · LW link

Raising the forecasting waterline (part 1)
Morendil · 9 Oct 2012 15:49 UTC · 51 points · 107 comments · 6 min read · LW link

Outrangeous (Calibration Game)
jenn · 7 Mar 2023 15:29 UTC · 38 points · 3 comments · 9 min read · LW link

Prediction and Calibration—Part 1
Jan Christian Refsgaard · 8 May 2021 19:48 UTC · 6 points · 10 comments · 4 min read · LW link (www.badprior.com)

[Question] What are good ML/AI related prediction / calibration questions for 2019?
james_t · 4 Jan 2019 2:40 UTC · 19 points · 4 comments · 2 min read · LW link

Quantified Intuitions: An epistemics training website including a new EA-themed calibration app
Sage Future and elifland · 20 Sep 2022 22:25 UTC · 28 points · 2 comments · 2 min read · LW link

Why I’m Pouring Cold Water in My Left Ear, and You Should Too
Maloew · 24 Jan 2025 23:13 UTC · 12 points · 0 comments · 2 min read · LW link

How to reach 80% of your goals. Exactly 80%.
Bart Bussmann · 10 Oct 2020 17:33 UTC · 36 points · 11 comments · 1 min read · LW link

the gears to ascension · 12 Sep 2023 10:50 UTC · 2 points
@Jim Fisher what’s your reasoning for removing the archive.org links?
- Jim Fisher · 25 Sep 2023 18:44 UTC · 1 point
  I don’t think I removed any archive.org links. My intention was to remove dead links, and to make it clearer which links are worth visiting. Please revert if I’ve made a mistake or you disagree with my intention.