Tip: you can ask ChatGPT to include confidence scores in its replies
Interactions with ChatGPT can be customized persistently in the settings (under Custom Instructions); for example, you can add the following instruction: “include a confidence rating at the end of your response in the format ‘Confidence: X%’. If your confidence is below 80%, briefly explain why”.
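If you use the API rather than the ChatGPT interface, the same effect can be reproduced by placing the instruction in a system message. Here is a minimal sketch using the openai Python SDK (the model name and the sample question are illustrative choices, not prescriptions):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The same instruction you would put in the Custom Instructions
    # settings, supplied here as a system message.
    CONFIDENCE_INSTRUCTION = (
        "Include a confidence rating at the end of your response in the "
        "format 'Confidence: X%'. If your confidence is below 80%, "
        "briefly explain why."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": CONFIDENCE_INSTRUCTION},
            {"role": "user", "content": "What year was the Eiffel Tower completed?"},
        ],
    )
    print(response.choices[0].message.content)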
Here is a sample conversation demonstrating this and showing what ChatGPT has to say about its calibration:
Me:
Are you calibrated, by which I mean, when you output a confidence X as a percentage, are you right X times out of 100?
ChatGPT:
Reasoned about confidence calibration for 27 seconds.
I don’t have calibration in that strict sense. When I mention a confidence percentage, it’s a heuristic internal signal rather than a guarantee that I’ll be right exactly that proportion of the time. Research shows that language models often exhibit miscalibration, meaning a stated confidence (say 70%) doesn’t empirically translate to being correct 70% of the time. I’m about 90% confident in this explanation.
Me:
How bad is the miscalibration?
ChatGPT:
Thought for a second.
LLMs—especially those fine-tuned with RLHF—are often notably overconfident. For example, a model might claim 90% confidence while its actual accuracy is closer to 70–80%. Studies report expected calibration errors (ECE) of roughly 5–10% or more, and effective calibration methods can sometimes cut these errors by about half, underscoring that miscalibration is a significant issue. I’m about 90% confident in this summary, based on multiple recent calibration studies.
Me:
If you are predictably overconfident, and you know it, do you think you can notice that before stating your confidence and correct for it?
ChatGPT:
Reasoned about overconfidence and calibration adjustment for 4 seconds.
I don’t have a real-time “internal monitor” to detect my own overconfidence before stating it. Instead, I rely on calibration techniques—like temperature scaling and post-hoc adjustments—that help align my confidence levels with actual accuracy. In other words, I can be tuned to be less overconfident, but I don’t consciously correct my confidence on the fly. I’m about 90% confident in this explanation based on current research findings.
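The quantities ChatGPT refers to above are easy to make concrete. Expected calibration error (ECE) bins predictions by stated confidence and averages the gap between each bin’s stated confidence and its empirical accuracy; temperature scaling is the post-hoc fix it mentions. Here is a minimal sketch in Python with NumPy, using synthetic, hypothetical data that reproduces the overconfidence pattern from the conversation (a model saying “90%” while being right about 75% of the time):

    import numpy as np

    def expected_calibration_error(conf, correct, n_bins=10):
        # Bin predictions by stated confidence, then take the size-weighted
        # average gap between each bin's mean confidence and its accuracy.
        conf = np.asarray(conf, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (conf > lo) & (conf <= hi)
            if in_bin.any():
                gap = abs(conf[in_bin].mean() - correct[in_bin].mean())
                ece += in_bin.mean() * gap
        return ece

    def temperature_scale(conf, T):
        # Post-hoc recalibration: divide the logit of each confidence by a
        # temperature T; T > 1 softens overconfident probabilities.
        logits = np.log(conf / (1.0 - conf))
        return 1.0 / (1.0 + np.exp(-logits / T))

    # Synthetic data: stated confidence 90%, true accuracy ~75%.
    rng = np.random.default_rng(0)
    stated = np.full(1000, 0.90)
    outcomes = rng.random(1000) < 0.75

    print(f"ECE before scaling: {expected_calibration_error(stated, outcomes):.3f}")
    # logit(0.9) / 2 == logit(0.75), so T = 2 maps 90% down to exactly 75%.
    rescaled = temperature_scale(stated, T=2.0)
    print(f"ECE after T = 2:    {expected_calibration_error(rescaled, outcomes):.3f}")

In practice the temperature is not picked by hand as above; it is fitted on a held-out validation set, typically by minimizing negative log-likelihood, which is why ChatGPT describes it as a post-hoc adjustment rather than something it can do on the fly.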