For calibration, it isn’t very useful to score events at 50%
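People answering a diverse set of questions are trivially calibrated at 50%: each question could just as well have been phrased as its negation, so about half of the 50% predictions come true regardless of skill. But you are answering the same question over and over ("will it pass the test?"), always framed the same way, so you might be systematically overconfident. In your case, scoring the 50% predictions is useful.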
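To make the difference concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the function names, the coin flip used to model arbitrary question framing, and the 30% true pass rate are all made up, not taken from anyone's real data.

```python
import random


def hit_rate_diverse(n, rng):
    """Fraction of 50% predictions that come true over n unrelated questions.

    Assumption: each question has its own true probability, and whether it is
    phrased as the event or its negation is arbitrary (modeled as a coin flip).
    """
    hits = 0
    for _ in range(n):
        p_true = rng.random()              # hypothetical true probability
        outcome = rng.random() < p_true    # did the event happen?
        if rng.random() < 0.5:             # question was phrased as the negation
            outcome = not outcome
        hits += outcome
    return hits / n


def hit_rate_repeated(n, p_true, rng):
    """Fraction of 50% predictions that come true when the same question,
    always framed the same way, is asked n times."""
    return sum(rng.random() < p_true for _ in range(n)) / n


rng = random.Random(0)
diverse = hit_rate_diverse(100_000, rng)
repeated = hit_rate_repeated(100_000, 0.3, rng)  # assumed 30% true pass rate

print(f"diverse questions, predicted 50%: {diverse:.3f} came true")
print(f"same question,     predicted 50%: {repeated:.3f} came true")
```

Under these assumptions the diverse case lands near 50% no matter what the true probabilities are, while the repeated case lands near the true pass rate, so a gap from 50% there tells you something real about your confidence.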