Do you think this analysis accounts for the fact that a well-calibrated Beauty answers “1/3”? Do you think there’s a problem with our methods of judging calibration?