This is not a human universal—people who put even a small amount of training into calibration can become very well calibrated very quickly. This is a sign that most Less Wrongers continue to neglect the very basics of rationality and are incapable of judging how much evidence they have on a given issue. Veterans of the site do no better than newbies on this measure.
Can someone who’s done calibration training comment on whether it really seems to represent the ability to “judge how much evidence you have on a given issue”, as opposed to the ability to accurately translate brain-based probability estimates into numerical probability estimates?
As I interpret it, the two are distinct but calibration training does both. That is, there’s a general “subjective feeling of certainty” -> “probability number” mapping being trained, and there’s also the field-specific skill of determining how much subjective certainty you should have in different cases, which probably has to be trained for every field independently. There appears to be some transfer, but I don’t think it’s as much as Yvain seems to be postulating.
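For concreteness, here’s one way to picture that first mapping being trained (my own sketch, nothing official from CFAR): log past predictions as (stated credence, was-correct) pairs, then estimate what each stated credence empirically corresponds to.

```python
from collections import defaultdict

def fit_recalibration(history):
    """history: list of (stated_credence, was_correct) pairs, e.g. (0.7, True).
    Returns a map from each stated credence to the empirical frequency
    of actually being correct at that credence."""
    tallies = defaultdict(lambda: [0, 0])  # credence -> [hits, total]
    for stated, correct in history:
        tallies[stated][0] += int(correct)
        tallies[stated][1] += 1
    return {c: hits / total for c, (hits, total) in tallies.items()}

# Toy example: someone who says "70%" but is right only 55% of the time.
history = [(0.7, True)] * 11 + [(0.7, False)] * 9
print(fit_recalibration(history))  # {0.7: 0.55} -- their "70%" means ~55%
```

The field-independence point then just says the correction table you learn on trivia may not be the one you need at work.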
Have you done calibration training? Do you recommend it? I think I remember someone from CFAR saying that it was kind of a niche skill, and to my knowledge it hasn’t been incorporated into their curriculum (although Andrew Critch created a calibration Android app?)
I’ve done a moderate amount of training. The credence game is fun enough to be worth an hour or two, but the case for putting serious effort in rests on the claim that calibration transfers (or that probabilistic judgments are common enough in your professional field that it makes sense to train those judgments directly).
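If I remember right, the game has you pick a credence from a fixed set of bands and scores you with something like a logarithmic rule, which is proper: reporting your honest credence maximizes expected points. A minimal sketch of that kind of scoring, in bits relative to a coin flip (the game’s actual constants and bands may differ):

```python
import math

def log_score(credence, correct):
    """Logarithmic scoring rule, in bits relative to a 50/50 guess.
    credence: stated probability (0.5 to 0.99) that your chosen answer
    is right. Proper, so honest credences maximize expected score."""
    p = credence if correct else 1.0 - credence
    return 1.0 + math.log2(p)

print(round(log_score(0.99, True), 2))   # 0.99  -- small gain for being right
print(round(log_score(0.99, False), 2))  # -5.64 -- big penalty for overconfidence
print(round(log_score(0.50, False), 2))  # 0.0   -- a coin flip scores nothing
```

The asymmetry is the point: being confidently wrong at 99% costs far more than being right at 99% gains, which is what trains down overconfidence.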
CFAR calibration games
I tried the game for a while… many of the questions are pretty hard IMO (especially the “which of these top-10 ranked things was ranked higher” ones), which makes it a bit difficult to learn to differentiate easy & hard questions.
Other calibration quizzes
I think that’s necessary? You want to have questions you can answer at every credence band between 50% and 99%, and that implies questions whose answers you’re only 60% sure of.
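To make that concrete: if you log your answers, you can tally coverage and hit rate per band, and a band with no questions in it is a band you never get to train. A minimal sketch (the logging format is made up, and the band set is the one I remember from the game, so treat both as assumptions):

```python
from collections import Counter

BANDS = (0.5, 0.6, 0.7, 0.8, 0.9, 0.99)

def calibration_report(answers):
    """answers: list of (band, was_correct) pairs. For each credence
    band, print how many questions landed there and the hit rate."""
    totals = Counter(band for band, _ in answers)
    hits = Counter(band for band, ok in answers if ok)
    for band in BANDS:
        if totals[band] == 0:
            print(f"{band:.0%}: no questions -- band untrained")
        else:
            print(f"{band:.0%}: {totals[band]} questions, "
                  f"{hits[band] / totals[band]:.0%} correct")

# Toy example: all hard questions, so nothing above the 60% band.
calibration_report([(0.5, True), (0.5, False), (0.6, True), (0.6, True)])
```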
Are these two things significantly different?
Imagine someone who acted appropriately towards particular risks (maybe not in artificial settings like betting, but someone who saved an appropriate proportion of their income and spent an appropriate amount of their free time on fun-but-dangerous things like skydiving), but who couldn’t translate their risk attitudes into numbers.
I’m having difficulty replacing your quotation with its referent. Could you describe an activity I could do that would demonstrate that I was judging how much evidence I have on a given issue?
Make money playing poker, maybe?
Ah! That sounds like a great one!
So, folks like Chris Ferguson are presumably doing both activities (judging how much evidence as well as accurately translating brain estimates to numerical estimates).
But if I go find a consistently successful poker player who does not translate brain estimates to numerical estimates, then I could see how that person does on calibration exercises. That sounds like a fun experiment. Now I just need to get the grant money …
Sidenote, but how would I narrow down to the successful poker players who don’t translate brain estimates to numerical estimates? I mean, I could always ask them up front, but how would I interpret an answer like “I don’t really use numbers all that much. I just go by feel.” Is that a brain that’s translating brain-based estimates to numerical estimates, then throwing away the numbers because of childhood mathematical scarring? Or is that a brain that’s doing something totally outside translating brain-based estimates to numerical estimates?