This sounds roughly right to me. Concretely, I don’t think this would catch people off guard very often. We have a lot of experience trying to model the thoughts of other people, in large part because we need to do this to communicate with them. I’d feel pretty comfortable basically saying, “I bet I could predict what Stuart will think in areas of anthropology, but I really don’t know his opinions on British politics”.
If forecasters are calibrated, then on average they shouldn’t be overconfident. There will predictably be pockets where they are, but I don’t think the damage caused there is particularly high.
That makes sense to me.
But it seems like you’re just saying that the issue I’m gesturing at shouldn’t cause miscalibration or overconfidence, rather than that it won’t reduce the resolution/accuracy or the practical usefulness of a system based on X predicting what Y will think?
That sounds right. However, I think that being properly calibrated is a really big deal, and a major benefit compared to other approaches.
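To make the calibration vs. resolution distinction here concrete, here is a minimal sketch (with made-up data) of the standard Murphy decomposition of the Brier score: a forecaster who always reports the base rate comes out well calibrated but with zero resolution, while one that tracks the underlying probabilities is both calibrated and informative.

```python
# Minimal sketch: Murphy decomposition of the Brier score into reliability
# (calibration), resolution, and uncertainty. All data here is made up.
import numpy as np

def brier_decomposition(probs, outcomes, n_bins=10):
    """Return (brier, reliability, resolution, uncertainty) for binary forecasts."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(probs)
    base_rate = outcomes.mean()
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)

    reliability = 0.0  # low is good: forecasts match observed frequencies
    resolution = 0.0   # high is good: forecasts separate events from non-events
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        f_b = probs[mask].mean()     # mean forecast in this bin
        o_b = outcomes[mask].mean()  # observed frequency in this bin
        reliability += mask.sum() * (f_b - o_b) ** 2
        resolution += mask.sum() * (o_b - base_rate) ** 2

    brier = np.mean((probs - outcomes) ** 2)
    return brier, reliability / n, resolution / n, base_rate * (1 - base_rate)

rng = np.random.default_rng(0)
true_p = rng.uniform(0.05, 0.95, 5000)   # hidden per-question probabilities
outcomes = rng.random(5000) < true_p
lazy = np.full(5000, true_p.mean())      # always predicts the base rate
sharp = true_p                           # tracks the underlying probabilities

for name, p in [("base-rate forecaster", lazy), ("sharper forecaster", sharp)]:
    bs, rel, res, unc = brier_decomposition(p, outcomes)
    print(f"{name}: Brier={bs:.3f}, reliability={rel:.3f}, resolution={res:.3f}")
# Both are (roughly) calibrated; only the second has useful resolution.
```

Both forecasters come out with reliability near zero, but only the second has meaningful resolution, which is the gap the question above is pointing at.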
On the part:
But I’d guess that more explicit, less “black box” approaches for predicting what Y will say will tend to either be more robust to distributional shift or more able to fail gracefully, such as recognising that uncertainty is now much higher and there’s a need to think more carefully.
If there are good additional approaches that are less black-box, I see them ideally being additions to this rough framework. There are methods to encourage discussion and information sharing, including with the Judge / the person whose beliefs are being predicted.
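One minimal, purely hypothetical sketch of “failing gracefully” in this sense: if a question falls outside the domains where a predictor has a track record, shrink its forecast toward maximum uncertainty rather than reporting it at face value. The domain check and the trust weight below are illustrative assumptions only.

```python
# Hypothetical sketch: widen an out-of-domain forecast toward 0.5 (maximum
# uncertainty) instead of trusting it at face value. The domain check and the
# 0.3 trust weight are illustrative assumptions, not part of any real system.
def hedged_forecast(raw_prob: float, question_domain: str, known_domains: set) -> float:
    trust = 1.0 if question_domain in known_domains else 0.3  # crude out-of-domain flag
    return trust * raw_prob + (1 - trust) * 0.5

known = {"anthropology", "forecasting"}
print(hedged_forecast(0.9, "anthropology", known))       # 0.9  -> in-domain, kept as-is
print(hedged_forecast(0.9, "british politics", known))   # 0.62 -> pulled toward 0.5
```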