Instead of scoring people against expert assessments, you could also score them against the final aggregated and extremized distribution.
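To make that concrete, here's a rough sketch of what scoring against an extremized aggregate might look like for a single binary question. Everything specific here is my own assumption for illustration, not something the comment pins down: aggregating by mean log-odds, the extremizing factor `alpha=2.0`, using expected log score as the scoring rule, and the forecaster names and numbers.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))


def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def extremized_aggregate(probs, alpha=2.0):
    """Average forecasts in log-odds space, then push away from 0.5.

    alpha is a hypothetical extremizing factor; alpha=1.0 means no extremizing.
    """
    mean_log_odds = sum(logit(p) for p in probs) / len(probs)
    return sigmoid(alpha * mean_log_odds)


def score_against(forecast, target):
    """Expected log score of `forecast` if the true probability were `target`.

    Higher is better; it is maximized when forecast == target.
    """
    return target * math.log(forecast) + (1 - target) * math.log(1 - forecast)


# Made-up forecasts for one binary question (illustrative only).
forecasts = {"alice": 0.6, "bob": 0.7, "carol": 0.8}

target = extremized_aggregate(list(forecasts.values()))
scores = {name: score_against(p, target) for name, p in forecasts.items()}
```

With these made-up numbers the extremized aggregate ends up more confident than any individual forecast, so whoever was most confident in the majority direction scores best. That's a property of extremizing worth keeping in mind if you score people this way.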
I think an efficient use of expert assessments would be for experts to see the aggregate and then adjust it as necessary, while trying not to do much original research. I just wrote a more recent shortform post about this.
One issue with any framework like this is that general calibration may be very different from calibration at the tails.
I think that we can get calibration to be as good as experts can figure out, and that could be enough to be really useful.
Good points!
Also, thanks for the link, that’s pretty neat.