Couldn’t you max out on calibration by guessing 50% for everything?
Yes. (To nitpick, with existing platforms one would max out calibration in expectation by guessing 17.5%, 39%, or 29%.)
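To make that concrete, here is a rough sketch (made-up numbers, not any platform's actual scoring code; the helper names `calibration_error` and `brier` and the 29%-ish base rate are just for illustration). A forecaster who predicts the pool's base rate on every question looks perfectly calibrated, while a proper scoring rule like the Brier score still separates them from someone with real information.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool of 10,000 resolved questions; the overall base rate of
# "yes" resolutions comes out around 29% (one of the figures above).
true_p = rng.beta(2, 5, size=10_000)
outcomes = (rng.random(10_000) < true_p).astype(float)

base_rate = outcomes.mean()
lazy = np.full_like(true_p, base_rate)      # predicts the base rate on everything
informed = np.clip(true_p + rng.normal(0.0, 0.05, true_p.size), 0.01, 0.99)

def calibration_error(preds, outcomes, bins=10):
    """Mean |average forecast - observed frequency| over probability buckets."""
    idx = np.minimum((preds * bins).astype(int), bins - 1)
    errs = [abs(preds[idx == b].mean() - outcomes[idx == b].mean())
            for b in range(bins) if (idx == b).any()]
    return float(np.mean(errs))

def brier(preds, outcomes):
    """Proper scoring rule: lower is better."""
    return float(np.mean((preds - outcomes) ** 2))

for name, preds in [("base-rate guesser", lazy), ("informed forecaster", informed)]:
    print(f"{name:18s} calibration error {calibration_error(preds, outcomes):.3f}  "
          f"Brier {brier(preds, outcomes):.3f}")
```

The base-rate guesser's calibration error is essentially zero, but its Brier score is noticeably worse than the informed forecaster's, which is the gap a proper scoring rule is meant to capture.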
Ideally we’d want to use a proper scoring rule, but this brings up other Goodharting issues: if people can select the questions they predict on, this incentivizes predicting on easier questions, and people who have made very few forecasts will often appear at the top of the ranking. So we’d like to use something like a credibility formula; I plan on writing something up on this.
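On the ranking problem specifically, one shape such a credibility formula could take (just a sketch; the function name, the constant `k`, and the example scores are placeholders, and the eventual write-up may look nothing like this) is the standard actuarial shrinkage n / (n + k): pull each forecaster's average Brier score toward the population average, so a handful of lucky forecasts can't top the leaderboard.

```python
def credibility_adjusted_score(person_scores, population_mean, k=50):
    """Shrink a forecaster's mean Brier score toward the population mean.

    person_scores: per-question Brier scores for one forecaster.
    k: prior weight (arbitrary placeholder); larger k means more shrinkage
       for forecasters with few resolved questions.
    Lower is better, as with the raw Brier score.
    """
    n = len(person_scores)
    if n == 0:
        return population_mean
    z = n / (n + k)                      # classic credibility weight
    person_mean = sum(person_scores) / n
    return z * person_mean + (1 - z) * population_mean

population_mean = 0.22
# Three lucky near-perfect forecasts vs. a long, solid track record.
print(credibility_adjusted_score([0.01, 0.02, 0.00], population_mean))  # ~0.21: pulled back to the pack
print(credibility_adjusted_score([0.08] * 300, population_mean))        # ~0.10: keeps its own record
```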
if people can select the questions they predict on, this will incentivize predicting on easier questions

True. On the other hand, if I publicly say that I consider myself an expert on X and ignorant on Y, should my self-assessment on X be penalized just because I got the answers on Y wrong?
Depends on the correlation in accuracy within X vs between X and Y.
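For what it's worth, one way to put numbers on that (a sketch with entirely synthetic data; `within_vs_between` and all the distributions below are invented for illustration): compare how well one half of a forecaster's X questions predicts the other half against how well their Y record predicts their X record.

```python
import numpy as np

def within_vs_between(scores_x, scores_y, seed=1):
    """scores_x, scores_y: (forecasters, questions) arrays of per-question
    Brier scores on topics X and Y (hypothetical data).

    Returns (split-half correlation of accuracy within X,
             correlation of mean accuracy on X vs. on Y).
    If the first is much larger, misses on Y say little about skill on X.
    """
    rng = np.random.default_rng(seed)
    n_q = scores_x.shape[1]
    perm = rng.permutation(n_q)
    half_a, half_b = perm[: n_q // 2], perm[n_q // 2:]
    within = np.corrcoef(scores_x[:, half_a].mean(1), scores_x[:, half_b].mean(1))[0, 1]
    between = np.corrcoef(scores_x.mean(1), scores_y.mean(1))[0, 1]
    return within, between

# Synthetic example where skill on X and skill on Y are only weakly related.
rng = np.random.default_rng(2)
skill_x = rng.normal(0.15, 0.05, 200)                   # per-forecaster "true" Brier on X
skill_y = 0.3 * skill_x + rng.normal(0.15, 0.05, 200)   # weakly tied to X skill
scores_x = skill_x[:, None] + rng.normal(0, 0.08, (200, 40))
scores_y = skill_y[:, None] + rng.normal(0, 0.08, (200, 40))
print(within_vs_between(scores_x, scores_y))  # within-X comes out far above X-vs-Y
```

In that regime, counting the Y misses against the X self-assessment mostly adds noise; if the two correlations were comparable, it would add signal instead.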
What pool of questions would people make predictions on?