Great comment. These were both things I thought about putting in the post, but they didn’t quite fit.
Goodhart, in particular, is a huge reason to avoid relying on many bits of selection, even aside from the exponential problem. Of course we also have to be careful of Goodhart when designing training programs, but at least there we have more elbow room to iterate and examine the results, and less incentive for the trainees to hack the process.