I’m still pretty confused by “You get what you measure” being framed as a distinct threat model from power-seeking AI (rather than as another sub-threat model)
I also consider catastrophic versions of “you get what you measure” to be a subset/framing/whatever of “misaligned power-seeking.” I think misaligned power-seeking is the main way the problem is locked in.
To a lesser extent, “you get what you measure” may also be an obstacle to using AI systems to help us navigate complex challenges without quick feedback, like improving governance. But I don’t think that’s an x-risk in itself; it’s more like a missed opportunity to do better. It’s in the same category as e.g. failures of the education system, though it’s plausibly better-leveraged if you have EA attitudes about AI being extremely important/leveraged. (ETA: I also view AI coordination, and differential capability progress, in a similar way.)