This looks great. Some of the work Scott Garrabrant started on categorizing Goodhart’s law addresses one component of this, but I think there is more useful work to be done on further categorizing the failure modes, and this list looks like a good resource for doing so. On the other hand, I think the categorization would be helped by looking at non-AI/ML systems to see how gaming occurs more widely. (Humans game systems differently than AI currently does, in part because we’re still smarter. That means looking at humans might help us understand how else AI could game metrics.)
For a take on the topic that goes much farther outside the realm of AI examples, see my older blog post on measurement: https://www.ribbonfarm.com/2016/06/09/goodharts-law-and-why-measurement-is-hard/