I just wanted to add that, technically speaking, there are two levels of Goodhart’s Law worth discussing here:
1. Goodhart’s Law as traditionally defined: “When a measure becomes a target, it ceases to be a good measure.” AKA, proxies of utility functions, when optimized too aggressively, stop being good proxies for those utility functions.
2. Goodhart’s Law as we deal with it in AI Safety: When a measure becomes a target, it actively causes you to miss the target. AKA, proxies of utility functions, when optimized too aggressively, actively reduce the value of those utility functions below where it was originally.
The traditional Goodhart’s Law strikes me as holding pretty generally, across a broad range of agents trying to optimize things.
The AI Safety version strikes me as pretty common too. But it’s contingent on the agent’s relationship with the universe in a way that the traditional version is not (i.e., the agent already sitting at an unusually high value of the true utility function, relative to what you’d expect from the universe it’s in).
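To make the level-2 failure mode concrete, here is a minimal toy sketch (not from the original comments; the utility and proxy functions are invented purely for illustration): the proxy only measures one feature of what the true utility cares about, so mild optimization of the proxy tracks the true utility reasonably well, while aggressive optimization of the proxy drives the true utility well below where it started.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sketch (all functions and numbers are made up for illustration):
# the "true" utility depends on two features of a state x, while the
# proxy only measures the first feature. Under mild optimization the two
# move together (level 1: the proxy is merely an imperfect measure);
# under aggressive optimization of the proxy, the true utility drops (level 2).

def true_utility(x):
    # Rewards the first feature but penalizes extreme values of it,
    # standing in for the side effects of over-optimization.
    return x[0] - 0.1 * x[0] ** 2 + x[1]

def proxy(x):
    # Only sees the first feature.
    return x[0]

# Mild optimization: pick the proxy-best of a handful of random states.
candidates = rng.normal(size=(10, 2))
mild_pick = max(candidates, key=proxy)

# Aggressive optimization: push the proxy as far as it will go.
aggressive_pick = np.array([100.0, 0.0])

for name, x in [("mild", mild_pick), ("aggressive", aggressive_pick)]:
    print(f"{name:10s} proxy={proxy(x):7.2f}  true={true_utility(x):9.2f}")
```

In this made-up setup the mild pick scores modestly on both the proxy and the true utility, while the aggressive pick scores enormously on the proxy and catastrophically on the true utility, which is the distinction between the two levels above.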
Yep, those are the two levels I mentioned :-)
But I like your phrasing.