Dagon comments on Rigging is a form of wireheading

Dagon 3 May 2018 19:56 UTC
4 points
See https://en.wikipedia.org/wiki/Goodhart%27s_law
If your measurement/reward is based on “knows the password” or “gives the correct password”, you’re measuring a very poor proxy for what you actually want, which is closer to “only people I’ve authorized can access it”. That goal is harder to encode and measure, but also harder to game.
- Stuart_Armstrong 4 May 2018 8:39 UTC
  6 points
  All forms of wireheading are variants of Goodhart’s law.