Is there a good bijection between specification gaming and wireheading on the one hand, and the different types of Goodhart’s law on the other?
Seems like this has been done already.
https://www.alignmentforum.org/posts/yXPT4nr4as7JvxLQa/classifying-specification-problems-as-variants-of-goodhart-s