Um. Said utility function requires that you already know the true underlying value function[1].
If you already know the true underlying value function, Goodhart’s law doesn’t apply anyway. The tricky bit with Goodhart’s law is trying to find said true underlying value function in the first place—close is not good enough.
[1] Well, strictly speaking it needs to know both the proxy and the difference between the proxy and the true underlying value function, which is sufficient to recreate the true underlying value function.
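
To make the footnote's point concrete, here is a minimal sketch, not anything from the original proposal. The names `reconstruct_true_value`, `proxy`, `gap`, and `true_value`, and the toy functions themselves, are all hypothetical and chosen purely for illustration. It assumes the "difference" is defined pointwise as true minus proxy.

```python
from typing import Callable

ValueFn = Callable[[float], float]


def reconstruct_true_value(proxy: ValueFn, gap: ValueFn) -> ValueFn:
    """Rebuild the true value function from a proxy and its gap.

    Assumes gap(x) = true(x) - proxy(x), so proxy(x) + gap(x) = true(x).
    """
    return lambda x: proxy(x) + gap(x)


# Hypothetical toy example: the true value peaks at x = 5,
# while the proxy keeps rewarding larger x.
def true_value(x: float) -> float:
    return x - 0.1 * x ** 2


def proxy(x: float) -> float:
    return x


def gap(x: float) -> float:
    # Writing this function down already requires knowing true_value.
    return true_value(x) - proxy(x)


recovered = reconstruct_true_value(proxy, gap)

candidates = range(11)
best_by_proxy = max(candidates, key=proxy)
best_by_true = max(candidates, key=true_value)
best_by_recovered = max(candidates, key=recovered)
```

In this toy setup, optimizing the proxy and optimizing the true value pick different candidates, while optimizing the recovered function agrees with the true one. The catch is visible in `gap`: it cannot be written without already having `true_value` in hand.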