Suppose we have a proxy P and a value V. Based on past observations, P is correlated with V.
Increase P! (Either directly or by rewarding the agents inside the system for increasing P; the mechanism doesn’t matter.)
Two cases:
P does not cause V
P causes V
Case 1: Wow, Goodhart is a genius! Even though I had a correlation, I increased one variable and the other did not increase!
Case 2: Wow, you are pedantic. Obviously if the relationship between the variables is so special that P causes V, Goodhart’s law won’t apply. If I increase the amount of weight lifted (proxy), then obviously I will get visibly bigger muscles (value). Boring! (Also, I’m really good at seeing causal relationships even when they don’t exist (a human universal), so I will basically never feel surprised when I actually find one. That will be the expected outcome, so I will look strangely at anyone trying to test Goodhart’s law on any pair of variables that has even a sliver of a chance of being in a causal relationship.)
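The difference between the two cases can be sketched in a toy simulation (the model and all numbers here are invented for illustration): a hidden common cause drives both P and V, so they look correlated in observational data, yet forcing P up does nothing to V.

```python
import random

random.seed(0)

# Toy world (made-up numbers): a hidden common cause C drives both the
# proxy P and the value V, so P and V correlate but P does not cause V.
def world(intervene_p=None):
    c = random.gauss(0, 1)           # hidden common cause
    p = c + random.gauss(0, 0.1)     # proxy tracks C
    if intervene_p is not None:
        p = intervene_p              # intervention: set P directly
    v = c + random.gauss(0, 0.1)     # value also tracks C, not P
    return p, v

# Observational data: P and V move together.
obs = [world() for _ in range(10_000)]
# Interventional data: forcing P high leaves V unchanged (Case 1).
interv = [world(intervene_p=5.0) for _ in range(10_000)]

mean_v_obs = sum(v for _, v in obs) / len(obs)
mean_v_interv = sum(v for _, v in interv) / len(interv)
print(round(mean_v_obs, 2), round(mean_v_interv, 2))  # both near 0
```

In Case 2 you would instead wire `v` to depend on `p`, and the intervention would raise V, just as the weightlifting example predicts.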
If you keep increasing P, the connection might break.
You are right, that is also a possibility. I only considered cases with one intervention, because the examples I’ve heard given for Goodhart’s law only contain one (I’m thinking of UK monetary policy, the Soviet nail factory, and other cases where some “manager” introduces an incentive toward a proxy into the system). However, multiple-intervention cases can also be interesting. Do you know of a real-world example where the first intervention on the proxy raised the target value, but the second, more extreme one, did not (or vice versa)? My intuition suggests that in the real world those types of causal influence are rare, and also, I don’t think we can say that “P causes V” in those cases. Do you think that is too narrow a definition?
Here’s a fictional story:
You decide to study more. Your grades go up. You like that, so you decide to study really really hard. You get burnt out. Your grades go down. (There’s also an argument here that the metric—grades—isn’t necessarily ideal, but that’s a different thing.)*
*There might be a less extreme version involving ‘you stay up late studying’, and ‘because you get less sleep it has less effect (memory stuff)’.
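The story above can be sketched with a toy inverted-U model (the formula and numbers are invented for illustration): the first, moderate intervention on the proxy raises the value, while a more extreme one lowers it.

```python
# Toy inverted-U model (made-up numbers): grades rise with study hours
# at first, then fall past a burnout point.
def expected_grade(hours):
    return 60 + 8 * hours - 0.5 * hours ** 2  # peak at hours = 8

baseline = expected_grade(2)   # study a little
more = expected_grade(6)       # first intervention: grades go up
extreme = expected_grade(14)   # second, extreme intervention: grades go down

print(baseline, more, extreme)
```

The point is only the shape of the curve: the same proxy intervention that helps in one regime hurts in another, so a single observed success doesn’t license unbounded optimization.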
This isn’t meant as an unsolvable problem—it’s just that:
You have limits
and
You can grow
are both true.
Maybe this style of mechanism, or ‘causal influence’, is rare. But its (biological) nature arguably characterizes a whole domain: life. So in that area at least, it’s worth taking note of.
I guess I’m saying: if you want to know whether you have to be worried about Goodhart’s Law in general, I think it depends. Just spend time optimizing your metric, and spend time optimizing for your metric, and see what happens. If you want more specific feedback, I think you’ll probably have to be more specific.
Even in your weightlifting example, there is a point where adding more weight no longer improves your outcome.