Having tried to play with this, I'll strongly agree that random functions on R^N aren't a good place to start. But I've also simulated proxies chosen as random nodes in the middle of a causal DAG, or as nodes selected for high correlation with the goal, and found that those aren't particularly useful either: people have some appreciation of causal structure, and they aren't picking metrics at random or purely for high correlation. They're simply making mistakes in their causal reasoning, or missing ways the metric can be intercepted. (But I was looking for specifics of how the failures manifested, and I wasn't thinking about gradient descent, so maybe I'm missing your point.)
Another piece I'd guess is relevant here is generalized efficient markets. If you generate a DAG with random parameters and immediately start optimizing for a proxy node, you're not going to be near any sort of Pareto frontier, so trade-offs won't be an issue and you won't see a Goodhart effect.
In practice, most of the systems we deal with have already been under some optimization pressure. They may not be optimal for our main objective, but they'll at least be roughly Pareto-optimal across any cross-section of nodes. Physically, that's because people do just fine locally optimizing whatever node they're in charge of; it's the nonlocal trade-offs between distant nodes that are hard to handle (at least without competitive price mechanisms).
So if you want to see Goodhart effects, you first have to push up to that Pareto frontier. Otherwise, changes applied to optimize the proxy won't have a systematically negative impact on the other nodes parallel to the proxy; the impacts will just be random. A minimal simulation sketch of this contrast follows.
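To make that concrete, here's a rough sketch of the kind of simulation I have in mind. All the details are illustrative assumptions, not a canonical setup: a random linear structural model x = Ax + u with a strictly lower-triangular A (so it's a DAG), a norm-ball "effort" budget on the interventions u, and the last two nodes arbitrarily cast as the true goal and the proxy. Since both objectives are linear in u, the optimum on the effort ball is closed-form, so we can compare optimizing the proxy from a random start (off the frontier) against optimizing it from a start that's already optimal for the true goal (on the frontier):

```python
import numpy as np

def goodhart_trial(rng, n=20, budget=1.0):
    """One random linear SCM: x = A x + u, with A strictly lower-triangular.

    Compares the change in the true objective when moving to the
    proxy-optimal point on the effort ball ||u|| <= budget, from
    (a) a random start well inside the ball (off the frontier), and
    (b) a start already optimal for the true goal (on the frontier).
    """
    A = np.tril(rng.normal(0.0, 0.3, (n, n)), k=-1)
    M = np.linalg.inv(np.eye(n) - A)    # x = M u, so x[i] = M[i] @ u
    g_true, g_proxy = M[-1], M[-2]      # two downstream nodes: goal and proxy
                                        # (an arbitrary illustrative choice)

    # A linear objective g @ u on the ball ||u|| <= budget is maximized
    # at budget * g / ||g||, so both optima are closed-form.
    u_proxy_opt = budget * g_proxy / np.linalg.norm(g_proxy)
    u_true_opt = budget * g_true / np.linalg.norm(g_true)
    u_random = rng.normal(0.0, 0.01, n)  # tiny: far from the frontier

    def true_val(u):
        return g_true @ u

    return (true_val(u_proxy_opt) - true_val(u_random),    # (a)
            true_val(u_proxy_opt) - true_val(u_true_opt))  # (b)

rng = np.random.default_rng(0)
deltas = np.array([goodhart_trial(rng) for _ in range(500)])
print("fraction of trials where optimizing the proxy hurt the true goal:")
# (a) expect ~0.5: off the frontier, the sign of the impact is random
print("  random start (off frontier):      %.2f" % (deltas[:, 0] < 0).mean())
# (b) expect 1.0: on the frontier, proxy gains systematically trade off
# the true goal whenever g_proxy and g_true aren't parallel
print("  true-optimal start (on frontier): %.2f" % (deltas[:, 1] < 0).mean())
```

The linear-objectives-on-a-norm-ball choice is just to keep the sketch closed-form instead of running gradient steps; the point is only the contrast between the two starting points, not the particular parametrization.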