Davidmanheim comments on Embedded World-Models

Davidmanheim 6 Nov 2018 9:01 UTC
6 points
...the weirdness of the injunction to optimize over a space containing every procedure you could ever do, including all of the optimization procedures you could ever do.
My most recent preprint discusses multi-agent Goodhart (https://arxiv.org/abs/1810.10862). It uses the example of poker, along with a different argument somewhat related to the embedded agent problem, to explain why optimizing over strategies has to include optimizing over the larger solution space.
To summarize, and to try to clarify how I think it relates: strategies for game-playing must at least implicitly include a model of the other player’s actions, so that an agent can tell which strategies will work against them. That model needs uncertainty, because if we do something silly like assume the other player is a rational Bayesian agent, we are likely to act suboptimally against their actual strategy. But the model of the other agent must itself account for their model of our strategy, including uncertainty about our search procedure for strategies; otherwise the space is clearly much too large to optimize over.
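As a toy illustration of the first step only (an uncertain opponent model versus an assumed-rational one), here is a minimal Python sketch. It uses rock-paper-scissors rather than poker to keep it short, and it is not taken from the preprint. The function names, the observed history, and the symmetric Dirichlet-style prior (`prior_count`) are all assumptions made up for the example. It does not attempt the recursive part, where the opponent is also modeling us.

```python
from collections import Counter

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Return +1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_payoffs(opponent_dist):
    """Expected payoff of each of our actions against a belief over opponent actions."""
    return {
        a: sum(p * payoff(a, b) for b, p in opponent_dist.items())
        for a in ACTIONS
    }


def best_response(opponent_dist):
    """Pick the action with the highest expected payoff under our belief."""
    ev = expected_payoffs(opponent_dist)
    return max(ACTIONS, key=lambda a: ev[a])


def posterior_mean(observed, prior_count=1.0):
    """Posterior mean over opponent actions under a symmetric prior (hypothetical choice)."""
    counts = Counter(observed)
    total = len(observed) + prior_count * len(ACTIONS)
    return {a: (counts[a] + prior_count) / total for a in ACTIONS}


# Model 1: assume the opponent plays the equilibrium (uniform) strategy.
equilibrium_model = {a: 1 / 3 for a in ACTIONS}

# Model 2: stay uncertain and update on what the opponent actually did.
# This history is invented for illustration.
observed = ["rock"] * 6 + ["paper"] * 2 + ["scissors"] * 2
learned_model = posterior_mean(observed)

print(expected_payoffs(equilibrium_model))  # every action looks equally good
print(best_response(learned_model))         # the learned model exploits the rock bias
```

Under the assumed-rational model every action has the same expected payoff, so the agent has no reason to exploit the opponent's actual bias. The uncertain model picks that bias up from play. An opponent who is modeling our updating in turn could exploit it, which is where the recursion comes in.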
Does this make sense? (I may need to expand on this and clarify my thinking...)