Important point: neither of the models in this post is really “the optimizer’s model of the world”. M1 is an observer’s model of the world (or the “God’s-eye view”); the world “is being optimized” according to that model, and there isn’t even necessarily “an optimizer” involved. M2 says what the world is being-optimized-toward.
To bring “an optimizer” into the picture, we’d probably want to say that there’s some subsystem which “chooses”/determines θ′, in such a way that E[−log P[X|M2] | M1(θ′)] ≤ E[−log P[X|M2] | M1(θ)] for some other θ-values. We might also want to require this to work robustly, across a range of environments, although the expectation does that to some extent already. Then the interesting hypothesis is that there’s probably a limit to how low such a subsystem can make the expected description length without making θ′ depend on other variables in the environment. To get past that limit, the subsystem needs things like “knowledge” and a “model” of its own—the basic purpose of knowledge/models for an optimizer is to make the output depend on the environment. And it’s that model/knowledge which seems likely to converge on a similar shared model/encoding of the world.
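To make the “limit” claim concrete, here’s a toy sketch (my own construction, not from the post): a two-bit world where X = (θ + e) mod 2 for an environment bit e, and M2 concentrates probability on X = 0. Any fixed θ pays a floor on the expected description length, while a θ′ that depends on e beats that floor.

```python
import math

# Toy world (illustrative assumption, not from the post):
# environment bit e is uniform over {0, 1}, and the world produces
# X = (theta + e) mod 2. M2 is the target distribution over X,
# concentrated on X = 0.
P_M2 = {0: 0.9, 1: 0.1}

def desc_len(x):
    """Description length -log P[X | M2]."""
    return -math.log(P_M2[x])

def expected_desc_len(policy):
    """E[-log P[X|M2] | M1] for a policy mapping e -> theta."""
    return sum(0.5 * desc_len((policy(e) + e) % 2) for e in (0, 1))

# A fixed theta (no dependence on the environment) hits a floor:
# whatever theta we pick, X = 1 half the time, which is expensive under M2.
best_fixed = min(expected_desc_len(lambda e, t=t: t) for t in (0, 1))

# A theta' that depends on e cancels it, so X = 0 every time:
adaptive = expected_desc_len(lambda e: e)

print(best_fixed)  # ~1.204
print(adaptive)    # ~0.105 = -log 0.9, below the fixed-theta floor
```

The point of the sketch: computing the e-dependent policy requires the subsystem to “know” e—exactly the sense in which getting below the floor forces the optimizer to carry knowledge/a model of the environment.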
Thanks! I’m still wrapping my mind around a lot of this, but this gives me some new directions to think about.