Important point: neither of the models in this post is really “the optimizer’s model of the world”. M1 is an observer’s model of the world (or the “God’s-eye view”); the world “is being optimized” according to that model, and there isn’t even necessarily “an optimizer” involved. M2 says what the world is being-optimized-toward.
To bring “an optimizer” into the picture, we’d probably want to say that there’s some subsystem which “chooses”/determines θ′, in such a way that E[−log P[X|M2] | M1(θ′)] ≤ E[−log P[X|M2] | M1(θ)] for some other θ-values. We might also want to require this to work robustly, across a range of environments, although the expectation does that to some extent already. Then the interesting hypothesis is that there’s probably a limit to how low such a subsystem can make the expected description length without making θ′ depend on other variables in the environment. To get past that limit, the subsystem needs things like “knowledge” and a “model” of its own—the basic purpose of knowledge/models for an optimizer is to make the output depend on the environment. And it’s that model/knowledge which seems likely to converge on a similar shared model/encoding of the world.
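To make the “limit” claim concrete, here’s a toy sketch (my own construction, not from the post): a two-bit world where X = (θ + e) mod 2 for an environment bit e, and M2 concentrates probability on X = 0. Any fixed θ pays a floor on the expected description length, while a θ′ that depends on e beats that floor.

```python
import math

# Toy world (illustrative assumption, not from the post):
# environment bit e is uniform over {0, 1}, and the world produces
# X = (theta + e) mod 2. M2 is the target distribution over X,
# concentrated on X = 0.
P_M2 = {0: 0.9, 1: 0.1}

def desc_len(x):
    """Description length -log P[X | M2]."""
    return -math.log(P_M2[x])

def expected_desc_len(policy):
    """E[-log P[X|M2] | M1] for a policy mapping e -> theta."""
    return sum(0.5 * desc_len((policy(e) + e) % 2) for e in (0, 1))

# A fixed theta (no dependence on the environment) hits a floor:
# whatever theta we pick, X = 1 half the time, which is expensive under M2.
best_fixed = min(expected_desc_len(lambda e, t=t: t) for t in (0, 1))

# A theta' that depends on e cancels it, so X = 0 every time:
adaptive = expected_desc_len(lambda e: e)

print(best_fixed)  # ~1.204
print(adaptive)    # ~0.105 = -log 0.9, below the fixed-theta floor
```

The point of the sketch: computing the e-dependent policy requires the subsystem to “know” e—exactly the sense in which getting below the floor forces the optimizer to carry knowledge/a model of the environment.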
Thanks! I’m still wrapping my mind around a lot of this, but this gives me some new directions to think about.