I found this post interesting but somewhat confusing. You start by talking about UDT in order to talk about importance. But really the only connection from UDT to importance is the utility function, so you might as well start with that. And then you ignore utility functions in the rest of your post when you talk about Schmidhuber’s theory.
It just has a utility function which specifies what actions it should take in all of the possible worlds it finds itself in.
Not quite. The utility function doesn’t specify what action to take, it specifies what worlds are desirable. UDT also requires a prior over worlds and a specification of how the agent interacts with the world (like the Python programs here). The combination of this prior and the expected value computations that UDT does would constitute “beliefs”.
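To make that concrete, here is a minimal sketch of the ingredients I mean, with toy worlds, observations, and utilities that I am making up purely for illustration: UDT scores whole policies (maps from observations to actions) by their expected utility under the prior, rather than having the utility function pick actions directly.

```python
from itertools import product

# Hypothetical toy setup (all names illustrative): two possible worlds,
# two observations, two actions.
worlds = ["heads", "tails"]
prior = {"heads": 0.5, "tails": 0.5}   # prior over worlds -- the "beliefs" part
observations = ["obs_a", "obs_b"]
actions = ["act_1", "act_2"]

def observe(world):
    """How the agent interacts with the world: which observation each world produces."""
    return "obs_a" if world == "heads" else "obs_b"

def utility(world, action):
    """The utility function: how desirable each (world, action) outcome is."""
    return 1.0 if (world == "heads") == (action == "act_1") else 0.0

def expected_utility(policy):
    """Expected utility of a policy (a map from observation to action) under the prior."""
    return sum(prior[w] * utility(w, policy[observe(w)]) for w in worlds)

# UDT-style choice: pick the policy (not a single action) with the highest
# expected utility under the prior.
policies = [dict(zip(observations, acts))
            for acts in product(actions, repeat=len(observations))]
best = max(policies, key=expected_utility)
print(best, expected_utility(best))
```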
Informally, your decision policy tells you what options or actions to pay most attention to, or what possibilities are most important.
I don’t see how this follows. Your decision policy tells you what to do once you already know what you can do. If you’re using “important” to mean “valuable”, just say that instead.
I do like the idea of modelling the mind as an approximate compression engine. This is great for reducing some thought processes to algorithms. For example, I think property dualism can be thought of as a way to compress the fact that I am me rather than some other person, or at least to make explicit the fact that this must be compressed.
Schmidhuber’s theory is interesting but incomplete. You can create whatever compression problem you want through a template, e.g. a pseudorandom sequence you can only compress by guessing the seed, yet repetitions of the same problem template are not necessarily interesting. It seems that some bits are more important than other bits; physicists are very interested in compressing the number of spatial dimensions in the universe even though this quantity can be specified in a few bits. I don’t know of any formal approaches to quantifying the importance of compressing different things.
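Here is a rough sketch of the pseudorandom-sequence example (my own toy construction, not anything from Schmidhuber): the data defeats a generic compressor, but an agent that searches over seeds can compress the whole sequence down to a short seed plus the generating program.

```python
import random
import zlib

def pseudorandom_bytes(seed, n=10_000):
    """Generate n bytes from a short seed; the output looks statistically random."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

data = pseudorandom_bytes(seed=42)

# A generic compressor barely shrinks the data (output is roughly the original size).
print("zlib-compressed size:", len(zlib.compress(data)))

# ...but guessing the seed compresses the sequence to one small integer.
for guess in range(1000):
    if pseudorandom_bytes(guess) == data:
        print("recovered seed:", guess)
        break
```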
I wrote a paper on this subject (compression as it relates to theory of the mind). I also wrote this LessWrong post about using compression to learn values.