Right now I am trying to better understand future AI systems, first by thinking about what sorts of abilities I expect every system of high cognitive power to have, and second by trying to find concrete practical implementations of those abilities. One such ability is building a model of the world that satisfies certain desiderata. For example, if there are multiple agents in the world, we can factor the world model so that we build just one model of the agent and point to that model twice in our description of the world. This is something Solomonoff induction can also do. I am interested in constraining the world model so that we always get out a world model with a similar structure, which makes the world model more interpretable. That is, I am trying to find a way of building a world model where we mainly need to understand the model's content, because it is easy to understand how that content is organized. A minimal sketch of the factoring idea follows.
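The sketch below is only an illustration of the factoring idea, not a proposed implementation; all names (AgentModel, WorldModel, the entity fields) are hypothetical. The point is just that the world description stores one agent model and references it from two places, rather than duplicating it.

```python
# Hypothetical sketch: a factored world description in which two agents in
# the world share a single agent model, referenced twice rather than copied.

from dataclasses import dataclass, field


@dataclass
class AgentModel:
    """One description of how an agent behaves (stored once, shared)."""
    policy: dict  # maps observations to actions; contents are placeholders

    def act(self, observation):
        return self.policy.get(observation, "noop")


@dataclass
class WorldModel:
    # Library of sub-models: each distinct structure is stored exactly once.
    components: dict = field(default_factory=dict)
    # The world description holds only references (keys) into that library.
    entities: list = field(default_factory=list)


# Build a world containing two agents that point to the same agent model.
shared_agent = AgentModel(policy={"see_food": "eat"})
world = WorldModel(
    components={"agent_model_0": shared_agent},
    entities=[
        {"name": "alice", "model": "agent_model_0", "position": (0, 0)},
        {"name": "bob",   "model": "agent_model_0", "position": (5, 2)},
    ],
)

# Both entities resolve to the same underlying object, so understanding
# (or improving) the agent model once covers every agent that points to it.
assert (world.components[world.entities[0]["model"]]
        is world.components[world.entities[1]["model"]])
```

This mirrors what a short program under Solomonoff induction can do by reusing one subroutine for both agents, and it is the kind of fixed organizational structure that would make the model's contents the main thing left to interpret.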