I think the fourth example is the one that I’m most confused about. Natural selection kind of has a world model, in the sense that the organisms have DNA which is adapted to the world. Natural selection also kind of has a planning process, it’s just a super myopic one on the time-scale of evolution (involving individuals making mating choices). But it definitely feels like “natural selection has a world model and planning process” is a sentence that comes with caveats, which makes me suspect that these may not be the right concepts.
Yeah I think you’ve said it well here.
Another similar example: consider a central computer that trains and deploys robots. Suppose, for the sake of this example, that the individual robots definitely do not do planning or have a world model, but can still execute some simple policy such as “go to this place, collect this resource, construct this simple structure”. Each day, the computer takes all the robots that were deployed the previous day, selects the ones that performed best according to a certain objective (such as collecting a certain resource), and deploys more robots like them. This is a basic evolutionary algorithm.
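The scheme above can be sketched in a few lines. This is a toy stand-in, not anything from the discussion itself: a “robot” is just a policy vector, the objective is a made-up scoring function standing in for resource collection, and each “day” we keep the top performers and deploy mutated copies of them.

```python
import random

def deploy_and_score(policy):
    # Stand-in for a day of deployment: the score is how well the policy
    # performs. Here, a toy objective: closeness to a fixed target vector.
    target = [0.7, 0.2, 0.9]
    return -sum((p - t) ** 2 for p, t in zip(policy, target))

def next_generation(policies, rng, keep=5, mutation=0.1):
    # Select the best-performing robots from yesterday's deployment...
    ranked = sorted(policies, key=deploy_and_score, reverse=True)
    survivors = ranked[:keep]
    # ...and deploy more robots like them (mutated copies).
    children = []
    while len(children) < len(policies):
        parent = rng.choice(survivors)
        children.append([g + rng.gauss(0, mutation) for g in parent])
    return children

rng = random.Random(0)
population = [[rng.random() for _ in range(3)] for _ in range(20)]
for day in range(50):
    population = next_generation(population, rng)

best = max(population, key=deploy_and_score)
```

Note that nothing in this loop explicitly models the world or plans ahead; whatever “knowledge of the world” exists is implicit in which policies survived.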
Like in the case of evolution, it’s a bit difficult to say where the “world model” and “planning process” are in this example. If they are anywhere, they are kind of distributed through the computer/robot/world system.
OK, now consider a modification to the above example. The previous setup is going to optimize very slowly, so suppose we speed it up in the following way: we collect video data from each robot, and each day the central computer uses the previous day’s data to produce the next day’s robots by learning rather than by evolutionary search. Specifically, it trains, via supervised learning on the raw video data, a predictor that maps robot policies to predicted outcomes, and then, via reinforcement learning, searches for robot policies that the predictor expects to perform well. Now we have a very clear world model and planning process—the world model is the trained prediction function, and the planning process is the search over robot policies with respect to that prediction function. But the way we got here was as a performance optimization of a process that had a very unclear world model and planning process.
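A minimal sketch of the modified scheme, under toy assumptions of my own (not from the discussion): the “world model” is a k-nearest-neighbour predictor fit to yesterday’s logged (policy, outcome) pairs, standing in for the supervised model trained on video, and the “planning process” is a random search over candidate policies, standing in for reinforcement learning.

```python
import random

def true_outcome(policy):
    # Hidden environment dynamics, unknown to the planner; only logged
    # (policy, outcome) pairs from yesterday's deployment are observable.
    target = [0.7, 0.2, 0.9]
    return -sum((p - t) ** 2 for p, t in zip(policy, target))

def fit_world_model(logs, k=3):
    # "World model": predicts a policy's outcome by averaging the outcomes
    # of its k nearest logged neighbours.
    def predict(policy):
        nearest = sorted(
            logs,
            key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], policy)),
        )
        return sum(outcome for _, outcome in nearest[:k]) / k
    return predict

def plan(predict, rng, n_candidates=500):
    # "Planning process": search over candidate policies for one the
    # world model predicts will perform well.
    candidates = [[rng.random() for _ in range(3)] for _ in range(n_candidates)]
    return max(candidates, key=predict)

rng = random.Random(0)
# Yesterday's deployment logs: random policies and their observed outcomes.
logs = [(p, true_outcome(p))
        for p in [[rng.random() for _ in range(3)] for _ in range(200)]]
world_model = fit_world_model(logs)
policy = plan(world_model, rng)
```

The contrast with the previous sketch is the point: here the world model (`predict`) and the planning process (`plan`) are separately identifiable components, whereas in the evolutionary version the equivalent information was smeared across the computer/robot/world loop.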
It seems to me that human AI engineers have settled on a certain architecture for optimizing design processes. That architecture, roughly speaking, is to form an explicit world model and do explicit search over it. But I suspect this is just one architecture by which one can organize information in order to take an action. It seems like a very clean architecture to me, but I’m not sure that all natural processes that organize information in order to take an action will do so using this architecture.