Fair point about implementation. I was imagining a non-consequentialist AI simulating consequentialist agents that would make plans of the form “run this piece of code and it will take care of the implementation”, but there’s really no reason to assume that would be the case.
As far as architecture search goes, “search space” does seem like the right term, but I think long-term planning is potentially as useful in a search space as it is in a stateful environment. If you think about the way a human researcher generates neural net architectures, they’re not just “trying things” in order to explore the search space… they generate abstract theories of how and why different approaches work, experiment with different approaches in order to test those theories, and then iterate. A really good NAS system would do the same, and “generate plausible hypotheses and find efficient ways to test them” is a planning problem.
“they generate abstract theories of how and why different approaches work, experiment with different approaches in order to test those theories, and then iterate.”
This description makes it sound like the researcher looks ahead about 1 step. I think that’s short-term planning, not long-term planning.
My intuition is that the most important missing puzzle pieces for AGI involve the “generate abstract theories of how and why different approaches work” part. Once you’ve figured that out, there’s a second step of searching for an experiment which will let you distinguish between your current top few theories. In terms of competitiveness, I think the “long-term planning free” approach of looking ahead just 1 step will likely prove at least as competitive as trying to look ahead multiple steps. (Doing long-term planning means spending a lot of time refining theories about hypothetical data points you haven’t yet gathered! That seems a bit wasteful, since most possible data points won’t actually get gathered. Why not spend that compute gathering data instead?)
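To make the “look ahead just 1 step” idea concrete, here is a minimal sketch of what I have in mind. It assumes a toy setup that is entirely my own invention for illustration: a handful of candidate theories with prior probabilities, and experiments with a binary outcome where each theory assigns some probability to “success”. The function names (`expected_info_gain`, `pick_experiment`) are hypothetical, and expected information gain is just one plausible scoring rule, not the only one.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def posterior(prior, likelihoods):
    """Bayes update: multiply prior by likelihoods and renormalize."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_info_gain(prior, p_success):
    """Expected entropy reduction over theories from ONE binary experiment.

    p_success[i] is the probability that theory i assigns to "success".
    This looks exactly one step ahead: it never reasons about which
    experiment to run after this one.
    """
    gain = entropy(prior)
    p_failure = [1 - p for p in p_success]
    for outcome_likelihoods in (p_success, p_failure):
        p_outcome = sum(p * l for p, l in zip(prior, outcome_likelihoods))
        if p_outcome > 0:  # skip outcomes no theory considers possible
            gain -= p_outcome * entropy(posterior(prior, outcome_likelihoods))
    return gain


def pick_experiment(prior, experiments):
    """Greedy choice: the experiment with the highest one-step expected gain."""
    return max(experiments, key=lambda name: expected_info_gain(prior, experiments[name]))


# Two equally plausible theories (hypothetical numbers).
prior = [0.5, 0.5]
experiments = {
    "uninformative": [0.5, 0.5],  # both theories predict the same thing
    "decisive": [1.0, 0.0],       # the theories flatly disagree
}
best = pick_experiment(prior, experiments)
```

In this toy case the greedy rule picks the experiment on which the top theories disagree most. A multi-step planner would additionally have to evaluate sequences of experiments over hypothetical posteriors, which is the extra compute I’m suggesting might be better spent gathering data.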
But I also think this may all be beside the point. Remember my claim from further up this thread:
In machine learning, we search the space of models, trying to find models which do a good job of explaining the data. Attaining new resources means searching the space of plans, trying to find a plan which does a good job of attaining new resources. (And then executing that plan!) These are different search tasks with different objective functions.
For the sake of argument, I’ll assume we’ll soon see major gains from long-term planning and modify my statement so it reads:
In machine learning++, we make plans for collecting data and refining theories about that data. Attaining new resources means making plans for manipulating the physical world. (And then executing that plan!) These are different search tasks with different objective functions.
Even in a world where long-term planning is a critical element of machine learning++, it seems to me that the state space that these plans act on is an abstract state space corresponding to states of knowledge of the system. It’s not making plans for acting in the physical world, except accidentally insofar as it does computations which are implemented in the physical world. Despite its superhuman planning abilities, AlphaGo did not make any plans for e.g. manipulating humans in the physical world, because the state space it did its planning over only involved Go stones.