I agree with a lot of those points, but I suspect there may be fundamental limits to planning capabilities tied to the unidirectionality of current feed-forward networks.
If we look at something even as simple as how a mouse learns to navigate a labyrinth, the mouse learns both the route to the reward and how to get back to the start, and the return route keeps adjusting as its learned layout of the maze evolves (see paper: https://elifesciences.org/articles/66175 ).
I don’t see SotA models doing well at that kind of reverse planning, and I expect nonlinear tasks will pose significant agentic challenges until architectures shift to something new.
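To make concrete what I mean by reverse planning, here's a toy sketch (my own illustration, nothing from the paper, and obviously not how any real model works): an agent wanders a small made-up maze toward the reward, and the route home is then planned entirely over the layout it happened to learn along the way, so the return plan shifts as the learned map grows.

```python
from collections import deque

# Hypothetical maze: junctions as nodes, corridors as edges (made up for this sketch).
MAZE = {
    "start": ["a", "b"],
    "a": ["start", "c"],
    "b": ["start", "d"],
    "c": ["a", "reward"],
    "d": ["b"],
    "reward": ["c"],
}

def explore_forward(maze, start, goal):
    """Wander depth-first toward the reward, recording every corridor encountered."""
    learned = {}                  # the agent's evolving internal map
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        learned.setdefault(node, set())
        for nxt in maze[node]:
            learned[node].add(nxt)
            learned.setdefault(nxt, set()).add(node)  # corridors are traversable both ways
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
        if node == goal:
            break
    return learned

def plan_return(learned, reward, start):
    """Plan the way home by breadth-first search over the learned map only."""
    frontier, parent = deque([reward]), {reward: None}
    while frontier:
        node = frontier.popleft()
        if node == start:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]     # reward -> ... -> start
        for nxt in learned.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None  # return route isn't derivable from what's been learned yet

learned_map = explore_forward(MAZE, "start", "reward")
print(plan_return(learned_map, "reward", "start"))  # -> ['reward', 'c', 'a', 'start']
```

The point of the toy is just that the homeward plan is computed over whatever partial map the forward trip produced, and improves only as that map does; current autoregressive generation doesn't get to revise its earlier commitments in that way.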
So it could be 3-5 years to get to AGI depending on hardware and architecture advances, or we might just end up in a weird “bit of both” world where models are superintelligent, beyond expert human level, in specific domains but below average at other tasks.
But when we finally do get models that exhibit bidirectional generation across large context windows in both training and operation, I think it will be only a very short time until some rather unbelievable goalposts are passed.