I agree with almost everything you’ve said about LLMs.
I still think we’re getting human-level AGI soonish. The LLM part doesn’t need to be any better than it is.
A human genius with no one-shot memory (severe anterograde amnesia) and very poor executive function (ability to stay on task and organize their thinking) would be almost useless—just like LLMs are.
LLMs replicate only part of humans’ general intelligence. It’s the biggest part, but it just wouldn’t work very well without the other contributing brain systems. Human intelligence, and its generality (in particular, our ability to solve truly novel problems), is an emergent property of interactions among multiple brain systems (or a complex property, if you don’t like that term).
See Capabilities and alignment of LLM cognitive architectures
In brief, LLMs are like a human posterior cortex. A human with only a posterior cortex would be about as little use as an LLM (of course this analogy is imperfect, but it’s close). We need a prefrontal cortex (for staying on task, “executive function”), a medial temporal cortex and hippocampus (for one-shot learning), and the basal ganglia (for making better decisions than just whatever first comes to mind).
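To make the analogy concrete, here’s a minimal sketch of what wiring those systems around an LLM could look like. This is purely illustrative: “llm” stands in for any text-completion callable, and every name and prompt below is my own placeholder, not an existing library’s API.

```python
from typing import Callable, List
import re

def _score(rating_text: str) -> int:
    """Pull the first integer out of a rating reply; default to 0 if none found."""
    match = re.search(r"\d+", rating_text)
    return int(match.group()) if match else 0

def run_agent(llm: Callable[[str], str], goal: str, max_steps: int = 10) -> List[str]:
    episodic_memory: List[str] = []   # hippocampus / medial temporal analogue: one-shot storage
    actions_taken: List[str] = []
    for _ in range(max_steps):
        # "Prefrontal" executive step: restate the goal and choose the next subtask,
        # with recent episodic memories pulled back into the context window.
        recent = "\n".join(episodic_memory[-5:])
        subtask = llm(f"Goal: {goal}\nWhat has been done so far:\n{recent}\nNext subtask?")
        # "Basal ganglia" step: generate several candidate actions and pick the best one,
        # instead of going with whatever first comes to mind.
        candidates = [llm(f"Propose one concrete action for: {subtask}") for _ in range(3)]
        best = max(candidates,
                   key=lambda c: _score(llm(f"Rate 0-10 how well this serves '{goal}': {c}")))
        actions_taken.append(best)
        episodic_memory.append(f"Did: {best}")   # one-shot episodic write
        if "DONE" in best.upper():
            break
    return actions_taken
```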
The executive function part is largely covered by RL on CoT in situations where there’s a verifiable correct answer. That’s a route humans don’t have available; our RL is clumsier than that, so we have to notice specific strategies to recognize their value.
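For the verifiable-answer case, the core loop is simple enough to sketch. This is a cartoon of the technique, not any lab’s actual training code; “llm” and “verify” are placeholders.

```python
from typing import Callable, List, Tuple

def collect_rl_traces(llm: Callable[[str], str],
                      problem: str,
                      verify: Callable[[str], bool],
                      n_samples: int = 8) -> List[Tuple[str, float]]:
    """Sample chains of thought and reward only those whose final answer checks out."""
    traces = []
    for _ in range(n_samples):
        cot = llm(f"Solve step by step. End with 'ANSWER: <x>'.\n{problem}")
        answer = cot.rsplit("ANSWER:", 1)[-1].strip()
        reward = 1.0 if verify(answer) else 0.0   # verifiable reward, no human judgment needed
        traces.append((cot, reward))
    return traces   # a trainer would then reinforce the high-reward traces
```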
So in areas where that won’t work, there’s another process of self-directed learning that humans use. This is open to language model cognitive architectures too. Off-the-shelf systems and techniques, including RAG (for one-shot memory) and fine-tuning (for learning important new knowledge and skills), should work in theory, and people are making progress on both.
So I think you’re right about LLMs showing sharply diminishing returns, and I think this hardly slows down progress to human-level AGI at all.
The only major difference in this path is that human-level AGI will still be trammeled by using a human-like approach. It won’t go from genius-level to supergenius to superhuman (at general problem-solving or specific domains) overnight. It could take years to make progress in a more human-like style.
That could even give time for novel techniques/approaches to surpass it.
I’m pretty sure the analogy to human cognition is a very obvious route to progress as people keep trying to develop LLM-based agents to do actual valuable work. So I don’t think there’s any reasonable hope that the many bright-to-brilliant people working on AI wouldn’t take that route if it works.
This is an interesting model, and I know you acknowledged that progress could take years, but my impression is that this would be even more difficult than you’re implying. Here are the problems I see, and I apologize in advance if this doesn’t all make sense as I am a non-technical newb.
Wouldn’t it take insane amounts of compute to process all of this? LLM + CoT already uses a lot of compute (see: o3 solving ARC puzzles for $1mil). Combining this with processing images/screenshots/video/audio, plus using tokens for incorporating saved episodic memories into working memory, plus tokens for the decision-making (basal ganglia) module = a lot of tokens. Can this all fit into a context window and be processed with the amount of compute that will be available? Even if one extremely expensive system could run this, could you have millions of agents running this system for long periods of time?
How do you train this? LLMs are superhuman at language processing due to training on billions of pieces of text. How do you train an agent similarly? We don’t have billions of examples of a system like this being used to achieve goals. I don’t think we have any examples. You could put together a system like this today, but it would be bad (see: Claude playing Pokemon). How does it improve? I think it would have to actually carry out tasks and RL on them. In order for it to improve on long-horizon tasks, it would take long-horizon timeframes to get reinforcement signals. You could run simulations, but will they come anywhere close to matching the complexity of the real world? And then there’s the issue of scalable RL only working for tasks with a defined goal: how would it improve on open-ended problems?
If an LLM is at the core of the system, do hallucinations from the LLM “poison the well,” so to speak? You can give it tools, but if the LLM at the core doesn’t know what’s true or false, how does it use them effectively? I’ve seen examples like this: an LLM got a math problem wrong, then the user asked it to solve the problem using a Python script. The script produced the correct answer, but the LLM just repeated the wrong answer and added a fake floating-point error that it said came from the script. So it seems like hallucinated errors from the LLM could derail the whole system. I suppose that being trained and RL’d on solving tasks would eventually lead it to learn what works from what doesn’t, but I guess I’m just pointing out why adding additional modules and tools doesn’t automatically solve the issues of LLMs.
I don’t think this path is easy, but I think immense effort and money will be directed at it by default, since there’s so much money to be made by replacing human labor with agents. And I think no breakthroughs are necessary, just work in fairly obvious directions. That’s why I think this is likely to lead to human-level agents.
I don’t think it would take insane amounts of compute, but compute costs will be substantial. They’ll be roughly like the costs for OpenAI’s Operator, which runs autonomously, making calls to frontier LLMs and vision models essentially continuously. Costs are low enough that $200/month covers unlimited use (although that thing is useless enough that people probably aren’t using it much, so the compute cost of o1 pro thinking away continuously is probably a better indicator: Altman said $200/month doesn’t quite cover the average, driven by some users keeping as many sessions going constantly as they can).
It can’t all fit into a context window for complex tasks, and it’s costly even when the whole task would fit. That’s why additional memory systems are needed. There are already context-window management techniques in play for existing limited agents. And RAG systems already seem adequate to serve as episodic memory: humans use far fewer memory “tokens” to accomplish complex tasks than the amount of documentation stored in current RAG systems, which are used for non-agentic, retrieval-augmented generation of answers to questions that rely on documented information.
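Here’s roughly what I mean by RAG as episodic memory, as opposed to document retrieval: store short summaries of what the agent just did, and pull the most relevant few back into the context window when needed. The sketch below assumes some sentence-embedding function “embed”; nothing in it is a specific product’s API.

```python
from typing import Callable, List, Tuple
import math

class EpisodicMemory:
    """Store one-shot episode summaries and retrieve them by similarity."""

    def __init__(self, embed: Callable[[str], List[float]]):
        self.embed = embed
        self.episodes: List[Tuple[str, List[float]]] = []

    def store(self, summary: str) -> None:
        # One-shot write: a single sentence about what just happened.
        self.episodes.append((summary, self.embed(summary)))

    def recall(self, query: str, k: int = 5) -> List[str]:
        q = self.embed(query)
        def cosine(a: List[float], b: List[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
            return dot / norm if norm else 0.0
        ranked = sorted(self.episodes, key=lambda ep: cosine(q, ep[1]), reverse=True)
        # These summaries go back into the prompt, not into the weights.
        return [text for text, _ in ranked[:k]]
```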
So I’d estimate something like $20-30 for an agent to run all day. This could come down a lot if you managed to have many of its calls use smaller/cheaper LLMs than whatever is the current latest and greatest.
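A back-of-the-envelope version of that estimate, with every number an assumption for illustration rather than a quoted price:

```python
calls_per_minute = 2              # assume a frontier-model call roughly every 30 seconds
hours_per_day = 10
tokens_per_call = 4_000           # prompt + completion, assumed
price_per_million_tokens = 10.00  # assumed blended $/1M tokens for a frontier model

tokens_per_day = calls_per_minute * 60 * hours_per_day * tokens_per_call
cost_per_day = tokens_per_day / 1_000_000 * price_per_million_tokens
print(f"{tokens_per_day:,} tokens/day -> ${cost_per_day:.2f}/day")
# 4,800,000 tokens/day -> $48.00/day; routing routine calls to smaller models
# is what pulls this down toward the $20-30 range.
```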
Humans train themselves to act agentically by assembling small skills (pick up the food and put it in your mouth, run forward, look for tracks) into long-time-horizon tasks (hunting). We do not learn by performing RL on long sequences and applying the learning to everything we did to get there. We do something like RL, but it’s tightly targeted on specific hypotheses we’ve produced about how to accomplish the task.
Thinking about how humans learn new tasks provides a pretty direct analogy. We make explicit hypotheses about what we need to learn, then specific strategies for learning it. That’s how we do it as adults; the LLM’s pretraining gives roughly adult-level performance on simple tasks that were well represented in the training set.
Claude playing Pokemon is a great illustration: it’s bad in large part because it has no episodic memory. It wouldn’t be great even with one; it would also need a real self-directed learning process. People have only barely started to implement these (to my limited knowledge; some company in stealth mode might be well along, but I doubt it, since they’d need to be public to get adequate funding for real progress).
Hallucinations are much less of an issue in current-gen LLMs than in older generations, but they’re still an issue. Agents would need to do what humans do: ask themselves “Am I sure? How can I check?” for important pieces of information. The human brain hallucinates just like LLMs do if you go with the first answer that springs to mind, as LLMs usually do. You need to implement a routine for deciding which knowledge is important, and for using multiple sources of information and thinking to check whether it’s right. Humans do this only by learning cognitive strategies; kids do accept hallucinations, and so are pretty useless for getting things done :), just like current LLM agents.
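A sketch of that “Am I sure? How can I check?” routine, in the spirit of your Python-script example: the point is that the tool’s output overrides the model’s first guess instead of being rationalized away. “llm” and “run_python” are placeholders for whatever model and sandboxed interpreter the agent actually has.

```python
from typing import Callable

def verified_answer(llm: Callable[[str], str],
                    run_python: Callable[[str], str],
                    question: str) -> str:
    draft = llm(f"Answer concisely: {question}")
    stakes = llm(f"Is getting this exactly right important? Reply yes or no: {question}")
    if "yes" not in stakes.lower():
        return draft                               # low stakes: first answer is fine
    code = llm(f"Write a short Python script that computes the answer to: {question}\n"
               f"Print only the final result.")
    tool_result = run_python(code).strip()
    # Key move: defer to the verified result rather than the original draft.
    if tool_result and tool_result not in draft:
        return llm(f"The verified result is {tool_result}. "
                   f"Restate the answer to '{question}' using that result.")
    return draft
```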
It won’t go from genius-level to supergenius to superhuman (at general problem-solving or specific domains) overnight. It could take years to make progress in a more human-like style.
But what about AI’s speed advantage? At 100x-1000x faster, years become days to weeks. Compute for experiments is plausibly a bottleneck that makes it take longer, but at genius human level, decades’ worth of human theory and software-development progress (things not bottlenecked on experiments) will be made by AIs in months. That should go a long way toward making years of physical time unnecessary for unlocking more compute-efficient and scalable ways of creating smarter AIs.
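Quick arithmetic on that compression, with purely illustrative numbers:

```python
for speedup in (100, 1000):
    for human_years in (2, 10):
        days = human_years * 365 / speedup
        print(f"{human_years} human-years at {speedup}x -> {days:.1f} days")
# e.g. 2 human-years at 100x -> 7.3 days; 10 human-years at 100x -> 36.5 days (~5 weeks)
```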
Yes, probably. The progression thus far is that the same level of intelligence gets more efficient—faster or cheaper.
I actually think current systems don’t really think much faster than humans—they’re just faster at putting words to thoughts, since their thinking is more closely tied to text. But if they don’t keep getting smarter, they will still likely keep getting faster and cheaper.
I kinda agree with this as well. Except that it seems completely unclear to me whether recreating the missing human capabilities/brain systems takes two years or two decades or even longer.
It doesn’t seem to me to be a single missing thing, and for each separate step the same point holds: the fact that it hasn’t been done yet is evidence that it’s not that easy.