I think this fits your framing, and it’s something I left out of my long comment on Egg’s other excellent post on LLM generality: LLM agents don’t have to be that good at full generality to outperform people. I don’t think humans truly do that well at it either.
Limited generality can cover arbitrarily large portions of task space. And we’re not looking for the most efficient algorithm here; we’re looking for the first available route to exceeding human capabilities within reasonable per-cognition end-user budgets.
To outmatch humans in every way and pose an x-risk, they’ve got to advance toward full generality as humans do, by solving totally novel problems when necessary.
I think humans almost never do true out-of-distribution generalization. We learn and deploy abstract concepts that turn what was formerly out-of-training-set into within-training-set. Usually, we learn those concepts from another human. Once in a while, we derive our own new abstract concepts.
LLMs can’t do this yet. But it might not be hard to scaffold them to be as good at it as humans are, because we’re not as good as we’d like to imagine.
Pattern-matching reasoning, using problem-solving formulas we’ve learned from others, covers the vast majority of important tasks in the current world. So even if they’re not fully general, LLMs might exceed human capabilities on most tasks. And they might be scaffolded to derive and deploy new concepts as well as, or better than, humans do. We don’t actually succeed at that very often ourselves.
I think humans only do this (come up with genuinely new concepts, and thereby reason in totally novel domains or achieve full generality) maybe a few times in a lifetime, and certainly not daily. We did not understand Newtonian physics easily, despite the readily available data, nor quantum physics once the relevant data was available. If you watch a young child work to match shapes to the holes in those toys, you will be either horrified or amused. It takes them a very long time (weeks, not hours) to understand what looks drop-dead simple to us, because we’ve already learned the relevant problem-solving algorithm.
We’re not as smart as we’d like to think. We can arrive at genuine new insights, but the process is clumsy. We make lots of wrong guesses at useful new concepts, and they’re almost always recombinations of old concepts (and thereby probably describable in language—not that foundation model agents are strictly limited to that sort).
What we do better than LLMs is test our clumsy guesses against data. But that cognitive process might be drop-dead easy to add with scaffolding.
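To make that concrete, here’s a minimal sketch of what such a propose-and-test scaffold could look like, under the assumption that the LLM supplies candidate concepts and ordinary code scores them against held-out data. Every name here (`derive_concept`, `propose`, `test`) is hypothetical, not any real library’s API; the point is only that the outer loop is plain code, not new machine learning.

```python
# Hypothetical propose-and-test scaffold: the LLM guesses a candidate abstract
# concept, ordinary code checks that guess against held-out data, and failures
# are fed back as context for the next guess. Illustrative only.

from typing import Callable, Optional

def derive_concept(
    propose: Callable[[list[dict], list[str]], str],  # LLM call: examples + past failures -> candidate concept
    test: Callable[[str, list[dict]], float],         # scores how well a concept explains held-out data
    train_examples: list[dict],
    holdout_examples: list[dict],
    threshold: float = 0.95,
    max_attempts: int = 20,
) -> Optional[str]:
    """Propose a candidate concept, test it against data, and keep the
    failures as feedback. Returns the first concept that passes, or None."""
    failures: list[str] = []
    for _ in range(max_attempts):
        candidate = propose(train_examples, failures)
        score = test(candidate, holdout_examples)
        if score >= threshold:
            return candidate  # good enough: the "new concept" becomes reusable context
        failures.append(f"{candidate} (score {score:.2f})")
    return None
```

Whether the propose step ever produces a genuinely useful concept is the hard part; the testing loop itself is the easy bit, which is the point.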
And if that turns out not to be easy, what about asking a human for a hand with the few novel problems for which no strategy is written down anywhere? Solving 99% of the problems might yield nearly a 100x speedup in productivity (if the automated 99% take negligible time, total time shrinks to roughly the 1% the human still handles), including in AGI research.
For other reasons, some given in my response to Egg’s framing, others in my tangential deleted comment (which I’ll try to turn into a quick take, since it got off topic for your excellent question), I think language model cognitive architectures are quite likely to achieve full AGI. That’s a very different, and actually less scary, prospect than one in which they never achieve competent autonomous AGI but still speed up AGI research by 100x within a few years.
If that second scenario happens, we’ll get a different sort of AGI, one that’s probably not going to have translucent thoughts or to act and think based on core goals we gave it in plain English sentences. Those properties aren’t a guaranteed win, but they seem like huge advantages over trying to align an AGI without them. So I’m leaning toward hoping language models make it, even though that route likely means faster timelines.