But I think there’s still a good chance that this is the fastest and most obvious route to AGI.
Agreed that it’s quite plausible that LLMs with scaffolding basically scale to AGI. Mostly I’m just arguing that it’s an open question with important implications for safety & in particular timelines.
One is that the biggest blocker to commercial usefulness wasn’t reasoning ability; it was the ability to correctly interpret a webpage or other software.
I’m very skeptical of this with respect to web pages. Some pages include images (eg charts) that are necessary to understand the page content, but for many or most pages, the important content is text in an HTML file, and we know LLMs handle HTML just fine (since they can easily create it on demand).
The second is the possibility that GPT-4 and the current generation just aren’t quite smart enough to make scaffolded System 2 reasoning work.
Agreed, this seems like a totally live possibility.
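(For concreteness, here is a minimal sketch of what “scaffolded System 2 reasoning” might look like in practice: an outer loop that makes the model draft, critique, and revise rather than answer in one pass. This is only illustrative; `call_llm` and the prompts are hypothetical placeholders, not any particular system’s implementation.)

```python
# Illustrative sketch of a "System 2" scaffold: an outer loop that makes an
# LLM draft, critique, and revise before committing to an answer.

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever model API the scaffold wraps."""
    raise NotImplementedError

def system2_answer(task: str, max_rounds: int = 3) -> str:
    draft = call_llm(f"Task: {task}\nGive your best answer.")
    for _ in range(max_rounds):
        critique = call_llm(
            f"Task: {task}\nProposed answer: {draft}\n"
            "List concrete errors or gaps, or reply OK if there are none."
        )
        if critique.strip().upper().startswith("OK"):
            break  # the critique pass found nothing to fix; accept the draft
        draft = call_llm(
            f"Task: {task}\nPrevious answer: {draft}\nCritique: {critique}\n"
            "Write an improved answer."
        )
    return draft
```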
So I’m thinking that alignment people should actually help make scaffolded System 2 reasoning work, which is a pretty radical proposal relative to most alignment thought.
Personally I’d have to be a lot more confident that alignment of such systems just works to favor alignment researchers advancing capabilities; to me having additional time before AGI seems much more clearly valuable.
I was also surprised that interpreting webpages was a major blocker. They’re in text and HTML, as you say.
I don’t remember who said this, but I remember believing them since they’d actually tried to make useful agents. They said that actual modern webpages are such a flaming mess of complex HTML that the LLMs get confused easily.
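(As an aside on why raw pages trip agents up: a production webpage is mostly scripts, styling, and nested layout markup, so agent scaffolds generally preprocess it before the LLM ever sees it. Below is a minimal sketch of that kind of cleanup step, assuming the standard `requests` and `beautifulsoup4` libraries; it isn’t taken from any particular agent.)

```python
# Minimal sketch of the preprocessing most web agents rely on: strip scripts,
# styles, and markup noise so the model sees something close to the visible text.
import requests
from bs4 import BeautifulSoup

def page_to_text(url: str) -> str:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()  # remove elements that carry no visible content
    # Collapse whitespace so the model isn't fed walls of blank lines.
    return " ".join(soup.get_text(separator=" ").split())

# Usage (hypothetical URL): print(page_to_text("https://example.com")[:500])
```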
Your last point, whether it’s preferable to steer toward easier-to-align AGI or to have more time to work on alignment, is a very complex issue. I don’t have a strong opinion since I haven’t worked through it all. But I think there are very strong reasons to think LLM-based AGI is far easier to align than other forms, particularly if the successful approach doesn’t heavily rely on RL. So I think your opinion is in the majority, but nobody has worked it through carefully enough to have a really good guess. That’s a project I’d like to embark on by writing a post making the controversial suggestion that maybe we should be actively building LMA (language model agent) AGI as the safest of a bad set of options.
I also think we’ll get substantial info about the feasibility of LMA in the next six months. Progress on ARC-AGI will tell us a lot about LLMs as general reasoners, I think (and Redwood’s excellent new work on ARC-AGI has already updated me somewhat toward this not being a fundamental blocker). And I think GPT-5 will tell us a lot. ‘GPT-4 comes just short of being capable and reliable enough to work well for agentic scaffolding’ is a pretty plausible view. If that’s true, then we should see such scaffolding working a lot better with GPT-5; if it’s false, then we should see continued failures to make it really work.
I realized I didn’t really reply to your first point, and that it’s a really important one.
We’re in agreement that scaffolded LLMs are a possible first route to AGI, but not a guaranteed one.
If that’s the path, timelines are relatively short.
If that’s even a possibility, we’d better have alignment solutions for that path, ASAP.
That’s why I continue to focus on aligning LMAs.
If some other path to AGI turns out to come first, timelines are probably a little longer, so we’ve got a little more time to work on alignment for those types of systems. And there are more people working on RL-based alignment schemes (I think?).
I think that post, making the case for LMA AGI as the safest of a bad set of options, would be really valuable!
And I’m glad you’re working on aligning LMAs! I think your arguments are plausible and your approach is potentially promising.