I think that’d be a really valuable post!
I also think we’ll get substantial info about the feasibility of LMA in the next six months. I think progress on ARC-AGI will tell us a lot about LLMs as general reasoners (and Redwood’s excellent new work on ARC-AGI has already updated me somewhat toward this not being a fundamental blocker). I think GPT-5 will tell us a lot too. ‘GPT-4 comes just short of being capable and reliable enough to work well for agentic scaffolding’ is a pretty plausible view. If that’s true, then we should see such scaffolding working a lot better with GPT-5; if it’s false, then we should see continued failures to make it really work.