Nathan Helm-Burger comments on LLM Generality is a Timeline Crux

Nathan Helm-Burger 22 Aug 2024 17:10 UTC
3 points
Yes, I agree our views are quite close. My expectations closely match what you say here:
Although LLMs badly suck at reasoning, my AGI timelines are still kinda short―roughly 1 to 15 years for “real” AGI, with quasi-AGI in 2 to 6 years―mainly because so much funding is going into this, and because only one researcher needs to figure out the secret, and because so much research is being shared publicly, and because there should be many ways to do AGI, and because quasi-AGI (if invented first) might help create real AGI.
Basically I just want to point out that the progression of competence in recent models seems pretty impressive, even though the absolute values are low.
For instance, for writing code I think the following pattern of models (including only ones I’ve personally tested enough to have an opinion) shows a clear trend of increasing competence with later release dates:
GitHub Copilot (pre-GPT-4) < GPT-4 (the first release) < Claude 3 Opus < Claude 3.5 Sonnet
Basically, I’m holding in my mind the possibility that the next versions (GPT-5 and/or Claude Opus 4) will really impress me. I don’t feel confident of that. I am pretty confident that the version after next (e.g. GPT-6 / Claude Opus 5) will impress me and actually be useful for RSI (recursive self-improvement).
From this list, Claude 3.5 Sonnet is the first one competent enough that I find it even occasionally useful. I made myself use the others just to get familiar with their abilities, but their outputs weren’t worth the time and effort on average.