The new Moore’s Law for AI Agents (aka More’s Law) accelerated around the time people in research roles started talking a lot more about getting value from AI coding assistants. AI accelerating AI research seems like the obvious interpretation, and if true, the new exponential is here to stay. This gets us to 8 hour AIs in ~March 2026, and 1 month AIs around mid 2027.[1]
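As a rough back-of-the-envelope sketch of that extrapolation (the starting horizon and the ~4-month doubling time below are illustrative assumptions on my part, not METR’s fitted values):

```python
# Back-of-the-envelope extrapolation of the task-horizon trend.
# The reference date, the ~1 hour starting horizon, and the ~4-month
# doubling time are illustrative assumptions, not fitted values.
import math
from datetime import date, timedelta

start_date = date(2025, 3, 1)      # assumed reference point
start_horizon_hours = 1.0          # assumed task horizon at the reference point
doubling_months = 4.0              # assumed (accelerated) doubling time

def date_when_horizon_reaches(target_hours: float) -> date:
    """Project the date at which the horizon reaches target_hours,
    assuming a clean exponential with the doubling time above."""
    doublings = math.log2(target_hours / start_horizon_hours)
    return start_date + timedelta(days=doublings * doubling_months * 30.44)

for label, hours in [("8 hour AIs", 8), ("1 month AIs (~167 work hours)", 167)]:
    print(label, date_when_horizon_reaches(hours))
```

With those assumptions the 8-hour mark lands around early 2026 and the one-month mark in the second half of 2027; nudging either assumption shifts the dates by months, so the exact dates shouldn’t be taken too literally.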
I do not expect humanity to retain relevant steering power for long in a world with one-month AIs. If we haven’t solved alignment, either iteratively or once-and-for-all[2], it’s looking like game over unless civilization ends up tripping over its shoelaces and we’ve prepared.
An extra speed-up of the curve could well happen, for example with [obvious capability idea, nonetheless redacted to reduce speed of memetic spread].
From my bird’s eye view of the field, having at least read the abstracts of a few papers from most organizations in the space, I would be quite surprised if we had what it takes to solve alignment in the time that graph gives us. There aren’t enough people, and most of them aren’t working on things that are even trying to align a superintelligence.
Note the error bars in the original.
My own experience is that if-statements are 3.5’s Achilles heel and 3.7 is somehow even worse (when it’s “almost” right, that’s worse than useless; it’s like reviewing pull requests when you don’t know if it’s an adversarial attack or if they mean well but are utterly incompetent in interesting, hypnotizing ways)… and that METR’s baselines resemble a Skinner box more than programming (though many people do have that kind of job, I just don’t find the conditions of the gig economy “humane” or representative of how “value” is actually created). And then there’s the sheer disconnect between what I would call “productive”, “useful projects”, “bottlenecks”, and “what I love about my job and what parts I’d be happy to automate”, versus the completely different answers on How Much Are LLMs Actually Boosting Real-World Programmer Productivity?, even from people I know personally...
I find this graph indicative of how “value” is defined by the SF investment culture and disruptive economy… and I hope the AI investment bubble will collapse sooner rather than later...
But even if the bubble collapses, automating intelligence will not be undone, it won’t suddenly become “safe”, and the incentives to create real AGI instead of overhyped LLMs will still exist. The danger is not in the presented economic curve going up; it’s in what economic actors see as potential, in how incentivized corporations and governments are to search for the thing that is both powerful and dangerous, no?