I think the main reason governments may fail to take control (no comment on keeping it) is that TAI might be both the first effective wakeup call and the point when it’s too late to take control. It can be too late if there is already too much proliferation: sufficient-if-not-optimal code, theory, and models are already widely available, and compute sufficient to compete with potential government projects is already abundant and impossible to take down adequately. So even if the first provider of TAI is taken down, in a year everyone has TAI, and the government fails to take sufficient advantage of its year of lead time to dissuade the rest of the world.
The alternative where government control is more plausible is first making a long-horizon task capable AI that can do many jobs, but can’t itself do research or design AIs, and a little bit of further scaling or development isn’t sufficient to get there. The economic impact then acts as a wakeup call, but the AI itself isn’t yet a crucial advantage, can be somewhat safely used by all sides, and doesn’t inevitably lead to ASI a few years later. At this point governments might get themselves a monopoly on serious compute, so that any TAI projects would need to go through them.
I agree that this might happen too fast to develop a Manhattan Project, but do you really see a way the government fails to even seize effective control of AGI once it’s developed? It’s pretty much their job to manage huge security concerns like the one AGI presents, even if they kept their hands off the immense economic potential. The scenarios in which the government just politely stands aside, or doesn’t notice even when it sees human-level systems with its own eyes, seem highly unlikely to me.
Seizing control of a project while it’s taking off from roughly human to superhuman level isn’t as good as taking control of the compute to build it, but it’s better than nothing, and it feels like the type of move governments often make. They don’t even need to be public about it, just show up and say “hey, let’s work together so nobody needs to discuss laws around sharing security-breaking technology with our enemies”.
It depends on how much time there is between the first impactful demonstration of long-horizon task capabilities (doing many jobs) and the commoditization of research capable TAI, even with governments waking up during this interval and working to extend it. It might be that by default this is already at least a few years, and if the bulk of compute is seized, it extends even further. This seems to require long-horizon task capabilities to be found at the limits of scaling, and TAI to be significantly further still.
But we don’t know until it’s tried whether even a $3 billion training run might already enable long-horizon task capabilities (with appropriate post-training, even if that arrives a bit later), and we don’t know whether the first long-horizon task capable AI might immediately be capable of research, with no need for further scaling (even if scaling would help). And if it’s not immediately obvious how to elicit these capabilities with post-training, there will be an overhang of sufficient compute and sufficiently strong base models in many places before the alarm is sounded. If enough of these things align, there won’t be time for anyone to prevent prompt commoditization of research capable TAI. And then there’s ASI 1-2 years later, with the least possible time for anyone to steer any of this.
long-horizon task capable AI that can do many jobs, but can’t itself do research or design AIs, and a little bit of further scaling or development isn’t sufficient to get there
This seems very unlikely to be possible. You didn’t say it was likely, but I think it’s worth pointing out that I have a hard time imagining this existing. I think a long-horizon task capable AI with otherwise similar capabilities to current LLMs would be quite capable of researching and creating stronger AIs. I think we are very close indeed to the threshold of capability beyond which recursive improvement will be possible.
The relevant distinction is between compute that proliferated before there were long-horizon task capable AIs, and compute that’s necessary to train autonomous researcher AIs. A lot of compute might even be needed to maintain their ability to keep working on novel problems, since an AI trained on data that didn’t include the very recent progress might be unable to make further progress, and continued training isn’t necessarily helpful enough compared to full retraining, so that stolen weights would be relatively useless for getting researcher AIs to do deep work.
There are only 2-3 OOMs of compute scaling left to explore if the capabilities of AIs don’t dramatically improve, and LLMs at the current scale robustly fail at long-horizon tasks. If AIs don’t become very useful at something, there won’t be further OOMs until many years pass and there are larger datacenters, possibly well-tested scalable asynchronous distributed training algorithms, more energy-efficient AI accelerators, more efficient training, and ways of generating more high quality data. Now imagine that long-horizon task capable AIs are developed just before or even during this regime of stalled scaling: it takes more than a year, $100 billion, and 8 gigawatts to train one, and it works only barely well enough to unlock the extreme value of cheap and fast autonomous digital workers capable of routine jobs, going through long sequences of unreliable or meandering reasoning but eventually catching the systematic problems in a given train of thought and recovering well enough to do their thing. And further scaling from a new investment boom still fails to produce a researcher AI, as that might take another 2-3 OOMs, and we are out of AI accelerators and gigawatts for the time being.
In this scenario, which seems somewhat plausible, governments finally notice the astronomical power of AI and have multiple years to get all large quantities of compute under control, so that the compute available for arbitrary non-government use falls somewhat below what it takes to train even a barely long-horizon task capable AI that’s not at all a researcher. Research-capable TAI then by default won’t appear in all these years, and once the transition to centralized control over compute is complete, further progress toward such AI can only happen under government control.
By Leopold’s detailed analysis of the ongoing rate of increase in effective training compute, ~40% of the growth comes from increased willingness to invest more money, ~20% from Moore’s Law, and ~40% from algorithmic improvements. As you correctly point out, the current size of the economy (before any TAI-caused growth spikes) puts a pretty clear upper bound on how long the first factor can continue, probably not very long after 2027. Moore’s Law has fairly visibly been slowing for a while (admittedly perhaps less so for GPUs than CPUs, as they’re more parallelizable): likely it will continue to slow gradually, at least until there is some major technological leap. Algorithmic improvements must eventually hit diminishing returns, but recent progress suggests to Leopold (and to me) that there’s still plenty of low-hanging fruit. If one or two of those three contributing factors stops dead in the next few years, the remaining AGI timeline at that point stretches out by roughly a factor of two (unless the only factor left is Moore’s Law, in which case it stretches roughly five-fold, but this seems the least plausible combination to me). So, for example, suppose Leopold is wrong about GPT-6 being AGI and it’s actually GPT-7 (a fairly plausible inference from extrapolating on his own graph with straight lines and bands on it), so that at steady rates of effective compute growth we would hit it in 2029 rather than the 2027 he suggests, but we run out of willingness or capacity to invest more money in 2028. Then that roughly factor-of-two slowdown only pushes AGI out a year, to 2030 (not a difference that anyone with a high P(DOOM) is going to be very relieved by).
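To make the arithmetic explicit, here is a rough sketch of the calculation in Python. The 40/20/40 split is the one above; the 2024 baseline year, the particular stall years, and the function name are just my illustrative assumptions, not anything from Leopold:

```python
# Rough timeline arithmetic for the scenario above (illustrative only).
# Assumes effective compute grows exponentially, with the growth rate split
# ~40% investment, ~20% Moore's Law, ~40% algorithmic improvements,
# and that a stalled factor simply stops contributing from its stall year on.

BASELINE_YEAR = 2024   # assumed reference point
FULL_RATE = 1.0        # normalized growth rate with all three factors active
SHARES = {"investment": 0.4, "moore": 0.2, "algorithms": 0.4}

def year_threshold_reached(threshold_year_at_full_rate, stall_year, stalled_factors):
    """Year the AGI-level effective compute threshold is reached if some
    factors stop contributing from stall_year onward."""
    distance = threshold_year_at_full_rate - BASELINE_YEAR   # in full-rate years
    progress_before_stall = min(stall_year - BASELINE_YEAR, distance)
    remaining = distance - progress_before_stall
    if remaining <= 0:
        return threshold_year_at_full_rate
    remaining_rate = FULL_RATE - sum(SHARES[f] for f in stalled_factors)
    return stall_year + remaining / remaining_rate

# GPT-7-level threshold reached in 2029 at full rate, investment stalls in 2028:
print(year_threshold_reached(2029, 2028, ["investment"]))                 # ~2029.7, i.e. ~2030
# Same threshold, but only Moore's Law keeps contributing (5x slowdown):
print(year_threshold_reached(2029, 2028, ["investment", "algorithms"]))   # 2033.0
```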
[I think seeing how much closer GPT-5 feels to AGI compared to GPT-4 may be very informative here: I’d hope to get an impression fairly fast after it comes out of whether it feels like we’re now half-way there, or only a third of the way. Of course, that won’t include a couple of years’ worth of scaffolding improvements or other things in the category Leopold calls “unhobbling”, so our initial estimate may be an underestimate. But then we may also be underestimating difficulties around more complex or abstract but important things like long-term planning ability and the tradeoff considerations involved in the experimental-design part of the scientific method; the former is something GPT-4 is so bad at that I rather suspect we’re going to need an unhobbling there. Planning ability seems plausible as something evolution might have rather heavily optimized humans for.]
You appear to be assuming either that increasing investment is the only factor driving OOM increases in effective compute, or that all three factors will stop at the same time.
The question is whether research capable TAI can lag behind government-alarming long-horizon task capable AI (that does many jobs and so even Robin Hanson starts paying attention). These are two different thresholds that might both be called “AGI”, so it’s worth making a careful distinction. Even if it turns out that in practice they coincide and the same system becomes the first to qualify for both, for now we don’t know whether that’s the case, and conceptually they are different.
If this lag is sufficient, governments might be able to lock down enough compute to prevent independent development of research capable TAI for many more years. This includes stopping or even reversing improvements in AI accelerators. If governments only become alarmed once there is a research capable TAI, we get the other possibility, where TAI is developed by everyone very quickly and the opportunity to do it more carefully is lost.
Increasing investment is the crucial consideration in the sense that if research capable TAI is possible with modest investment, then there is no preventing its independent development. But if the necessary investment turns out to be sufficiently outrageous, controlling the development of TAI by controlling hardware becomes feasible. Advances in hardware are easy to control if most governments are alarmed, since the supply chains are large and so are the datacenters. And algorithmic improvements have a sufficiently low ceiling to keep what would otherwise be $10 trillion training runs infeasible for independent actors even when done with better methods. The hypothetical I was describing has research capable TAI 2-3 OOMs above the $100 billion needed for long-horizon task capable AI, a feasibility barrier that can survive some algorithmic improvements.
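Spelling the arithmetic out (the 10x algorithmic-efficiency gain below is a made-up number, just to illustrate that the barrier survives improvements of that size):

```python
# Back-of-the-envelope for the feasibility barrier in the hypothetical above.
long_horizon_cost = 100e9            # ~$100 billion long-horizon task capable AI run
for ooms in (2, 3):                  # research capable TAI assumed 2-3 OOMs further
    raw = long_horizon_cost * 10**ooms
    with_algo_gain = raw / 10        # even granting a generous 10x efficiency gain
    print(f"{ooms} OOMs further: ${raw / 1e12:.0f}T raw, "
          f"${with_algo_gain / 1e12:.0f}T after a 10x algorithmic improvement")
```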
I also think the improvements themselves are probably running out. There’s only been about a 5x improvement across all these years for the dense transformer, plus a significant improvement from MoE and possibly some improvement from Mixture of Depths. All attention alternatives remain in the ballpark despite having very different architectures. Something significantly non-transformer-like is probably necessary to get more OOMs of algorithmic progress, which is also the case if LLMs can’t be scaled to research capable TAI at all.
(Recent unusually fast improvement in hardware was mostly driven by moving to lower precision: first BF16, then FP8 with H100s, and now Microscaling formats (FP4, FP6) with Blackwell. This process is now essentially at an end, so further low-level hardware improvement will be slower. But this point, unlike the one about algorithmic improvements, is irrelevant to the argument, since improvement in the hardware available to independent actors can be stopped or reversed by governments, while algorithmic improvements cannot.)
I also think the improvements themselves are probably running out.
I disagree, though this is based on some guesswork (and Leopold’s analysis, as a recent ex-insider). I don’t know exactly how they’re doing it (improvements in training data filtering are probably part of it), but the foundation model companies have all been putting out models with lower inference costs and latencies for the same capability level (OpenAI: GPT-4 Turbo and GPT-4o vs. GPT-4; Anthropic: Claude 3.5 Sonnet vs. the Claude 3 generation; Google: Gemini 1.5 vs. 1). I am assuming that the reason for this performance improvement is that the newer models actually have lower parameter counts (which is supported by some rumored parameter count numbers), and I’m then also assuming that this means they took less total compute to train. (The latter assumption would be false for smaller models trained via distillation from a larger model, as some of the smaller Google models almost certainly are, or heavily overtrained by Chinchilla standards, as has recently become popular for models that are not the largest member of a model family.)
Things like the effectiveness of model pruning methods suggest that there are a lot of wasted parameters inside current models, which implies there’s still a lot of room for performance improvements. The huge context lengths that foundation model companies are now advertising without huge cost differentials also rather suggest something architectural has happened there, beyond the quadratic-cost full attention of classical transformers. What combination of techniques, from the academic literature or not, that’s based on is unclear, but clearly something has improved there.
Algorithmic improvements relevant to my argument are those that happen after long-horizon task capable AIs are demonstrated; in particular, it doesn’t matter how much progress is happening now, other than as evidence about what happens later.
heavily overtrained by Chinchilla standards
This is necessarily part of it. It involves using more compute, not less, which is natural given that new training infrastructure is coming online, and it doesn’t need any algorithmic improvements at all to produce models that are both cheaper for inference and smarter. You can take a Chinchilla-optimal model, make it 3x smaller and train it on 9x the data, expending 3x more compute, and get approximately the same result. If you up the compute and data a bit more, the model will become more capable. Some current improvements are probably due to better use of pre-training data, but these things won’t survive significant further scaling intact. There are also improvements in post-training, but they are even less relevant to my argument, assuming they are not lagging behind too badly in unlocking the key thresholds of capability.
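For a rough sanity check of that claim, here is a sketch using the scaling-law form L(N, D) = E + A/N^alpha + B/D^beta with the fitted constants reported for Chinchilla (Hoffmann et al. 2022). I’m treating the constants and the 70B-parameter / 1.4T-token baseline as approximate; the point is only that near the compute-optimal ratio the two loss terms are comparable, so trading parameters for data barely moves the loss:

```python
# Chinchilla-style loss: L(N, D) = E + A / N**alpha + B / D**beta
# Constants as reported in Hoffmann et al. (2022); treat them as approximate.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, D):
    return E + A / N**alpha + B / D**beta

def train_flops(N, D):
    return 6 * N * D   # standard approximation for training compute

# Roughly the Chinchilla budget: ~70B parameters, ~1.4T tokens.
N0, D0 = 70e9, 1.4e12
# The overtrained variant from the comment: 3x smaller, 9x more data.
N1, D1 = N0 / 3, D0 * 9

print(loss(N0, D0), train_flops(N0, D0))   # ~1.94 loss at ~5.9e23 FLOPs
print(loss(N1, D1), train_flops(N1, D1))   # ~1.90 loss at ~1.8e24 FLOPs (3x compute),
                                           # but ~3x cheaper per token at inference
```

With these particular constants the overtrained variant actually comes out slightly ahead rather than exactly equal, which only reinforces the point: cheaper-and-smarter models can fall out of spending more compute this way, with no algorithmic progress required.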
Algorithmic improvements relevant to my argument are those that happen after long-horizon task capable AIs are demonstrated; in particular, it doesn’t matter how much progress is happening now, other than as evidence about what happens later
My apologies, you’re right, I had misunderstood you, and thus we’ve been talking at cross-purposes. You were discussing
…whether research capable TAI can lag behind government-alarming long-horizon task capable AI (that does many jobs and so even Robin Hanson starts paying attention)
while I was instead talking about how likely it is that running out of additional money to invest would slow our reaching either of these forms of AGI (which I personally expect to arrive quite close together, as Leopold also assumes) by enough to make more than a year or two’s difference.