Please stop appealing to compute overhang. In a world where AI progress has wildly accelerated chip manufacture, this already-tenuous argument has become ~indefensible.
I tried to make a similar argument here, and I’m not sure it landed. I think the argument has since demonstrated even more predictive validity with e.g. the various attempts to build and restart nuclear power plants, directly motivated by nearby datacenter buildouts, on top of the obvious effects on chip production.
I’ve just read this post and the comments. Thank you for writing that; some elements of the decomposition feel really good, and I don’t know that they’ve been done elsewhere.
I think discourse around this is somewhat confused, because you actually have to do some calculation on the margin, and need a concrete proposal to do that with any confidence.
The straw-Pause rhetoric is something like “Just stop until safety catches up!” The overhang argument is usually deployed (as it is in those comments) to the effect of ‘there is no stopping.’ And yeah, in this calculation, there are in fact marginal negative externalities to the implementation of some subset of actions one might call a pause. The straw-Pause advocate really doesn’t want to look at that, because it’s messy to entertain counter-evidence to your position, especially if you don’t have a concrete enough proposal on the table to assign weights in the right places.
Because it’s so successful against straw-Pausers, the anti-pause people bring in the overhang argument like an absolute knockdown, when it’s actually just a footnote: a prompt to double-check the numbers and make sure your pause proposal avoids slipping into some arcane failure mode that ‘arms’ overhang scenarios. That it’s received as a knockdown is reinforced by the gearsiness of actually having numbers (and most of these conversations about pauses are happening in the abstract, in the absence of, e.g., draft policy).
But… just because your interlocutor doesn’t have the numbers at hand, doesn’t mean you can’t have a real conversation about the situations in which compute overhang takes on sufficient weight to upend the viability of a given pause proposal.
You said all of this much more elegantly here:
Arguments that overhangs are so bad that they outweigh the effects of pausing or slowing down are basically arguing that a second-order effect is more salient than the first-order effect. This is sometimes true, but before you’ve screened this consideration off by examining the object-level, I think your prior should be against.
...which feels to me like the most important part. The burden is on folks introducing an argument from overhang risk to prove its relevance within a specific conversation, rather than just introducing the adversely-gearsy concept to justify safety-coded accelerationism and/or profiteering. Everyone’s prior should be against actions Waluigi-ing, by default (while remaining alert to the possibility!).
To whom are you talking?
Folks using compute overhang to 4D-chess their way into supporting actions that differentially benefit capabilities.
I’m often tempted to comment this in various threads, but it feels like a rabbit hole, it’s not an easy one to convince someone of (because it’s an argument they’ve accepted for years), and I’ve had relatively little success talking about this with people in person (there’s some change I should make in how I’m talking about it, I think).
More broadly, I’ve started using quick takes to catalog random thoughts, because sometimes when I’m meeting someone for the first time, they have heard of me, and are mistaken about my beliefs, but would like to argue against their straw version. Having a public record I can point to of things I’ve thought feels useful for combatting this.
While I’m not a general fan of compute overhang, I do think that it’s at least somewhat relevant in worlds where AI pauses are very close to when a system is able to automate at least the entire AI R&D process, if not the entire AI economy itself, and I do suspect realistic pauses imposed by governments will likely only come once a massive amount of people lose their jobs, which can create incentives to go to algorithmic progress, and even small algorithmic progress might immediately blow up the pause agreement crafted in the aftermath of many people losing their jobs.
I think it would be very helpful to me if you broke that sentence up a bit more. I took a stab at it but didn’t get very far.
Sorry for my failure to parse!
Basically, my claim in short: conditional on an AI pause happening because of massive job losses from AI that is only barely unable to take over the world, even small savings in compute from better algorithms (because algorithmic research isn’t banned) would incentivize more algorithmic research. That research then lowers the compute required enough to make the pause untenable, and the AI takes over the world.
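To make that mechanism concrete, here is a minimal sketch of how a fixed compute cap erodes under continued algorithmic progress. Every number in it is a hypothetical assumption rather than anything claimed in this thread: the cap, the compute the dangerous system needs at pause time, and the idea that algorithmic efficiency doubles on a fixed schedule.

```python
import math


def months_until_cap_is_ineffective(required_flop, cap_flop, doubling_months):
    """Months of algorithmic progress until a system that needs
    `required_flop` at pause time can be trained under a fixed `cap_flop`.

    Assumption (hypothetical): algorithmic efficiency doubles every
    `doubling_months`, so the compute required halves on that schedule.
    """
    if required_flop <= cap_flop:
        return 0.0  # already trainable under the cap
    gap = required_flop / cap_flop  # efficiency multiple still needed
    return math.log2(gap) * doubling_months


# Hypothetical scenarios: a system "barely" over the cap vs. one far over it,
# with efficiency assumed to double every 12 months.
for label, required in [("1.5x over cap", 1.5e26), ("100x over cap", 1e28)]:
    months = months_until_cap_is_ineffective(required, 1e26, 12)
    print(f"{label}: ~{months:.0f} months")
```

The point of the sketch is only that the time a compute cap buys scales with the logarithm of the gap, so under these assumptions a pause that starts when systems are barely below the dangerous threshold buys much less time than one that starts early.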
So for this argument to be worth bringing up in some general context where a pause is discussed, the person arguing it should probably believe:
1. We are far and away most likely to get a pause only as a response to unemployment.
2. An AI that precipitates pause-inducing levels of unemployment is inches from automating AI R&D.
3. The period between implementing the pause and massive algorithmic advancements is long enough that we’re able to increase compute stock...
4. ...but short enough that we’re not able to make meaningful safety progress before algorithmic advancements make the pause ineffective (because, e.g., we regulated FLOPs and it now takes just 100x fewer FLOPs to build the dangerous thing).
I think the conjunctive probability of all these things is low, and I think their likelihood is sensitive to the terms of the pause agreement itself. I agree that the design of a pause should consider a broad range of possibilities, and try to maximize its own odds of attaining its ends (Keep Everyone Alive).
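As a rough illustration of why a conjunction of premises like the four listed above tends to come out low, here is a small sketch. The credences are placeholders invented for the example, not estimates anyone in this thread has given, and each one is meant as the probability of a premise conditional on the earlier premises holding.

```python
from math import prod


def conjunction_probability(conditional_probs):
    """Probability that all premises hold, by the chain rule.

    Each entry is the probability of one premise given that all
    earlier premises hold.
    """
    for p in conditional_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(conditional_probs)


# Hypothetical credences for premises 1-4 (placeholders, not real estimates).
premises = [0.6, 0.5, 0.4, 0.4]
print(f"joint probability: {conjunction_probability(premises):.3f}")
```

Even when no single premise is assigned less than 40%, the joint probability in this made-up case lands under 5%, which is the sense in which the whole argument has to carry more weight than any one of its steps.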
I’m also not sure how this goes better in the no-pause world? Unless this person also has really high odds on multipolar going well and expects some Savior AI trained and aligned in the same length of time as the effective window of the theoretical pause to intervene? But that’s a rare position among people who care about safety ~at all; it’s kind of a George Hotz take or something...
(I don’t think we disagree; you did flag this as “...somewhat relevant in worlds where...”, which is often code for “I really don’t expect this to happen, but Someone Somewhere should hold this possibility in mind.” Just want to make sure I’m actually following!)
I think 1 and 2 are actually pretty likely, but 3 and 4 are where I’m a lot less confident.
A big reason for this is that I suspect one of the reasons people aren’t reacting to AI progress is they assume it won’t take their job, so it will likely require massive job losses for humans to make a lot of people care about AI, and depending on how concentrated AI R&D is, there’s a real possibility that AI has fully automated AI R&D before massive job losses begin in a way that matters to regular people.
Cool! I think we’re in agreement at a high level. Thanks for taking the extra time to make sure you were understood.
In more detail, though:
I think I disagree with 1 being all that likely; there are just other things I could see happening that would make a pause or stop politically popular (e.g. warning shots, An Inconvenient Truth AI Edition, etc.), likely not worth getting into here. I also think ‘if we pause it will be for stupid reasons’ is a very sad take.
I think I disagree with 2 being likely, as well; probably yes, a lot of the bottleneck on development is ~make-work that goes away when you get a drop-in replacement for remote workers, and also yes, AI coding is already an accelerant // effectively doing gradient descent on gradient descent (RLing the RL’d researcher to RL the RL...) is intelligence-explosion fuel. But I think there’s a big gap between the capabilities you need for politically worrisome levels of unemployment, and the capabilities you need for an intelligence explosion, principally because >30 percent of human labor in developed nations could be automated with current tech if the economics align a bit (hiring 200+k/year ML engineers to replace your 30k/year call center employee is only just now starting to make sense economically). I think this has been true of current tech since ~GPT-4, and that we haven’t seen a concomitant massive acceleration in capabilities on the frontier (things are continuing to move fast, and the proliferation is scary, but it’s not an explosion).
I take “depending on how concentrated AI R&D is” to foreshadow that you’d reply to the above with something like: “This is about lab priorities; the labs with the most impressive models are the labs focusing the most on frontier model development, and they’re unlikely to set their sights on comprehensive automation of shit jobs when they can instead double-down on frontier models and put some RL in the RL to RL the RL that’s been RL’d by the...”
I think that’s right about lab priorities. However, I expect the automation wave to mostly come from middle-men, consultancies, what have you, who take all of the leftover ML researchers not eaten up by the labs and go around automating things away individually (yes, maybe the frontier moves too fast for this to be right, because the labs just end up with a drop-in remote worker ‘for free’ as long as they keep advancing down the tech tree, but I don’t quite think this is true, because human jobs are human-shaped, and buyers are going to want pretty rigorous role-specific guarantees from whoever’s selling this service, even if they’re basically unnecessary, and the one-size-fits-all solution is going to have fewer buyers than the thing marketed as ‘bespoke’).
In general, I don’t like collapsing the various checkpoints between here and superintelligence; there are all these intermediate states, and their exact features matter a lot, and we really don’t know what we’re going to get. ‘By the time we’ll have x, we’ll certainly have y’ is not a form of prediction that anyone has a particularly good track record making.
I think I disagree with 1 being all that likely; there are just other things I could see happening that would make a pause or stop politically popular (e.g. warning shots, An Inconvenient Truth AI Edition, etc.), likely not worth getting into here. I also think ‘if we pause it will be for stupid reasons’ is a very sad take.
I generally don’t think An Inconvenient Truth mattered that much for solving climate change compared to technological solutions like renewable energy, and it made the issue a little more partisan (though environmentalism/climate change was unusually partisan by then). I also think social movements aimed at AI have so far had less impact on AI safety than technical work (in a broad sense) at reducing doom, and I expect this trend to continue.
I think warning shots could scare the public, but I worry that the level of warning shots necessary to clear AI is in a fairly narrow band, and I also expect AI control to have a reasonable probability of containing human-level scheming models that do work, so I wouldn’t pick this at all.
I agree it’s a sad take that “if we pause it will be for stupid reasons”, but I also think this is the very likely attractor, if AI does become a subject that is salient in politics, because people hate nuance, and nuance matters way more than the average person wants to deal with on AI (For example, I think the second species argument critically misses important differences that make the human-AI relationship more friendly than the human-gorilla relationship, and that’s without the subject being politicized).
To address this:
But I think there’s a big gap between the capabilities you need for politically worrisome levels of unemployment, and the capabilities you need for an intelligence explosion, principally because >30 percent of human labor in developed nations could be automated with current tech if the economics align a bit (hiring 200+k/year ML engineers to replace your 30k/year call center employee is only just now starting to make sense economically). I think this has been true of current tech since ~GPT-4, and that we haven’t seen a concomitant massive acceleration in capabilities on the frontier (things are continuing to move fast, and the proliferation is scary, but it’s not an explosion).
I think the key crux is that I believe the unreliability of GPT-4 would doom any attempt to automate 30% of jobs; I think at most 0-1% of jobs could be automated. While in principle you could improve reliability without improving capabilities too much, I also don’t think the incentives yet favor this option.
In general, I don’t like collapsing the various checkpoints between here and superintelligence; there are all these intermediate states, and their exact features matter a lot, and we really don’t know what we’re going to get. ‘By the time we’ll have x, we’ll certainly have y’ is not a form of prediction that anyone has a particularly good track record making.
I agree with this sort of argument, and in general I am not a fan of collapsing checkpoints between today’s AI and God AIs, which is a big mistake I think MIRI made. But my main claim is that the checkpoints would be illegible enough to the average citizen that they don’t notice the progress until it’s too late, and that the reliability improvements will in practice also be coupled with capabilities improvements that matter to the AI explosion but are not very visible to the average citizen, for the reason Garrison Lovely describes here:
There’s a vibe that AI progress has stalled out in the last ~year, but I think it’s more accurate to say that progress has become increasingly illegible. Since 6/23, perf. on PhD level science questions went from barely better than random guessing to matching domain experts.
https://x.com/GarrisonLovely/status/1866945509975638493
I think I get what you’re saying… That the argument you dislike is, “we should rush to AGI sooner, so that there’s less compute overhang when we get there.”
I agree that that argument is a pretty bad one. I personally think that we are already so far into a compute overhang regime that that ship has sailed. We are using very inefficient learning algorithms, and will be able to run millions of inference instances of any model we produce.
Does this correspond with what you are thinking?
I want to say yes, but I think this might be somewhat more narrow than I mean. It might be helpful if you could list a few other ways one might read my message, that seem similarly-plausible to this one.
Overhangs, overhangs everywhere. A thousand gleaming threads stretching backwards from the fog of the Future, forwards from the static Past, and ending in a single Gordian knot before us here and now.
That knot: understanding, learning, being, thinking. The key, the source, the remaining barrier between us and the infinite, the unknowable, the singularity.
When will it break? What holds it steady? Each thread we examine seems so inadequate. Could this be what is holding us back, saving us from ourselves, from our Mind Children? Not this one, nor that, yet some strange mix of many compensating factors.
Surely, if we had more compute, we’d be there already? Or better data? The right algorithms? Faster hardware? Neuromorphic chips? Clever scaffolding? Training on a regress of chains of thought, to better solutions, to better chains of thought, to even better solutions?
All of these, and none of these. The web strains at the breaking point. How long now? Days? Months?
If we had enough ways to utilize inference-time compute, couldn’t we just scale that to super-genius, and ask the genius for a more efficient solution? But it doesn’t seem like that has been done. Has it been tried? Who can say.
Will the first AGI out the gate be so expensive it is unmaintainable for more than a few hours? Will it quickly find efficiency improvements?
Or will we again be bound, hung up on novel algorithmic insights hanging just out of sight? Who knows?
Surely though, surely... surely rushing ahead into the danger cannot be the wisest course, the safest course? Can we not agree to take our time, to think through the puzzles that confront us, to enumerate possible consequences and proactively reduce risks?
I hope. I fear. I stare in awestruck wonder at our brilliance and stupidity so tightly intermingled. We place the barrel of the gun to our collective head, panting, desperate, asking ourselves if this is it. Will it be? Intelligence is dead, long live intelligence.
This world?
Yes, this world.