A complaint about AI pause: if we pause AI and then unpause, progress will then be really quick, because there’s a backlog of improvements in compute and algorithmic efficiency that can be immediately applied.
One definition of what an RSP is: if a lab makes observation O, then they pause scaling until they implement protection P.
Doesn’t this sort of RSP have the same problem with fast progress after pausing? Why have I never heard anyone make this complaint about RSPs? Possibilities:
They do and I just haven’t seen it
People expect “AI pause” to produce longer / more serious pauses than RSPs (but this seems incidental to the core structure of RSPs)
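As an aside, the conditional structure in that RSP definition can be sketched as a toy loop (purely illustrative; the observation and protection predicates are hypothetical and don't come from any actual RSP document):

```python
# Toy model of the RSP structure described above: scaling proceeds until
# a dangerous-capability observation O fires, then pauses until the
# matching protection P is implemented. All names are hypothetical.

def run_rsp(steps, observe, protection_ready, implement_protection):
    """observe(t) -> bool: did the lab make observation O at step t?
    protection_ready() -> bool: is protection P in place?
    implement_protection(): one unit of work toward P."""
    scale_steps = 0
    paused_steps = 0
    for t in range(steps):
        if observe(t) and not protection_ready():
            implement_protection()   # pause scaling; work on P instead
            paused_steps += 1
        else:
            scale_steps += 1         # continue scaling as normal
    return scale_steps, paused_steps

# Example: O fires at step 3; P takes 2 units of work to implement.
work = []
scale, paused = run_rsp(
    steps=10,
    observe=lambda t: t >= 3,
    protection_ready=lambda: len(work) >= 2,
    implement_protection=lambda: work.append(1),
)
print(scale, paused)  # 8 scaling steps, 2 paused steps
```

The point of the sketch is just that an RSP-triggered pause has the same shape as any other pause: scaling stops, other work continues, and scaling resumes once a condition is met.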
Basically I just agree with what James said. But I think the steelman is something like: you should expect shorter (or no) pauses with an RSP if all goes well, because the precautions are matched to the risks. Like, the labs aim to develop safety measures which keep pace with the dangers introduced by scaling, and if they succeed at that, then they never have to pause. But even if they fail, they’re also expecting that building frontier models will help them solve alignment faster. I.e., either way the overall pause time would probably be shorter?
It does seem like in order to not have this complaint about the RSP, though, you need to expect that it’s shorter by a lot (like by many months or years). My guess is that the labs do believe this, although not for amazing reasons. Like, the answer which feels most “real” to me is that this complaint doesn’t apply to RSPs because the labs aren’t actually planning to do a meaningful pause.
Good point!
Man, my model of what’s going on is:
The AI pause complaint is, basically, total self-serving BS that has not been called out enough
The implicit plan for RSPs is for them to never trigger in a business-relevant way
It is seen as a good thing (from the perspective of the labs) if they can lose less time to an RSP-triggered pause
...and these, taken together, should explain it.
The point that a capabilities overhang might cause rapid progress in a short period of time has been made by a number of people without any connections to AI labs, including me, which should reduce your credence that it’s “basically, total self-serving BS”.
More to the point of Daniel Filan’s original comment, I have criticized the Responsible Scaling Policy document in the past for failing to distinguish itself clearly from AI pause proposals. My guess is that your second and third points are likely mostly correct: AI labs think of an RSP as different from AI pause because it’s lighter-touch, more narrowly targeted, and the RSP-triggered pause could be lifted more quickly, potentially minimally disrupting business operations.
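The overhang dynamic under discussion can be made concrete with a toy model (the numbers are purely illustrative, not a claim about actual scaling rates): if algorithmic efficiency keeps compounding during a pause while deployed capability is frozen, then capability jumps discontinuously at unpause rather than resuming its old slope.

```python
# Toy overhang model, purely illustrative. Algorithmic efficiency grows
# every step regardless of policy; "deployed capability" is that
# efficiency applied to training runs, which a pause freezes.

def capability_trajectory(steps, pause_start, pause_end,
                          efficiency_growth=1.3):
    efficiency = 1.0
    deployed = []
    for t in range(steps):
        efficiency *= efficiency_growth              # research continues
        if pause_start <= t < pause_end:
            deployed.append(deployed[-1] if deployed else 1.0)  # frozen
        else:
            deployed.append(efficiency)  # accumulated backlog applied
    return deployed

traj = capability_trajectory(steps=8, pause_start=3, pause_end=6)
# During the pause (t = 3..5) deployed capability is flat; at t = 6 the
# whole accumulated efficiency backlog lands at once.
print(traj)
```

Under these assumptions the same jump appears whether the pause was unconditional or RSP-triggered; only its length differs, which is the symmetry being debated in this thread.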
I think it’s not an unreasonable point to take into account when talking price, but a lot of the time it serves as a BS talking point for people who don’t really care about the subtleties.
My guess is:
AI pause: there’s no observation O pointing at a specific safety issue to address, people keep working on capabilities anyway, so the pause may yield only capability improvements. (The assumption is that an AI pause means not releasing models.)
RSP: having observed O, shift more resources to mitigating O and fewer to capabilities; when protection P is done, release the model and shift back to capabilities. (Ideally.)
I’m not saying there’s no reason to think that RSPs are better or worse than pause, just that if overhang is a relevant consideration for pause, it’s also a relevant consideration for RSPs.
I’d imagine that RSP proponents think that if we execute them properly, we will simply not build dangerous models beyond our control, period. If progress after unpausing were faster than labs could handle, RSPs imply that you’d just pause again. By contrast, there’s no clear criterion for when we would pause again after, say, a six-month pause in scaling.
Now whether this would happen in practice is perhaps a different question.
I think pause proponents think similarly!
Realized that I didn’t respond to this—PauseAI’s proposal is for a pause until safety can be guaranteed, rather than just for 6 months.
Are they the same people advocating for RSPs and also using compute/algorithm overhang as a primary argument against a pause? My understanding of the main argument in favor of RSPs over an immediate pause is:
1. Sure, we could continue to make some progress on safety if we paused other AI progress.
2. But:
   a. we could make even more progress on safety if we could work with more advanced models; and
   b. right now we have the necessary safety measures to create the next generation of models with low risk.
If AI progress continues without corresponding progress on safety, then (2.b) will no longer hold, so we should indeed pause at that time, hence the RSP.
If you believe that (2.a) and (2.b) are both true, then you can argue that RSPs are better than an immediate pause without referring to compute/algorithm overhang. If you believe that one of (2.a) and (2.b) is false, but are skeptical of a pause because you believe compute/algorithm overhang would increase risk (or at least negate the benefit), then it seems you should also be skeptical of RSPs.
I’m not saying that RSPs are or aren’t better than a pause. But I would think that if overhang is a relevant consideration for pauses, it’s also a relevant consideration for RSPs.
I agree that if overhang is a relevant consideration for pauses, then it’s also a relevant consideration for RSPs. My previous question was: Do you see the same people invoking overhang as an argument against pauses and also talking about RSPs as though they are not also impacted?
Maybe you’re not saying that there are people taking that position, but rather that those who invoke overhang as an argument against pauses don’t seem to be equally vocal against RSPs (if not necessarily in favor of them either). I can think of a couple of separate reasons this could be the case:
1. To the extent I think a pause is bad (for example, because of overhang), I might still be more motivated to prioritize arguing against “unconditional pause” than “maybe pause in the future”, even if the argument could apply to both. This is especially true if I consider the prospect of an unconditional pause a legitimate, near-term threat.
2. If I think a pause introduces a high, additional risk, and I think the base level of risk is low, it seems clear that I should not introduce that high risk. But if I get new evidence that there is an immediate, even-higher risk, which a pause could help mitigate, I should be willing to roll the dice on the pause, which now comes with a net reduction in risk.
(2) isn’t a very reassuring position, but it does suggest that “immediate pause bad because overhang” and “RSPs good [in spite of overhang]” are logically compatible.
I guess I’m not tracking this closely enough. I’m not really that focused on any one arguer’s individual priorities, but more on the discourse in general. Basically, I think that overhang is a consideration for unconditional pauses if and only if it’s a consideration for RSPs, so it’s a bad thing if overhang is brought up as an argument against unconditional pauses and not against RSPs, because this distorts the world’s ability to figure out the costs and benefits of each kind of policy.
Also, to be clear, it’s not impossible that RSPs are all things considered better than unconditional pauses, and better than nothing, despite overhang. But if so, I’d hope someone somewhere would have written a piece saying “RSPs have the cost of causing overhang, but on net are worth it”.
As others have said, I believe AI pauses by governments would absolutely be more serious and longer, preventing overhangs from building up too much.
The big worry I do have with pause proposals in practice is that I expect most realistic pauses to buy us several years at most, not decades, because people will shift their incentives toward algorithmic progress, which isn’t very controllable by default. I also expect there to be at most 1 OOM of compute left to build AGI which scales to superintelligence by the time we pause. That makes a pause a very unstable policy: any algorithmic advance, like AI search actually working in complicated domains, would immediately blow up the pause, and there are likely strong incentives to break the pause once people realize what superintelligence means.
See here for one example:
https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d
Are you saying that overhangs wouldn’t build up too much under pauses because the government wouldn’t let it happen, or that RSPs would have less overhang because they’d pause for less long so less overhang would build up? I can’t quite tell.
That RSPs would have less overhang because they’d pause for less long so less overhang would build up.