I think this post is quite misleading and unnecessarily adversarial.
I’m not sure if I want to engage further; I might give examples of this later. (See examples below.) (COI: I often talk to and am friendly with many of the groups criticized in this post.)
Examples:
It seems to conflate scaling pauses (which aren’t clearly very useful) with pausing all AI-related progress (hardware, algorithmic development, software). Many people think that scaling pauses aren’t clearly that useful due to overhang issues, but hardware pauses are pretty great. However, hardware development and production pauses would clearly be extremely difficult to implement. IMO the sufficient “pause AI” ask is more like “ask nvidia/tsmc/etc. to mostly shut down” rather than “ask AGI labs to pause”.
More generally, determining the exact type of pause which would actually be better than (e.g.) well-implemented RSPs is a non-trivial technical problem, which makes this complex to communicate. I think this is a major reason why people don’t say stuff like “obviously, a full pause with XYZ characteristics would be better”. For instance, if I were running the US, I’d probably slow down scaling considerably, but I’d mostly be interested in implementing safety standards similar to RSPs due to the lack of strong international coordination.
The post says “many people believe” a “pause is necessary” claim[1], but the exact claim you state probably isn’t actually believed by the people you cite below without additional complications. Like, what exact counterfactuals are you comparing? For instance, I think that well-implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed). So, as an example, I don’t believe a scaling pause is necessary, and other interventions would probably reduce risk more (while also probably being politically easier). And I think a naive “AI scaling pause” doesn’t reduce risk that much, certainly less than a high-quality US regulatory agency which requires and reviews something like RSPs. When claiming “many people believe”, I think you should make a more precise claim that the people you name actually believe.
Calling something a “pragmatic middle ground” doesn’t imply that there aren’t better options (e.g., shut down the whole hardware industry).
For instance, I don’t think it’s “lying” when people advocate for partial reductions in nuclear arms without noting that it would be better to secure sufficient international coordination to guarantee world peace. Like world peace would be great, but idk if it’s necessary to talk about. (There is probably less common knowledge in the AI case, but I think this example mostly holds.)
The post says “When called out on this, most people we talk to just fumble.” I strongly predict that the people actually mentioned in the part above this (Open Phil, Paul, ARC Evals, etc.) don’t actually fumble and have a reasonable response. So I think this, at best, misleadingly conflates the responses of two different groups.
More generally, this post seems to claim people have views that I don’t actually think they have, and it assumes that the motives for various actions are power-seeking without any evidence for this.
The use of the term “lying” seems like a case of the “noncentral fallacy” to me. The post presupposes a communication/advocacy norm and states that violations of this norm should be labeled “lying”. I’m not sure I’m sold on this communication norm in the first place. (Edit: I think “say the ideal thing” shouldn’t be a norm (something where we punish people who violate it), but it does seem probably good in many cases to state the ideal policy.)
The exact text from the post is:
In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.
Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.
The title doesn’t seem supported by the content. The post doesn’t argue that people are being cowardly or aren’t being strategic (it does argue they are incorrect and seeking power in an immoral way, but this is different).
As an aside, this seems to be a general trend: I have seen people defend misleading headlines on news articles with suggestions that the title should be judged independent of the content. I disagree.
Well, the author of an article often doesn’t decide the title of the post. The editor does that.
So it can be the case that an author wrote a reasonable and nuanced piece, and then the editor added an outrageous click-bait headline.
Yes, but the author wasn’t forced at gunpoint, presumably, to work with that particular editor. So then the question can be reframed as: why did the author choose to work with an editor that seems untrustworthy?
Journalists at most news outlets do not choose which editor(s) they work with on a given story, except insofar as they choose to not quit their job. This does not feel like a fair basis on which to hold the journalist responsible for the headline chosen by their editor(s).
Why does it not feel like a fair basis?
Maybe if they were deceived into thinking the editor was genuine and trustworthy. But otherwise, if they knew they were working with someone untrustworthy, and they still chose to associate their names together publicly, then obviously it impacts their credibility.
Insofar as a reporter works for an outlet that habitually writes misleading headlines, that does undermine the credibility of the reporter, but that’s partly true because outlets that publish grossly misleading headlines tend to take other ethical shortcuts as well. But without that general trend or a broader assessment of an outlet’s credibility, it’s possible that an otherwise fair story would get a misleading headline through no fault of the reporter, and it would be incorrect to judge the reporter for that (as Eli says above).
For instance, if I were running the US, I’d probably slow down scaling considerably, but I’d mostly be interested in implementing safety standards similar to RSPs due to the lack of strong international coordination.
Surely if you were running the US, that would be a great position from which to try to get international coordination on policies you think are best for everyone?
Sure, but it seems reasonably likely that it would be hard to get that much international coordination.
Maybe—but you definitely can’t get it if you don’t even try to communicate the thing you think would be better.
[I agree with most of this, and think it’s a very useful comment; just pointing out disagreements]
For instance, I think that well-implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed).
I assume this would be a crux with Connor/Gabe (and I think I’m at least much less confident in this than you appear to be).
We’re already in a world where stopping appears necessary.
It’s entirely possible we all die before stopping was clearly necessary.
What gives you confidence that RSPs would actually trigger a pause?
If a lab is stopping for reasons that aren’t based on objective conditions in an RSP, then what did the RSP achieve?
Absent objective tests that everyone has signed up for, a lab may well not stop, since there’ll always be the argument “Well we think that the danger is somewhat high, but it doesn’t help if only we pause”.
It’s far from clear that we’ll get objective and sufficient conditions for safety (or even for low risk). I don’t expect us to—though it’d obviously be nice to be wrong.
[EDIT: or rather, ones that allow scaling to continue safely—we already know sufficient conditions for safety: stopping]
Calling something a “pragmatic middle ground” doesn’t imply that there aren’t better options
I think the objection here is more about what is loosely suggested by the language used, and what is not said—not about logical implications. What is loosely suggested by the ARC Evals language is that it’s not sensible to aim for the more “extreme” end of things (pausing), and that this isn’t worthy of argument.
Perhaps ARC Evals have a great argument, but they don’t make one. I think it’s fair to say that they argue the middle ground is practical. I don’t think it can be claimed that they argue it is pragmatic until they address both the viability of other options and the risks of the various courses. Doing a practical thing that would predictably lead to higher risk is not pragmatic.
It’s not clear what the right course is here, but making no substantive argument gives a completely incorrect impression. If they didn’t think it was the right place for such an argument, then it’d be easy to say so: that this is a complex question, that it’s unclear this course is best, and that RSPs vs Pause vs … deserves a lot more analysis.
The post presupposes a communication/advocacy norm and states that violations of this norm should be labeled “lying”. I’m not sure I’m sold on this communication norm in the first place.
I’d agree with that, but I do think that in this case it’d be useful for people/orgs to state both a [here’s what we’d like ideally] and a [here’s what we’re currently pushing for]. I can imagine many cases where this wouldn’t hold, but I don’t see the argument here. If there is an argument, I’d like to hear it! (Fine if it’s conditional on not being communicated further.)
Thanks for the response; one quick clarification in case this isn’t clear.
On:
For instance, I think that well-implemented RSPs required by a regulatory agency can reduce risk to <5% (partially by stopping in worlds where this appears needed).
I assume this would be a crux with Connor/Gabe (and I think I’m at least much less confident in this than you appear to be).
It’s worth noting here that I’m responding to this passage from the text:
In a saner world, all AGI progress should have already stopped. If we don’t, there’s more than a 10% chance we all die.
Many people in the AI safety community believe this, but they have not stated it publicly. Worse, they have stated different beliefs more saliently, which misdirect everyone else about what should be done, and what the AI safety community believes.
I’m responding to the “many people believe this”, which I think implies that the groups they are critiquing believe this. I want to contest what these people believe, not what is actually true.
Like, many of these people think policy interventions other than a pause reduce X-risk below 10%.
Maybe I think something like (numbers not well considered):
P(doom) = 35%
P(doom | scaling pause by executive order in 2024) = 25%
P(doom | good version of regulatory agency doing something like RSP and safety arguments passed into law in 2024) = 5% (depends a ton on details and political buy in!!!)
P(doom | full and strong international coordination around pausing all AI related progress for 10+ years which starts by pausing hardware progress and current manufacturing) = 3%
Note that these numbers take into account evidential updates (e.g., probably other good stuff is happening if we have super strong international coordination around pausing AI).
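To make the comparison concrete, here is a minimal sketch (Python, with made-up variable names) that just spells out the absolute and relative risk reductions implied by the rough, not-well-considered estimates above; nothing here goes beyond those illustrative numbers:

```python
# Illustrative only: these are the rough estimates stated above, not measured quantities.
baseline = 0.35  # P(doom) with no intervention

scenarios = {
    "scaling pause by executive order in 2024": 0.25,
    "regulatory agency requiring/reviewing RSP-like standards": 0.05,
    "strong international pause (incl. hardware) for 10+ years": 0.03,
}

for name, p_doom in scenarios.items():
    absolute = baseline - p_doom            # percentage-point reduction in P(doom)
    relative = absolute / baseline          # fraction of baseline risk removed
    print(f"{name}: -{absolute:.2f} absolute ({relative:.0%} relative reduction)")
```

On these numbers, the naive scaling pause buys roughly a third of the risk reduction that the regulatory-agency scenario does, which is exactly the comparison being contested.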
Ah okay—thanks. That’s clarifying.
Agreed that the post is at the very least not clear.
In particular, it’s obviously not true that [if we don’t stop today, there’s more than a 10% chance we all die], and I don’t think [if we never stop, under any circumstances...] is a case many people would be considering at all.
It’d make sense to be much clearer on the ‘this’ that “many people believe”.
(and I hope you’re correct on P(doom)!)
Yeah, I probably want to walk back my claim a bit. Maybe I want to say “doesn’t strongly imply”?
It would have been better if ARC Evals had noted that the conclusion isn’t entirely obvious. It doesn’t seem like a huge error to me, but maybe I’m underestimating the ripple effects etc.
As an aside, I think it’s good for people and organizations (especially AI labs) to clearly state their views on AI risk, see e.g., my comment here. So I agree with this aspect of the post.
Stating clear views on what ideal government/international policy would look like also seems good.
(And I agree with a bunch of other misc specific points in the post like “we can maybe push the Overton window far” and “avoiding saying true things to retain respectability in order to get more power is sketchy”.)
(Edit: from a communication best practices perspective, I wish I had noted where I agree in the parent comment rather than here.)