My rough ranking of different ways superintelligence could be developed:
Least safe: Corporate Race. Superintelligence is developed in the context of a corporate race between OpenAI, Microsoft, Google, Anthropic, and Facebook.
Safer (but still quite dangerous): USG race with China. Superintelligence is developed in the context of a USG project or “USG + Western allies” project with highly secure weights. The coalition hopefully obtains a lead of 1-3 years that it tries to use to align superintelligence and achieve a decisive strategic advantage. This probably relies heavily on deep learning and means we do not have time to invest in alternative paradigms (“provably safe” systems, human intelligence enhancement, etc.).
Safest (but still not a guarantee of success): International coalition. Superintelligence is developed in the context of an international project with highly secure weights. The coalition still needs to develop superintelligence before rogue projects can, but the coalition hopes to obtain a lead of 10+ years that it can use to align a system that can prevent rogue AGI projects. This could buy us enough time to invest heavily in alternative paradigms.
My own thought is that we should advocate for option #3 (international coordination) unless/until there is enough evidence that it’s actually not feasible, and then we should settle for option #2. I’m not yet convinced by people who say we have to settle for option #2 just because, e.g., climate treaties have not gone well or international cooperation is generally difficult.
But I also think people advocating #3 should be aware that there are some worlds in which international cooperation will not be feasible, and we should be prepared to do #2 if it’s quite clear that the US and China are unwilling to cooperate on AGI development. (And again, I don’t think we have that evidence yet; I think there’s a lot of uncertainty here.)
I don’t think the risk ordering is obvious at all, especially not between #2 and #3, and especially not if you also take into account tractability concerns and risks separate from extinction (e.g. stable totalitarianism, s-risks). Even if you thought coordinating with China might be worth it, I think it should be at least somewhat obvious why the US government (and its allies) might be very uncomfortable building a coalition with, say, North Korea or Russia. Even between #1 and #2, the probable increase in risks of centralization might make it not worth it, at least in some worlds, depending on how optimistic one is about e.g. alignment or the offense-defense balance from misuse of models with dangerous capabilities.
I also don’t think it’s obvious that alternative paradigms would necessarily be both safer and tractable enough, even on 10-year timelines, especially if you don’t use AI automation (using the current paradigm, probably) to push those forward.
Can you say more about why the risk of centralization differs meaningfully between the three worlds?
IMO if you assume that (a) an intelligence explosion occurs at some point, (b) the leading actor uses the intelligence explosion to produce a superintelligence that provides a decisive strategic advantage, and (c) the superintelligence is aligned/controlled...
Then (in the absence of coordination) you very likely end up with centralization no matter what. It’s just a matter of whether it’s OpenAI/Microsoft (scenario #1), the USG and its allies (scenario #2), or a broader international coalition weighted heavily toward the USG and China (scenario #3) that ends up wielding the superintelligence.
(If anything, the “international coalition” approach seems less likely to lead to centralization than the other two, since you’re more likely to get post-AGI coordination.)
especially if you don’t use AI automation (using the current paradigm, probably) to push those forward.
In my vision, the national or international project would still be investing in “superalignment”-style approaches; it would just (hopefully) have enough time/resources to invest in other approaches as well.
I typically assume we don’t get “infinite time”, i.e., even the international coalition is racing against “the clock” (e.g., the amount of time it takes for a rogue actor to develop ASI in a way that can’t be prevented, or the amount of time we have until a separate existential catastrophe occurs). So I think it would be unwise for the international coalition to completely abandon DL/superalignment, even if one of the big hopes is that a safer paradigm would be discovered in time.
IMO if you assume that (a) an intelligence explosion occurs at some point, (b) the leading actor uses the intelligence explosion to produce a superintelligence that provides a decisive strategic advantage, and (c) the superintelligence is aligned/controlled...
I don’t think this is obvious; stably multipolar worlds seem at least plausible to me.
See also here and here.
@Bogdan, can you spell out a vision for a stably multipolar world with the above assumptions satisfied?
IMO assumption (b) is doing a lot of the work: you might argue that the intelligence explosion will not give anyone a DSA, in which case things get more complicated. I do see some plausible stories in which this could happen, but they seem pretty unlikely.
@Ryan, thanks for linking to those. Lmk if there are particular points you think are most relevant (meta: I think in general I find discourse more productive when it’s like “hey here’s a claim, also read more here” as opposed to links. Ofc that puts more communication burden on you though, so feel free to just take the links approach.)
(Yeah, I was just literally linking to things people might find relevant to read without making any particular claim. I think this is often slightly helpful, so I do it. Edit: when I do this, I should probably include a disclaimer like “Linking for relevance, not making any specific claim”.)
Yup, I was thinking about worlds in which there is no obvious DSA, or where the parties involved are risk-averse enough (perhaps, e.g., for reasons like those in this talk).
My expectation is that DSI can (and will) be achieved before ASI. In fact, I expect ASI to be about as useful as a bomb whose minimum effect size, if deployed, is destroying the entire solar system; in other words, useful only for Mutually Assured Destruction.
DSI only requires a nuclear-armed state actor to have an effective global missile defense system. Whichever nuclear-armed state actor gets that, without any other group having it, can effectively demand the surrender and disarmament of all other nations, including confiscating their compute resources.
Do you think missile defense is so difficult that only ASI can manage it? I don’t. That seems like a technical discussion which would need more details to hash out. I’m pretty sure an explicitly designed tool AI and a large drone and satellite fleet could accomplish that.
Competition is fractal. There are multiple hierarchies (countries/departments/agencies/etc, corporations/divisions/teams/etc), with individual humans acting on their own behalf. Often, individuals have influence and goals in multiple hierarchies.
Your 1/2/3 delineation is not the important part. It’s going to be all 3, with chaotic shifts as public perception, funding, and regulation shift around.
Agree—I think people need to be prepared for “try-or-die” scenarios.
One unfun one I’ll toss into the list: “Company A is 12 months from building Cthulhu, and governments truly do not care and there is extremely strong reason to believe that will not change in the next year. All our policy efforts have failed, our existing technical methods are useless, and the end of the world has come. Everyone report for duty at Company B, we’re going to try to roll the hard six.”
If Company A is 12 months from building Cthulhu, we fucked up upstream. Also, I don’t understand why you’d want to play the AI arms race—you have better options. They expect an AI arms race. Use other tactics. Get into their OODA loop.
Unsee the frontier lab.
...yes? I think my scenario explicitly assumes that we’ve fucked up upstream in many, many ways.
Oh, by that I meant something like “yeah I really think it is not a good idea to focus on an AI arms race”. See also Slack matters more than any other outcome.
You are probably already familiar with this, but re option 3, the Multilateral AGI Consortium (MAGIC) proposal is, I assume, along the lines of what you are thinking.
Indeed, Akash is familiar: https://arxiv.org/abs/2310.20563 :)
(I think the paper he co-authored was a later one than the one you cite.)