AI and the Probability of Conflict
Cross-posted from The Metasophist. Necessarily speculative.
It is commonly argued that artificial general intelligence (AGI), unaligned with human values, represents an existential risk for humanity. For example, in his recent book The Precipice, philosopher Toby Ord argued that there is a 10 percent chance that unaligned artificial general intelligence will result in existential catastrophe for humanity in the coming century.
Less attention is devoted to whether or how aligned AGI could introduce additional existential risk, such as by increasing the probability of conflict. For example, how would one country react to the news that another country is on the verge of developing an AGI fully under its control, a potential geopolitical game changer similar to the advent of nuclear weapons?
Aligned AGI could have important geopolitical implications by altering both the nature and probability of conflict. But how exactly? This is difficult to answer given that the form of AGI is still hugely uncertain. Nevertheless, the different scenarios explored below indicate that aligned AGI would not greatly increase the probability of conflict.
Let’s explore these scenarios in detail.
Setting the scene
Consider two countries, the Hegemon and the Challenger. Furthermore, assume that both countries are trying to develop AGI. How would they strategise vis-à-vis one another?
First, we need to make some assumptions about what the goals of these two states are. Let’s assume that they seek to maximise their relative power, in line with the theory of offensive realism articulated by John Mearsheimer. In other words, they care more about security than prosperity, and therefore try to ensure their independence from other powers.
Empirically speaking, this is not always true: US and European policy towards China in the 90s and 00s is but one exception to this. But states do seem to converge over time to a realist approach, while divergences from it seem to be temporary and idiosyncratic.
Now, we want to understand how both states might behave just before and after AGI is developed.
The capabilities of AGI are important to note at this point. For this, it is instructive to take a quote from Ord. After discussing how an AGI could be even more persuasive than historical demagogues in convincing others to do its will, he says:
First, the AGI system could gain access to the internet and hide thousands of backup copies, scattered among insecure computer systems around the world, ready to wake up and continue the job if the original is removed. Even by this point, the AGI would be practically impossible to destroy: consider the political obstacles to erasing all hard drives in the world where it may have backups. It could then take over millions of unsecured systems on the internet, forming a large “botnet.” This would be a vast scaling-up of computational resources and provide a platform for escalating power. From there, it could gain financial resources (hacking the bank accounts on those computers) and human resources (using blackmail or propaganda against susceptible people or just paying them with its stolen money). It would then be as powerful as a well-resourced criminal underworld, but much harder to eliminate. None of these steps involve anything mysterious—hackers and criminals with human-level intelligence have already done all of these things using just the internet. (p. 146-147)
The security implications of this kind of AGI are already serious, but Ord then goes on to discuss how such an AGI could further scale up its intelligence. Presumably it could even compromise relatively secure military systems.
At this point it might be tempting to compare the creation of AGI to the development of nuclear weapons, as one state developing it first could render other countries defenceless. But there are important differences between AGI and nuclear weapons.
First, AGI could be quite surgical in the systems it targets and compromises. Whereas a leader may be reluctant to deploy a nuclear weapon due to the immense collateral damage it would wreak, the same reluctance may not be felt when deploying an AGI.
Second, an AGI is far easier to employ discreetly. It is impossible to miss the explosion of a nuclear bomb in any populated area. But how can you know whether there is an AGI lurking in your radar detection or military communications system? The compromise may take place well before it is exploited, making it harder to know how secure your defences actually are.
How aligned AGI could alter the probability of conflict
Having set the scene, we should now ask ourselves the broadest question: would such technology make war more likely?
In an article discussing the causes of conflict, Jackson and Morelli note that the prerequisites for war between rational actors are that 1) the cost is not overwhelmingly high and 2) there is no ability to reach a mutually beneficial and enforceable agreement.
AGI would cause both 1) and 2) to change in a way that makes war more likely. The cost of conflict could fall for two reasons. First, an AGI would be costless to replicate, like any program. Second, the fact that a war could start in cyberspace may also lead to the belief that it could be contained there, meaning that there would be fewer human casualties.
But it seems that part 2) changes more radically. If AGI is sufficiently advanced, it may be undetectable, making any non-aggression pact difficult to enforce. Moreover, the fact that AGI as software is essentially costless to replicate would allow the aggressor to choose unexpected and obscure targets: it may make sense to disrupt a competitor by being a nuisance in countless small ways, something which would not have been feasible when intelligent manpower was quite scarce.
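To make conditions 1) and 2) concrete, here is a minimal sketch of the standard rationalist bargaining model that underlies conditions of this kind; the symbols p, c_A and c_B are illustrative placeholders rather than quantities taken from Jackson and Morelli.

    % Two states dispute a prize normalised to 1. If they fight, state A
    % wins with probability p, and the two sides pay costs c_A, c_B > 0.
    \begin{align*}
      \text{Expected war payoffs:}\quad & u_A = p - c_A, \qquad u_B = (1 - p) - c_B \\
      \text{Peaceful splits } (x,\, 1 - x) \text{ both prefer to war:}\quad & p - c_A \;\le\; x \;\le\; p + c_B
    \end{align*}
    % The bargaining range has width c_A + c_B, so it exists whenever war is
    % costly; war between rational actors requires either costs small enough
    % that the range is negligible (condition 1) or an informational or
    % enforcement failure that stops the sides settling inside it (condition 2).

In these terms, the arguments above are that AGI shrinks the costs c_A and c_B, while its undetectability makes any agreed split harder to verify and enforce.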
There is therefore good reason to think that AGI would increase the probability of war. But this brief analysis is flawed, as it does not take into account the possibility that AGI could self-improve at a rate that would render any rivals uncompetitive, a scenario known as fast take-off, under which the probability of war would probably fall.
AGI: Fast take-off
If AGI is marked by fast take-off, this implies that one country would gain a decisive and permanent lead.
If the Hegemon attains this lead, there is probably little that the Challenger could do to stop it. As the weaker party, the Challenger could not feasibly launch a pre-emptive strike. It could try to insulate itself from external influence, but it would be racing against time if the Hegemon became determined to assert its power. At a minimum, the Challenger would need some defensive systems that are completely free from digital interference. But even then, AGI could boost the capabilities of conventional kinetic weapons to such a degree that cyber-insulation would be futile.
A slightly more interesting scenario is if the Challenger is on the verge of developing AGI, and is closer to it than the Hegemon. Would the Hegemon launch a pre-emptive strike to maintain its superior position? This is quite similar to the dilemma the US faced as the Soviets developed their nuclear capability. At the time, some, such as John von Neumann and Bertrand Russell, favoured a pre-emptive strike. Given the extreme costs involved, US leadership shied away from following it through.
The calculus for the Hegemon would be less favourable in the context of AGI because it does not possess a decisive strategic advantage: other countries have nuclear capabilities, and the Hegemon is therefore vulnerable to nuclear retaliation if it tries to prevent the Challenger from developing AGI. Even if the Hegemon tried to push for an agreement to prevent the development of weaponised AGI, this would be unenforceable. Such agreements can partially work for nuclear weapons because testing a missile is detectable. The same would not be true for developing or testing an AGI.
At this point, it is worth noting that a fast take-off could introduce a large degree of asymmetric information: one side would know much more about its capabilities than the other. In this context, miscalculations would be likely. Given the stakes involved, it would also not be surprising if much of this work were done in secret, in order to prevent any pre-emptive moves from the other side.
But fast take-off is simply one scenario. How would the calculus of the two states change in the context of slow take-off?
AGI: Slow take-off
Under this scenario, there would be many AGIs of broadly similar ability, with some better than others along particular dimensions. Let’s assume that the two states have roughly similar levels of AGI technology.
A tenuous technological lead would decrease the incentive to launch a pre-emptive strike. The optimal strategy in this context is therefore far less clear, if indeed there would be a dominant strategy at all.
One possibility is that the overall outcome will depend upon whether offensive or defensive capabilities are greater. Bruce Schneier, noting that the status quo in cyberattacks favours the attacker, believes that improvements in AI will boost the defence. The idea here is that cyberdefence is currently overly reliant on humans, whereas a computer can scan for vulnerabilities and launch attacks at speed and at scale; AI would allow defenders to operate at that same speed and scale. Growing defensive capabilities would therefore decrease the probability of an outbreak of conflict.
However, in an article for War on the Rocks, Ben Garfinkel and Allan Dafoe noted that historical attempts to guess this balance have been wrong: before WW1, it was assumed that the offence had the advantage, but the opposite turned out to be true.
While these authors seem to be referring to AI rather than AGI, there are other reasons to think that a slow take-off scenario would decrease the probability of conflict.
First, if future forms of AGI differ widely, then both sides may want to prevent their knowledge falling into the hands of the other side, something presumably more likely to happen if AGI is deployed. That a qualitative race is less likely to lead to war is also an idea supported by Huntington, who theorised in a 1958 article that quantitative races are more likely to lead to war than qualitative ones. This is because in a qualitative race, an innovative breakthrough could jumble the ranking, whereas a quantitative race could allow a state to build a definitive and long-lasting lead. Moreover, quantitative races are more expensive, leading to greater efforts to radicalise the population against the enemy, emotions which could later force conflict.
As progress in AI is likely to be mainly qualitative, its growing importance in security matters would seem to reduce the probability of war. However, even a quantitative race could imply a lower probability of war, as Garfinkel and Dafoe stated in the article cited above. They use the example of drone swarms and cyberattacks to illustrate that while initial deployments of such technology would advantage an attacker, a continued build-up on both sides would eventually favour the defending side as they would be able to identify and plug any gaps in their defence.
At this point, it is important to note that we should not be too confident in these models. Given the potentially surgical and under-the-radar nature of AGI, the nature of conflict could change in a way that renders past models obsolete. For example, if we arrive at a world where AGI is undetectable and there is little differentiation between AGI forms, the emphasis may be on exploiting any vulnerability as soon as it arises. This could give rise to an attritional, under-the-radar conflict in which the objective is to do damage just below the threshold of detection, implying numerous small incidents rather than a few large strikes. The aim would be to hinder the productive potential of the other side, making it less threatening as a competitor in the international sphere.
If AGI requires a great deal of computational power, then this scenario could obviously result in a huge drain on resources. Perhaps the main check on how intense such a war could become would be the strain on resources it would cause.
How would the world escape from such an equilibrium?
Walled gardens and customised infrastructures could become more common. In such a world, the same piece of infrastructure would be less likely to be reused in many places, because a breakthrough in one place would otherwise enable a breakthrough everywhere it is deployed.
But for this to be feasible in geopolitical terms, there would need to be a broad equality in AGI technology across the different blocs.
Conclusion
We can draw a number of tentative conclusions from the above.
In a scenario with fast take-off, it is likely that a single state (or part thereof) would gain the capacity to become a hegemon, and it could be impossible for other states to catch up. While this would probably eliminate the possibility of a war (through either immense power of persuasion or outright coercion), states or groups that have acquired untrammelled power have rarely been benign. And history would never have seen an example so stark as this. The flaws of human nature thus leave much to be pessimistic about in this scenario.
In the event of slow take-off, different states or coalitions thereof will remain competitive. But if knowledge of the AGI capabilities of other states is opaque, and if AGI intrusions remain undetectable, then this could make war more likely due to higher amounts of asymmetric information and the greater difficulty of enforcing any agreement.
However, some characteristics of AGI could reduce the chance of war breaking out. First, AGI could eventually boost defensive capabilities more than offensive capabilities, making war (or at least cyberwar) a less attractive option for the state considering it. Second, the fact that it is more likely to trigger a qualitative race would make definitive leads less likely and incur less of a strain on the states concerned, making rallying the population less necessary.
Finally, while we have discussed the chance of conflict breaking out, the form any such conflict could take is still unknown. In addition, its cost would depend greatly on whether cyberwar could be contained in cyberspace.
This all assumes that AGI does whatever its supposed operator wants it to do, and that other parties believe as much? I think the first part of this is very false, though the second part alas seems very realistic, so I think this misses the key thing that makes an AGI arms race lethal.
I expect that a dignified apocalypse looks like, “We could do limited things with this software and hope to not destroy the world, but as we ramp up the power and iterate the for-loops more times, the probability of destroying the world goes up along a logistic curve.” In “relatively optimistic” scenarios it will be obvious to operators and programmers that this curve is being ascended—that is, running the for-loops with higher bounds will produce an AGI with visibly greater social sophistication, increasing big-picture knowledge, visible crude attempts at subverting operators or escaping or replicating outside boxes, etc. We can then imagine the higher-ups demanding that crude patches be applied to get rid of the visible problems in order to ramp up the for-loops further, worrying that, if they don’t do this themselves, the Chinese will do that first with their stolen copy of the code. Somebody estimates a risk probability, somebody else tells them too bad, they need to take 5% more risk in order to keep up with the arms race. This resembles a nuclear arms race and deployment scenario where, even though there’s common knowledge that nuclear winter is a thing, you still end up with nuclear winter because people are instructed to incrementally deploy another 50 nuclear warheads at the cost of a 5% increase in triggering nuclear winter, and then the other side does the same. But this is at least a relatively more dignified death by poor Nash equilibrium, where people are taking everything as seriously as they took nuclear war back in the days when Presidents weren’t retired movie actors.
In less optimistic scenarios that realistically reflect the actual levels of understanding being displayed by programmers and managers in the most powerful organizations today, the programmers themselves just patch away the visible signs of impending doom and keep going, thinking that they have “debugged the software” rather than eliminated visible warning signs, being in denial for internal political reasons about how this is climbing a logistic probability curve towards ruin or how fast that curve is being climbed, not really having a lot of mental fun thinking about the doom they’re heading into and warding that off by saying, “But if we slow down, our competitors will catch up, and we don’t trust them to play nice” along of course with “Well, if Yudkowsky was right, we’re all dead anyways, so we may as well assume he was wrong”, and generally skipping straight to the fun part of running the AGI’s for-loops with as much computing power as is available to do the neatest possible things; and so we die in a less dignified fashion.
My point is that what you depict as multiple organizations worried about what other organizations will successfully do with an AGI being operated at maximum power, which is believed to do whatever its operator wants to do, reflects a scenario where everybody dies really fast, because they all share a mistaken optimistic belief about what happens when you operate AGIs at increasing capability. The real lethality of the arms race is that blowing past hopefully-visible warning signs or patching them out, and running your AGI at increasing power, creates an increasing risk of the whole world ending immediately. Your scenario is one where people don’t understand that and think that AGIs do whatever the operators want, so it’s a scenario where the outcome of the multipolar tensions is instant death as soon as the computing resources are sufficient for lethality.
Thanks for your comment.
If someone wants to estimate the overall existential risk attached to AGI, then it seems fitting that they would estimate the existential risk attached to the scenarios where we have 1) only unaligned AGI, 2) only aligned AGI, or 3) both. The scenario you portray is a subset of 1). I find it plausible. But most relevant discussion on this forum is devoted to 1) so I wanted to think about 2). If some non-zero probability is attached to 2), that should be a useful exercise.
I thought it was clear I was referring to Aligned AGI in the intro and the section heading. And of course, exploring a scenario doesn’t mean I think it is the only scenario that could materialise.
My point is that plausible scenarios for Aligned AGI give you AGI that remains aligned only when run within power bounds, and this seems to me like one of the largest facts affecting the outcome of arms-race dynamics.
Thanks for the clarification. If that’s the plausible scenario for Aligned AGI, then I was drawing a sharper line between Aligned and Unaligned than was warranted. I will edit some part of the text on my website to reflect that.
Ok, so let’s assume that the alignment work has been done and solved. (Big assumption.) I don’t really see this as a game of countries, more a game of teams.
The natural size of the teams is the set of people who have fairly detailed technical knowledge about the AI, and are working together. I suspect that non-technical and unwanted bureaucrats that push their noses into an AI project will get much lip service and little representation in the core utility function.
You would have, say, an OpenAI team. In the early stages of covid, a virus was something fairly easy for politicians to understand, and all the virologists had an incentive to shout “look at this”. AGI is harder to understand, and the people at OpenAI have good reason not to draw too much government attention if they expect the government to be nasty or coercive.
The people at OpenAI and DeepMind are not enemies that want to defeat each other at all costs; some will be personal friends. Most will be after some sort of broadly utopian, AI-helps-humanity future. Most are decent people. I predict neither side will want to bomb the other, even if they have the capability. There may be friendly rivalry or outright cooperation.
Thanks for your comment. This is something I should have stated a bit more explicitly.
When I mentioned “single state (or part thereof)”, the part thereof was referring to these groups or groups in other countries that are yet to be formed.
I think the chance of government intervention is quite high in the slow take-off scenario. It’s quite likely that any group successfully working on AGI will slowly but noticeably start to accumulate a lot of resources. If that cannot be concealed, it will start to attract a lot of attention. I think it is unlikely that the government and state bureaucracy would be content to let such resources accumulate untouched e.g. the current shifting attitude to Big Tech in Brussels and Washington.
In a fast take-off scenario, I think we can frame things more provocatively: the group that develops AGI either becomes the government, or the government takes control while it still can. I’m not sure what the relative probabilities are here, but in both circumstances you end up with something that will act like a state, and be treated as a state by other states, which is why I model them like a state in my analysis. For example, even if OpenAI and DeepMind are friendly to each other, and that persists over decades, I can easily imagine the Chinese state trying to develop an alternative that might not be friendly to those two groups, especially if the Chinese government perceive them as promoting a different model of government.
Is this an April Fools joke? This article claims that Leo Szilard supported a nuclear first-strike on the Soviet Union. Nothing could be further from the truth. I would know. Gene Dannen
I was surprised by that as well, but I took the claim from an article by Jules Lobel, Professor of Law at the University of Pittsburgh Law School, based on a book he wrote:
For that claim he in turn cites Marc Trachtenberg’s History and Strategy, which I do not have access to.
You can read that page of Trachtenberg’s book in Google Books, as I just have, by googling Marc Trachtenberg Szilard. Trachtenberg misunderstood what Szilard wrote in the references he cited. I just reviewed those also.
Leo Szilard has been my research focus for decades.
It does seem that the Trachtenberg reference basically relies upon individual recollections (which I don’t trust), and the following extract from a 1944 letter by Szilard to Vannevar Bush (my bold):
While one could make the argument there that he is advocating a pre-emptive strike, it is sufficiently ambiguous (controlling by force could also mean conventional forces, and “used” could imply a demonstration rather than a deployment on a city) that I would prefer to delete the reference to Szilard in this article. Also because I’ve seen many more instances where this view was attributed to Russell and von Neumann, whereas this is the only case where it has been attributed to Szilard.