How well will policy-makers handle AGI? (initial findings)
Cross-posted from MIRI’s blog.
MIRI’s mission is “to ensure that the creation of smarter-than-human intelligence has a positive impact.” One policy-relevant question is: How well should we expect policy makers to handle the invention of AGI, and what does this imply about how much effort to put into AGI risk mitigation vs. other concerns?
To investigate these questions, we asked Jonah Sinick to examine how well policy-makers handled past events analogous in some ways to the future invention of AGI, and summarize his findings. We pre-committed to publishing our entire email exchange on the topic (with minor editing), just as with our project on how well we can plan for future decades. The post below is a summary of findings from our full email exchange (.docx) so far.
As with our investigation of how well we can plan for future decades, we decided to publish our initial findings after investigating only a few historical cases. This allows us to gain feedback on the value of the project, as well as suggestions for improvement, before continuing. It also means that we aren’t yet able to draw any confident conclusions about our core questions.
The most significant results from this project so far are:
We came up with a preliminary list of 6 seemingly-important ways in which a historical case could be analogous to the future invention of AGI, and evaluated several historical cases on these criteria.
Climate change risk seems sufficiently disanalogous to AI risk that studying climate change mitigation efforts probably gives limited insight into how well policy-makers will deal with AGI risk: the expected damage of climate change appears to be very small relative to the expected damage due to AI risk, especially when one looks at expected damage to policy-makers.
The 2008 financial crisis appears, after a shallow investigation, to be sufficiently analogous to AGI risk that it should give us some small reason to be concerned that policy-makers will not manage the invention of AGI wisely.
The risks to critical infrastructure from geomagnetic storms are far too small to be in the same reference class with risks from AGI.
The eradication of smallpox is only somewhat analogous to the invention of AGI.
Jonah performed very shallow investigations of how policy-makers have handled risks from cyberwarfare, chlorofluorocarbons, and the Cuban missile crisis, but these cases need more study before even “initial thoughts” can be given.
We identified additional historical cases that could be investigated in the future.
Further details are given below. For sources and more, please see our full email exchange (.docx).
6 ways a historical case can be analogous to the invention of AGI
In conversation, Jonah and I identified six features of the future invention of AGI that, if largely shared by a historical case, seem likely to allow the historical case to shed light on how well policy-makers will deal with the invention of AGI:
AGI may become a major threat in a somewhat unpredictable time.
AGI may become a threat when the world has very limited experience with it.
A good outcome with AGI may require solving a difficult global coordination problem.
Preparing for the AGI threat adequately may require lots of careful work in advance.
Policy-makers have strong personal incentives to solve the AGI problem.
A bad outcome with AGI would be a global disaster, and a good outcome with AGI would have global humanitarian benefit.
More details on these criteria and their use are given in the second email of our full email exchange.
Risks from climate change
People began to see climate change as a potential problem in the early 1970s, but there was some ambiguity as to whether human activity was causing warming (because of carbon emissions) or cooling (because of smog particles). The first IPCC report was issued in 1990, and stated that there was substantial anthropogenic global warming due to greenhouse gases. By 2001, there was a strong scientific consensus behind this claim. While policy-makers’ response to risks from climate change might seem likely to shed light on whether policy-makers will deal wisely with AGI, there are some important disanalogies:
The harms of global warming are expected to fall disproportionately on disadvantaged people in poor countries, not on policy-makers. So policy-makers have much less personal incentive to solve the problem than is the case with AGI.
In the median case, humanitarian losses from global warming seem to be about 20% of GDP per year for the poorest people. In light of anticipated economic development and diminishing marginal utility, this is a much smaller negative humanitarian impact than AGI risk (even ignoring future generations). For example, economist Indur Goklany estimated that “through 2085, only 13% of [deaths] from hunger, malaria, and extreme weather events (including coastal flooding from sea level rise) should be from [global] warming.”
Thus, potential analogies to AGI risk come from climate change’s tail risk. But there seem to be few credentialed scientists who have views compatible with a prediction that even a temperature increase in the 95th percentile of the probability distribution (by 2100) would do more than just begin to render some regions of Earth uninhabitable.
According to the 5th IPCC, the risk of human extinction from climate change seems very low: “Some thresholds that all would consider dangerous have no support in the literature as having a non-negligible chance of occurring. For instance, a ‘runaway greenhouse effect’—analogous to Venus—appears to have virtually no chance of being induced by anthropogenic activities.”
The 2008 financial crisis
Jonah did only a shallow investigation of the 2008 financial crisis, but the preliminary findings are interesting enough for us to describe them in some detail. Jonah’s impressions about the relevance of the 2008 financial crisis to the AGI situation are based on a reading of After the Music Stopped by Alan Blinder, who was the vice chairman of the Federal Reserve for 1.5 years during the Clinton administration. Naturally, many additional sources should be consulted before drawing firm conclusions about the relevance of policy-makers’ handling of the financial crisis to their likelihood of handling AGI wisely.
Blinder’s seven main factors leading to the recession are (p. 27):
Inflated asset prices, especially of houses (the housing bubble) but also of certain securities (the bond bubble);
Excessive leverage (heavy borrowing) throughout the financial system and the economy;
Lax financial regulation, both in terms of what the law left unregulated and how poorly the various regulators performed their duties;
Disgraceful banking practices in subprime and other mortgage lending;
The crazy-quilt of unregulated securities and derivatives that were built on these bad mortgages;
The abysmal performance of the statistical rating agencies, which helped the crazy-quilt get stitched together; and
The perverse compensation systems in many financial institutions that created powerful incentives to go for broke.
With these factors in mind, let’s look at the strength of the analogy between the 2008 financial crisis and the future invention of AGI:
Almost tautologically, a financial crisis is unexpected, though we do know that financial crises happen with some regularity.
The 2008 financial crisis was not unprecedented in kind, only in degree (in some ways).
Avoiding the 2008 financial crisis would have required solving a difficult national coordination problem, rather than a global coordination problem. Still, this analogy seems fairly strong. As Jonah writes, “While the 2008 financial crisis seems to have been largely US specific (while having broader ramifications), there’s a sense in which preventing it would have required solving a difficult coordination problem. The causes of the crisis are diffuse, and responsibility falls on many distinct classes of actors.”
Jonah’s analysis wasn’t deep enough to discern whether the 2008 financial crisis is analogous to the future invention of AGI with regard to how much careful work would have been required in advance to avert the risk.
In contrast with AI risk, the financial crisis wasn’t a life or death matter for almost any of the actors involved. Many people in finance didn’t have incentives to avert the financial crisis: indeed, some of the key figures involved were rewarded with large bonuses. But it’s plausible that government decision makers had incentive to avert a financial crisis for reputational reasons, and many interest groups are adversely affected by financial crises.
Once again, the scale of the financial crisis wasn’t on a par with AI risk, but it was closer to that scale than the other risks Jonah looked at in this initial investigation.
Jonah concluded that “the conglomerate of poor decisions [leading up to] the 2008 financial crisis constitute a small but significant challenge to the view that [policy-makers] will successfully address AI risk.” His reasons were:
The magnitude of the financial crisis is nontrivial (even if small) compared with the magnitude of the AI risk problem (not counting future generations).
The financial crisis adversely affected a very broad range of people, apparently including a large fraction of those people in positions of power (this seems truer here than in the case of climate change). A recession is bad for most businesses and for most workers. Yet these actors weren’t able to recognize the problem, coordinate, and prevent it.
The reasons that policy-makers weren’t able to recognize the problem, coordinate, and prevent it seem related to reasons why people might not recognize AI risk as a problem, coordinate, and prevent it. First, several key actors involved seem to have exhibited conspicuous overconfidence and neglect of tail risk (e.g. Summers, etc. ignoring Brooksley Born’s warnings about excessive leverage). If true, this shows that people in positions of power are notably susceptible to overconfidence and neglect of tail risk. Avoiding overconfidence and giving sufficient weight to tail risk may be crucial in mitigating AI risk. Second, one gets a sense that bystander effect and tragedy of the commons played a large role in the case of the financial crisis. There are risks that weren’t adequately addressed because doing so didn’t fall under the purview of any of the existing government agencies. This may have corresponded to a mentality of the type “that’s not my job — somebody else can take care of it.” If people think that AI risk is large, then they might think “if nobody’s going to take care of it then I will, because otherwise I’m going to die.” But if people think that AI risk is small, they might think “This probably won’t be really bad for me, and even though someone should take care of it, it’s not going to be me.”
Risks from geomagnetic storms
Large geomagnetic storms like the 1859 Carrington Event are infrequent, but could cause serious damage to satellites and critical infrastructure. See this OECD report for an overview.
Jonah’s investigation revealed a wide range in expected losses from geomagnetic storms, from $30 million per year to $30 billion per year. But even the larger figure amounts to only $1.5 trillion in expected losses over the next 50 years. Compare this with the losses from the 2008 financial crisis (roughly a 1-in-50-years event), which are estimated to be about $13 trillion for Americans alone.
Though serious, the risks from geomagnetic storms appear to be small enough to be disanalogous to the future invention of AGI.
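As a rough check on the arithmetic above, here is a minimal sketch in Python. It assumes undiscounted losses that simply add up across years, and it treats the $13 trillion financial-crisis estimate as a single loss over the same 50-year window. The function and variable names are ours, chosen only for illustration; the dollar figures are the ones quoted in the text.

```python
# Figures quoted in the text above (US dollars).
LOW_ANNUAL_LOSS = 30e6          # $30 million per year, low estimate
HIGH_ANNUAL_LOSS = 30e9         # $30 billion per year, high estimate
HORIZON_YEARS = 50              # comparison window used in the text
FINANCIAL_CRISIS_LOSS = 13e12   # ~$13 trillion, Americans alone


def cumulative_expected_loss(annual_loss, years):
    """Undiscounted sum of expected annual losses (a simplifying assumption)."""
    return annual_loss * years


low_total = cumulative_expected_loss(LOW_ANNUAL_LOSS, HORIZON_YEARS)
high_total = cumulative_expected_loss(HIGH_ANNUAL_LOSS, HORIZON_YEARS)

# How many times larger the financial-crisis loss is than the
# high-end geomagnetic-storm total over the same window.
ratio = FINANCIAL_CRISIS_LOSS / high_total

print(f"Low estimate over {HORIZON_YEARS} years:  ${low_total:,.0f}")
print(f"High estimate over {HORIZON_YEARS} years: ${high_total:,.0f}")
print(f"Financial crisis / high estimate: {ratio:.1f}x")
```

Under these assumptions, the high-end geomagnetic estimate totals $1.5 trillion over 50 years, so the financial-crisis loss is larger by a factor of roughly 8.7. Discounting future losses would shrink the geomagnetic total further and widen the gap.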
The eradication of smallpox
Smallpox, after killing more than 500 million people over the past several millennia, was eradicated in 1979 after a decades-long global eradication effort. Though a hallmark of successful global coordination, it doesn’t seem especially relevant to whether policy-makers will handle the invention of AGI wisely.
Here’s how the eradication of smallpox does or doesn’t fit our criteria for being analogous to the future invention of AGI:
Smallpox didn’t arrive at an unpredictable time; it arrived millennia before the eradication campaign.
The world didn’t have experience eradicating a disease before smallpox was eradicated, but a number of nations had eliminated smallpox.
Smallpox eradication required solving a difficult global coordination problem, but in a way disanalogous to the invention of AGI safety (see the other points on this list).
Preparing for smallpox eradication required effort in advance in some sense, but the effort had mostly already been exerted before the campaign was announced.
Nations without smallpox had an incentive to eradicate it so that they would no longer have to spend money immunizing citizens to keep the virus from being (re)introduced to their countries. For example, in 1968, the United States spent about $100 million on routine smallpox vaccinations.
Smallpox can be thought of as a global disaster: as of 1966, about 2 million people died of smallpox each year.
Shallow investigations of risks from cyberwarfare, chlorofluorocarbons, and the Cuban missile crisis
Jonah’s shallow investigation of risks from cyberwarfare revealed that experts disagree significantly about the nature and scope of these risks. It’s likely that dozens of hours of research would be required to develop a well-informed model of these risks.
To investigate how policy-makers handled the discovery that chlorofluorocarbons (CFCs) depleted the ozone layer, Jonah summarized the first 100 pages of Ozone Crisis: The 15-Year Evolution of a Sudden Global Emergency (see our full email exchange for the summary). This historical case seems worth investigating further, and may be a case of policy-makers solving a global risk with surprising swiftness, though whether the response was appropriately prompt is debated.
Jonah also did a shallow investigation of the Cuban missile crisis. It’s difficult to assess how likely it was for the crisis to escalate into a global nuclear war, but it appears that policy-makers made many poor decisions leading up to and during the Cuban missile crisis (see our full email exchange for a list). Jonah concludes:
even if the probability of the Cuban missile crisis leading to an all out nuclear war was only 1% or so, the risk was still sufficiently great so that the way in which the actors handled the situation is evidence against elites handling the creation of AI well. (This contrasts with the situation with climate change, in that elites had strong personal incentives to avert an all-out nuclear war.)
However, this is only a guess based on a shallow investigation, and should not be taken too seriously before a more thorough investigation of the historical facts can be made.
Additional historical cases that could be investigated
We also identified additional historical cases that could be investigated for potentially informative analogies to the future invention of AGI:
The 2003 Iraq War
The frequency with which dictators are deposed or assassinated due to “unforced errors” they made
One way that the banking crisis is similar to AGI, and not in a way that cheers me up, is that people were making money in the lead-up—they didn’t want it to be over because they were riding the boom. Coming up with near-AGI—self-improving programs which aren’t very generalized—is going to be very advantageous.
Also the ways they were making money were very technical, so people with technical skillsets that might be useful in mitigating risk were drawn in to making money rather than risk mitigation.
The invention of nuclear weapons seems like the overwhelmingly best case study.
New threat/power comes from fundamental new scientific insight.
Existential risks (nuclear winter, run-away nitrogen fusion in atmosphere).
Massive potential effects, both positive and negative (nuclear power for everything, medical treatments, dam building and other manipulation of Earth’s crust, space exploration, elimination of war, nuclear war, increased asymmetric warfare, reactor meltdowns, increased stability of dictatorships). Some were realized.
Very large first-mover advantage with time scales of less than a year.
Feasible development in secret.
Nuclear weapons differed in that the world was already at war when they were developed, so policy makers would be in a different mindset and have different incentives. But otherwise, I think the parallels are as good as you could possibly hope for. The only other competitor is the (overly broad) case of molecular nano-tech, but this hasn’t actually happened yet so you don’t have much to go on. In contrast, the Manhattan Project is extensively documented.
Is there a reason why the World Wars were not examined? The second one was even predicted by establishment figures (such as John Maynard Keynes) decades in advance yet still wasn’t averted.
With regard to the financial crisis, I recommend Why did so many people make ex post bad decisions, which summarises various pieces of research to argue that the basic problem was over-optimism with regard to housing prices, whereas asymmetric information was not a big issue.
If FOOMing doesn’t move us past the near/barely trans-human level too quickly, another policy area to consider could be immigration. Humans have a bad history of responding to outgroups and the patterns of those responses seem very similar across political and social conditions. Obviously just a piece of the puzzle, but might be worth tossing into the mix.