AI Governance across Slow/​Fast Takeoff and Easy/​Hard Alignment spectra

It has been suggested that in a rapid enough takeoff scenario, governance would not be useful, because the transition to superintelligence would be too fast for human actors (whether governments, corporations, or individuals) to respond to. This seems to imply that takeoff speed is the only thing that matters. And if it is the only relevant factor, the case for governance applies only if you believe slow takeoff is likely. Of course, it also matters how long we have until takeoff. But even so, I think this view leaves a fair amount on the table in terms of what governance could do, and I want to make the case that even in a fast-takeoff world, governance (defined broadly; see footnote 1) is important, though in different ways.

The Easy/​Hard Spectrum

To make the argument, I will lay out three possibilities about AI alignment which are orthogonal to takeoff speed and timing: alignment-by-default, prosaic alignment, and provable alignment. These are really three points spaced along a spectrum from easy to hard alignment. For each possibility, governance needs to accomplish very different things in order to be successful, according to the broad definition of governance above, and the relationship with takeoff speed seems important, but not fully determinative.

The first possibility, alignment-by-default, is that if we train systems via reinforcement learning or similar, then even without particular effort to solve alignment, all systems which are successful end up learning policies and goals close enough to human values that they are beneficial and influenceable. In the slower takeoff case, governance initially looks a lot like human governance: making sure that actors, both human and AI, can cooperate and follow mutually understood and agreed-upon rules. Later, and in the faster takeoff case, our efforts towards governance become irrelevant as the AI systems replace human structures, or improve them.

The second possibility, prosaic alignment, is that alignment of artificial intelligence systems is somewhat difficult, but achievable via approaches which can be developed in time. So some systems will be aligned, but without oversight, unaligned systems are possible or likely. In this case, the key task of governance is to ensure that all early HLMI/PASTA/AGI systems undergo robust alignment procedures. Prior to the emergence of such systems, many tasks will be useful for ensuring this outcome, including monitoring progress, developing standards, and building norms about safety. But as above, later and/or in the faster takeoff cases, governance becomes less relevant. Note, however, that this does not eliminate the need for governance. Instead, it means that more emphasis is needed on pre-emergence and early-stage efforts.

The final possibility is that the only way alignment can occur is via currently-impossible provable alignment. In this case, it may be that there are few potential ways to train safe AGI, and almost all earlier attempts are dangerous. Somewhat similar to the previous case, the key task is to prevent misaligned systems. In a fast takeoff case, the entirety of the usefulness of governance is prior to emergence, perhaps via intensive monitoring or limits on compute, while in a slow takeoff case, there is some chance that governance can prevent disaster while allowing work in AI, perhaps via some sort of policing, à la lsusr’s Bayeswatch.

Along the different spectra

There are now three different dimensions being discussed. The first is how long we have until takeoff begins, which determines how much time we have to solve the various problems. The second is the difficulty of alignment, which I argued above determines the key task of governance: whether it is to ensure that systems are aligned, or to prevent unaligned systems from being built. And lastly, there is the speed of takeoff, which determines how much time governance has to act once takeoff begins.

In this model, along the latter two dimensions, as either speed or difficulty increases, the relative emphasis on pre-AGI governance increases, and the usefulness of governance during the transition decreases. This leaves us with effectively a single dimension, albeit still one that is orthogonal to when takeoff occurs. And while there is certainly a class of interventions which are helpful towards one end of the spectrum but harmful towards the other (see footnote 2), there is also the real possibility that we can find approaches which are beneficial in both cases.

As a few small examples of what these might look like, regardless of where on the spectrum we are, governance can reduce risks by 1) monitoring compute usage and capabilities to enable response, 2) vastly improving computer security for AI labs, which could prevent or slow at least some forms of takeoff, and 3) building norms around the care taken in development, testing, and deployment of proto-AGI systems.


1) Allan Dafoe has suggested that “AI governance concerns how humanity can best navigate the transition to a world with advanced AI systems.” This seems broadly correct, and to add to it, he has suggested it concerns “norms and institutions shaping how AI is built and deployed, as well as the policy and research efforts to make it go well.”

2) This analysis implies that the vast majority of governance efforts matter in slow takeoff /​ relatively easy alignment worlds, but are irrelevant or in some cases even harmful in faster takeoff /​ harder alignment worlds. This is an issue, but the existence of such tradeoffs alone does not imply that these approaches should not be seriously considered or pursued.

Thanks to Allan Dafoe for very helpful feedback on an earlier version of this.