I’m a bit skeptical about calling this an “AI governance” problem. This sounds more like “governance” or maybe “existential risk governance”—if future technologies make irreversible destruction increasingly easy, how can we govern the world to avoid certain eventual doom?
Handling that involves political challenges, fundamental tradeoffs, institutional design problems, etc. But I don’t think it’s distinctive to risks posed by AI, I don’t think a solution necessarily involves AI, I don’t think it’s right to view “access to TAI” as the only or primary lever of political power for preventing destructive acts, and I’m not convinced that this problem should be addressed by a community focused on AI in particular.
It seems good for people to think about the general long-term challenge as well as the concrete possible destructive technologies on the horizon, in case there is narrower work that can help mitigate the risks they pose and thereby delay the need to implement a general solution. But in some sense this is just “delaying the inevitable.”

I wrote some of my thoughts on this relationship in Handling destructive technology.
One potential difference is that I don’t see TAI as automatically posing a catastrophic risk. Alignment itself could pose a catastrophic risk, but if we resolve that, then I think we get some (unknown) amount of subjective time until the next thing goes wrong, which might be AI enabling access to destructive physical technology or might be something more conceptually gnarly. The further off that next risk is, the more political change is likely to happen in the interim.
This is an interesting point. But I’m not immediately convinced that this isn’t largely a matter of AI governance.
There is a long list of governance strategies that aren’t specific to AI but can help us handle perpetual risk. But there is also a long list of strategies that are specific to AI. I think that all of the things I mentioned under strategy 2 have AI-specific examples:
- establishing regulatory agencies
- auditing companies
- auditing models
- creating painful bureaucracy around building risky AI systems
- influencing hardware supply chains to slow things down
- avoiding arms races
And I think that some of the things I mentioned for strategy 3 do too:
- giving governments powers to rapidly detect and respond to firms doing risky things with TAI
- hitting killswitches involving global finance or the internet
- cybersecurity
- generally being more resilient to catastrophes as a global community
So ultimately, I won’t make claims about whether avoiding perpetual risk is mostly an AI governance problem or mostly a more general governance problem, but certainly there are a bunch of AI-specific things in this domain. I also think they might be a bit neglected relative to some of the strategy 1 stuff.