Thanks for this post! I’ve been thinking a lot about AI governance strategies and their robustness/tractability lately, much of which feels like a close match to what you’ve written here.
For many AI governance strategies, I think we are more clueless than many seem to assume about whether a strategy ends up positively shaping the development of AI or backfiring in some foreseen or unforeseen way. There are many crucial considerations for AI governance strategies; miss or get one wrong and the whole strategy can fall apart, or become actively counterproductive. What I’ve been trying to do is:
Draft a list of trajectories for how the development and governance of AI could unfold up until we get to TAI, estimating the likelihood and associated xrisk from AI for each trajectory.
e.g. “There ends up being no meaningful international agreements or harsh regulation and labs race each other until TAI. Probability of trajectory: 10%, Xrisk from AI for scenario: 20%.”
Draft a list of AI governance strategies that can be pursued.
e.g. “push for slowing down frontier AI development by licensing the development of large models above a compute threshold and putting significant regulatory burden on them”.
For each combination of trajectory and strategy, assess whether we are clueless about the sign of the strategy’s impact, or whether the strategy would be robustly good (~predictably lowers xrisk from AI in expectation), at least for this trajectory. A third option would of course be robustly bad.
e.g. “Clueless: it’s not clear which consideration should have more weight, and backfiring could be as bad as success is good.
+ This strategy would make this trajectory less likely and possibly shift it to a trajectory with lower xrisk from AI.
- Getting proper international agreement seems unlikely in this pessimistic trajectory. Partial regulation could disproportionately slow down good actors, or lead to open-source proliferation and increased misuse risk.”
Try to identify strategies that are robust across a wide array of trajectories. (A rough sketch of how this bookkeeping could be set up is below.)
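To make the procedure above concrete, here is a minimal sketch of how I imagine the trajectory × strategy bookkeeping could be organized. All trajectories, strategy names, probabilities, and xrisk numbers are placeholders rather than real estimates, and the trajectory list is deliberately partial (so the probabilities don’t sum to 1).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trajectory:
    name: str
    probability: float  # P(trajectory occurs); placeholder, not a real estimate
    xrisk: float        # P(existential catastrophe from AI | trajectory); placeholder


# Placeholder trajectories, loosely following the example above.
trajectories = [
    Trajectory("No international agreements; labs race to TAI", 0.10, 0.20),
    Trajectory("Strong international compute governance", 0.05, 0.05),
]

# Baseline expected xrisk across the (partial) list of trajectories.
baseline = sum(t.probability * t.xrisk for t in trajectories)

# Sign judgments for each (strategy, trajectory) pair:
# "+" robustly good, "-" robustly bad, "?" clueless.
assessment = {
    ("licensing above a compute threshold", trajectories[0].name): "?",
    ("licensing above a compute threshold", trajectories[1].name): "+",
}


def robust_strategies(assessment):
    """Return strategies judged '+' for every trajectory they were assessed on."""
    by_strategy = defaultdict(list)
    for (strategy, _), sign in assessment.items():
        by_strategy[strategy].append(sign)
    return [s for s, signs in by_strategy.items() if all(s == "+" for s in signs)]


print(f"Baseline expected xrisk: {baseline:.3f}")
print("Robust strategies:", robust_strategies(assessment))
```

The code is obviously just scaffolding; the value of the exercise is in the judgment calls that fill in the numbers and signs, and in noticing which cells you are genuinely clueless about.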
I’m just winging it without much background in how such foresight-related work is normally done, so any thoughts or feedback on how to approach this kind of investigation, or what existing foresight frameworks you think would be particularly helpful here are very much appreciated!
As I mentioned in the post, I think the Canadian and Singaporean governments are the best in this space, to my knowledge.
Fortunately, some organizations have created rigorous foresight methods. The top contenders I came across were Policy Horizons Canada within the Canadian Federal Government and the Centre for Strategic Futures within the Singaporean Government.
As part of this kind of work, you want to be doing scenario planning multiple levels down. How does AI interact with VR? Once you have that, how does it interact with security and defence? How does this impact offensive work? What geopolitical factors work their way in? Does public sentiment around job loss shape the development of these technologies in specific ways? For example, you might see more powerful pushback from more established, heavily regulated industries with strong union support.
Aside from that, you might want to reach out to the Foresight Institute, though I’m a bit more skeptical that their methodology will help here (I’m less familiar with it, and I like the organizers overall).
I also think that looking at the Malicious Use of AI report from a few years ago would be helpful for inspiration, particularly because they held a workshop with people from different backgrounds. There might be better, more recent work I’m unaware of.
Additionally, I’d like to believe that this post was a precursor to Vitalik’s post on d/acc (defensive accelerationism), so I’d encourage you to look at that.
Another thing to look into is companies in the cybersecurity space; I think we’ll be getting more AI-safety-pilled orgs in this area soon. Lekara is an example of this: I met two employees, and they essentially told me that the vision is to embed themselves into companies and then, from that position, keep figuring out how to make AI safer and the world more robust.
There are also more organizations popping up, like the Center for AI Policy, and my understanding is that Cate Hall is starting an org that focuses on sensemaking (and grantmaking) for AI Safety.
If you or anyone is interested in continuing this kind of work, send me a DM. I’d be happy to help provide guidance in the best way I can.
Lastly, I will note that I think people have generally avoided this kind of work because “if you have a misaligned AGI, well, you are dead no matter how robust you make the world or whatever you plan around it.” I think this view is misguided: you can potentially make our situation a lot better by doing this kind of work. I think recent discussions on AI Control (rather than Alignment) are useful in questioning previous assumptions.