Here are some of my thoughts after reflecting on this post for a day. These ideas are somewhat disconnected from one another but hopefully in aggregate provide some useful commentary on different aspects of the “reframing AI risk” proposal:
The results of the AI Safety Arguments Competition should be released soon (Thomas Woodside told me last week they were wrapping up review of argument submissions). If it went well, then we may see some compelling reframings coming out of that.
I agree that AGI discussion has historically been stigmatized, but it seems to be becoming less so. DeepMind isn't shy about using the term AGI on their website and in their podcast. Same with OpenAI, which mentions "artificial general intelligence" in the headline of their About page, and whose CEO tweeted "AGI is gonna be wild" earlier this year. Elon Musk tweeted a few weeks ago that he thinks we'll see AGI by 2029. Google Trends shows that searches for the terms "AGI" and "artificial intelligence" have been at historically high levels since around 2016.
Projecting forward, I believe that as increasingly impressive large-model and multimodal feats like GPT-N, DALL-E, etc. continue to be announced, it will become more natural for ML experts and other people following these developments to think and talk about AGI. Instead of seeming like a wacky far-off sci-fi idea, AGI starts to look more like something that's coming down the pipe. So while I think reframing AI risk to get around AGI stigma might be useful for getting traction on alignment ideas sooner, I don't think the benefit would be as large as it appears if you assume that AGI skepticism stays constant.
Changing the terminology has costs and risks. You touch on this in your post (the "Robust" point), but I think it's worth emphasizing the care that needs to be taken with a sudden change. I worry that if we start using a term like "advanced software security" instead of "AGI alignment", some ML researchers will be left trying to decode what you're saying. Then, when they probe a bit and realize what you're actually talking about, their reaction will be "Oh look, the paranoid AGI transhumanists are getting sneaky and using euphemisms now". Changing the terminology could also confuse people who are sympathetic, or at least open, to taking the risks seriously.
I thought this post from Matthew Yglesias was interesting; he argues for roughly the opposite approach to this post's. He says that we should embrace analogies to The Terminator movies to make AI risk more concrete and relatable, at least when discussing it with broader audiences.
There is something about the dynamics of the ML/AI field with respect to AGI risk that has been bothering me for a while now, and your post reminded me of it again, so I'm finally going to try to write it down here. The opinions of ML researchers and engineers across the software industry are granted elevated respect on this topic. For example, the influential Grace et al. 2017 survey on AI timelines polled researchers who had published at NIPS/NeurIPS and ICML, which are conferences for people who work in ML broadly, not just AGI researchers. Also, anecdotally, I have several friends who work in ML (but not on AGI research), and when I initially started talking to them about AGI risk they scoffed and looked down their noses (though over time they have become more sympathetic).
My point is that working on narrow ML systems and working on AGI research are very different things. We currently treat anyone in the ML industry as having an expert opinion on AGI, but I don't think we should. Working on a spam classifier or on self-driving cars does not by itself qualify you to opine on AGI. If anything, it biases you toward thinking AGI is more of a pipe dream than you otherwise would, because you're accustomed to dealing with the embarrassing day-to-day failures of present-day ML systems, you assume that's the state of the art, and you don't have time to read the recent research coming out of DeepMind and OpenAI. I would be interested in a reframing that rebalanced the default status and respect granted in AGI risk discussions from "anyone who works in ML" toward "people who work at AGI research labs and people who have been specifically studying AGI".