How best to correct this communications gap (and prevent similar gaps in the future) between the two groups of people working on AI risk?
Convince the researchers at OpenAI, FHI and Open Phil, and maybe DeepMind and CHAI, that it’s not possible to get safe, competitive AI; then ask them to pass it on to governance researchers.
I have a feeling it’s not that simple. See the last part of “‘Generate evidence of difficulty’ as a research purpose”, on biases. For example, I know at least one person who quit an AI safety org (in part) because they became convinced that it’s too difficult to achieve safe, competitive AI (or at least that the approach pursued by the org wasn’t going to work). Another person privately told me they have little idea how their research will eventually contribute to safe, competitive AI, but they haven’t written anything like that publicly, AFAIK. (And note that I don’t actually have that many opportunities to speak privately with other AI safety researchers.) Another issue is that most AI safety researchers probably don’t think it’s part of their job to “generate evidence of difficulty”, so I would have to convince them of that first.
Unless these problems are solved, I might be able to convince a few safety researchers to go to governance researchers and say they think it’s not possible to get safe, competitive AI, but their concerns would probably just be dismissed as outliers. I think a better step forward would be to build a private forum where these kinds of concerns can be discussed more frankly, along with a culture where doing so is the norm. This would address some of the possible biases; I’m still not sure what to do about the others.
This is pretty strongly different from my impressions, but I don’t think we could resolve the disagreement without talking about specific examples of people, so I’m inclined to set this aside.
I would guess three main disagreements are:
i) Are the kinds of transformative AI that we’re reasonably likely to get in the next 25 years unalignable?
ii) How plausible are the extreme levels of cooperation Wei Dai wants?
iii) How important is career capital/credibility?
I’m perhaps midway between Wei Dai’s view and the median governance view, so I may be an interesting example. I think we’re ~10% likely to get transformative general AI in the next 20 years, ~6% likely to get an incorrigible one, and ~5.4% likely to get incorrigible general AI that’s insufficiently philosophically competent. Extreme cooperation seems ~5% likely, and is correlated with having general AI. It would be nice if more people worked on that, or on whatever more realistic solutions would work for the transformative unsafe AGI scenario, but I’m happy for some double-digit percentage of governance researchers to keep working on less extreme (and more likely) solutions, to build credibility.
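As a sanity check on how these numbers hang together (the conditional decomposition here is my inference, not something stated explicitly above), they are consistent with a simple chain of conditional probabilities:

$$P(\text{transformative general AI within 20y}) \approx 0.10$$
$$P(\text{incorrigible}) = P(\text{TGAI}) \cdot P(\text{incorrigible} \mid \text{TGAI}) \approx 0.10 \times 0.60 = 0.06$$
$$P(\text{incorrigible and phil. incompetent}) \approx 0.06 \times 0.90 = 0.054$$

That is, roughly a 60% chance of incorrigibility conditional on transformative general AI, and a 90% chance of insufficient philosophical competence conditional on incorrigibility.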
Seems right to me, yes.