One thing to consider: if we successfully dissuade DeepMind researchers, who do take alignment issues at least somewhat seriously, from working on AGI, does it instead get developed by Meta researchers who (for the sake of argument) don’t care?
More generally, you’re not going to successfully remove everyone from the field. So the people you remove will be those most concerned about alignment, leaving those least concerned to discover AGI.
It’s certainly a consideration, and I don’t want us to convince anybody who is genuinely working on alignment at DeepMind to leave. On the other hand, I don’t think their positive attitude toward AGI alignment matters much if what they’re actually doing accelerates its development. This seems a little like saying “you shouldn’t get those Iranian nuclear researchers to quit, because you’d be removing the ‘checks’ on the other, more radical researchers.” It’s probably a little naive to assume they’re “checking” the other members of DeepMind when their names are on this blog post: https://www.deepmind.com/blog/generally-capable-agents-emerge-from-open-ended-play.
This also ignores a more plausible intermediate success of outreach: getting someone who doesn’t already have concerns to develop them, even if they then keep their job out of inertia. There is still a lot of work to be done getting those who won’t quit to push organizations like OpenAI and DeepMind toward actual alignment research, and to create institutional failsafes.
I think a case could be made that if you don’t want Eleuther making X public, then it is also bad for DeepMind or OpenAI to make X.
I would make that case. In this circumstance it’s also definitely the far lesser evil.
Under the pessimistic hypothesis, it doesn’t matter who develops AGI: either way, our death is almost certain.
The goal here should be social consensus about individual disincentives to misaligned deployment—stopping individual research labs has a pretty modest ROI. (If they pivot their attention given social consensus, that’s up to them.)