I’m not sure that’s true. It’s true if you adopt the dominant local perspective that “alignment is very hard and we need more time to do it”. But there are other perspectives: see “AI is easy to control” by Pope & Belrose, which argues that the success of RLHF means there’s a less than 1% risk of extinction from AI. I think this perspective is both subtly wrong and deeply confused in mistaking alignment for total x-risk, but the core argument isn’t obviously wrong. So reasonable people can and do argue for full speed ahead on AGI.
I agree with pretty much all of the counterarguments Steve Byrnes makes in his “Thoughts on ‘AI is easy to control’ by Pope & Belrose”. But not all reasonable people will. And those who are also non-utilitarians (most of humanity) will pursue AGI ASAP for rational (if ultimately subtly wrong) reasons.
I think we need to understand and take this position seriously to do a good job of avoiding extinction as best we can.
Basically, I think whether one believes alignment is hard is much more of a crux than whether one is a utilitarian.
Personally, I don’t find Pope & Belrose very convincing, although I do commend them for making a reasonable effort. But if I did believe that AI is likely to go well, I’d probably also be all for it. I just don’t see how this is related to utilitarianism (except maybe for a very small subset of people in EA).