“If you’re building an AGI, it’s like building a Saturn V rocket [but with every human on it]. It’s a complex, difficult engineering task, and you’re going to try and make it aligned, which means it’s going to deliver people to the moon and home again.
People ask “why assume they won’t just land on the Moon and return home safely?”
And I’m like, because you don’t know what you’re doing!
If you try to send people to the moon and you don’t know what you’re doing, your astronauts will die.
[Unlike the telephone, or electricity, where you can assume it’s probably going to work out okay] I contend that ASI is more like the moon rocket.
“The moon is small compared with the rest of the sky, so you don’t get to the moon by default – you hit some part of the sky that isn’t the moon. So, show me the plan by which you predict to specifically hit the moon.”
While I don’t want to defend the public’s reasoning on AI alignment, since people often have very confused beliefs about what AI alignment even is, I do think that something like this claim is very likely false, and that the analogy is actually a very poor one.
The basic reason for this is that AI values, and AIs themselves, are far more robust to things going wrong than the rocket analogy gives them credit for:
https://www.lesswrong.com/posts/wAczufCpMdaamF9fy/my-objections-to-we-re-all-gonna-die-with-eliezer-yudkowsky#Yudkowsky_mentions_the_security_mindset__
https://www.lesswrong.com/posts/JcLhYQQADzTsAEaXd/?commentId=7iBb7aF4ctfjLH6AC
At a fundamental level, I think this is one of my biggest divides with people like @Eliezer Yudkowsky and @Rob Bensinger et al.: I think alignment is an easier target to hit than making a working rocket that goes to the moon.