I think most people would agree that if a scientist created a synthetic, airborne virus capable of killing hundreds of millions of people if released into the wild, we wouldn’t want the instructions for creating that virus published in the open for terrorist groups or hawkish governments to use. For the same reasons, we wouldn’t want a Friendly AI textbook to explain how to build highly dangerous AI systems. That exception aside, I would love to see a rigorously technical textbook on friendliness theory, and I agree that friendliness research will need to increase for us to see that textbook written in 15 years.
Why do you think that a rigorous description of friendliness would also shed light on how to build AGI?
Friendly AI theory isn’t just about the problem of friendliness content, but also about the kind of AI architecture that is capable of using friendliness content. But many kinds of progress on that kind of AI architecture will be progress toward AGI that can take arbitrary goals, almost all of which would be bad for humanity.