Eliezer Yudkowsky, along with (I’m guessing) a lot of people in the rationalist community, seems to be essentially a Humean about morality (more specifically, a Humean consequentialist). Now, Humean views of morality are highly compatible with a very broad statement of the Orthogonality Thesis, applied to all rational entities.
Humean views about morality, though, are somewhat controversial. Plenty of people think we can rationally derive moral laws (Kantians); plenty of people think there are certain objective ends to human life (virtue ethicists).
My question here isn’t about the truth of these moral theories per se, but rather: does accepting one of these alternative meta-ethical theories cast any doubt on the Orthogonality Thesis as applied to AGI?
I don’t think so. If we could define morality in a relatively simple way that allowed the AGI to derive the rest logically, that would probably make AI alignment easier. But it still wouldn’t counter the Orthogonality Thesis, because an AGI doesn’t necessarily have to have morality at all. We would still have to program it in; it would just (probably) be easier to do that than to find a robust definition of human values.