I’m assuming the relevant values are the ones people actually optimize for, not what they say. I discussed social institutions, including those encouraging people to endorse and optimize for common values, in the section on subversion.
Alignment with a human other than yourself could be a problem because people are to some degree selfish and, to a smaller degree, have different general principles/aesthetics about how things should be. So some sort of incentive optimization / social choice theory / etc. might help. But at least there’s significant overlap between different humans’ values. Though there’s already a pretty big existing problem of people dying; the default was already that current people would be replaced by other people.