I think there is relevant discussion further on in the book (Chapter 13) regarding Coherent Extrapolated Volition. It’s kind of an attempt to specify human values to the AI so it can figure out what those values are in a way that takes everyone into account and avoids the problem of one individual’s current values dominating the system (with a lot more nuance to it). If executed correctly, it ought to work even if the creators are mistaken about human values in some way.