To speak of building an AGI which shares “our values” is likely to provoke negative reactions from any
AGI researcher whose current values include terms for respecting the desires of future sentient beings and
allowing them to self-actualize their own potential without undue constraint. This itself, of course, is a
component of the AGI researcher’s preferences which would not necessarily be shared by all powerful
optimization processes, just as natural selection doesn’t care about old elephants starving to death or
gazelles dying in pointless agony. Building an AGI which shares, quote, “our values”, unquote, sounds
decidedly non-cosmopolitan, something like trying to rule that future intergalactic civilizations must be
composed of squishy meat creatures with ten fingers or they couldn’t possibly be worth anything—and
hence, of course, contrary to our own cosmopolitan values, i.e., cosmopolitan preferences. The
counterintuitive idea is that even from a cosmopolitan perspective, you cannot take a hands-off approach
to the value systems of AGIs; most random utility functions result in sterile, boring futures because the
resulting agent does not share our own intuitions about the importance of things like novelty and
diversity, but simply goes off and, e.g., tiles its future lightcone with paperclips, or other configurations of
matter which seem to us merely “pointless”.