What would it mean for an outcome to be objectively best for all of reality?
It might be your subjective opinion that a reality which maximized conscious happiness would be the objectively best one. Another human’s subjective opinion might be that the objectively best reality would be one that maximized the fulfillment of fundamentalist Christian values. A third human might hold that there’s no such thing as “objectively best,” and that all we have are subjective opinions.
Given that different people disagree, one could argue that we shouldn’t privilege any single person’s opinion, but try to take everyone’s opinions into account—that is, build an AI that cared about the fulfillment of something like “human values”.
Of course, that preference would itself be just another subjective opinion. But it’s the kind of subjective opinion that the people involved in AI alignment discussions tend to have.
Suppose everyone agreed that the proposed outcome was what we wanted. Would this scenario then be difficult to achieve?