(shrug) Sure, if you’re right that the “most urgent and agreeable stuff” doesn’t happen to press a significant number of people’s emotional buttons, then it follows that not many people’s emotional buttons will be pressed.
But there’s a big difference between assuming that this will be the case, and considering it a bug if it isn’t.
Either I trust the process we build more than I trust my personal judgments, or I don’t.
If I don’t, then why go through this whole rigmarole in the first place? I should prefer to implement my personal judgments. (Of course, I may not have the power to do so, and may prefer to join more powerful coalitions whose judgments are close enough to mine. But in that case CEV becomes a mere political compromise among the powerful.)
If I do, then it’s not clear to me that “fixing the bug” is a good idea.
That is, OK, suppose we write a seed AI intended to work out humanity’s collective CEV, work out some next-step goals based on that CEV and an understanding of likely consequences, construct a program P to implement those goals, run P, and quit.
Suppose that I am personally horrified by the results of running P. Ought I choose to abort P? Or ought I say to myself “Oh, how interesting: my near-mode emotional reactions to the implications of what humanity really wants are extremely negative. Still, most everybody else seems OK with it. OK, fine: this is not going to be a pleasant transition period for me, but my best guess is still that it will ultimately be for the best.”
Is there some number of people such that if more than that many people are horrified by the results, we ought to choose to abort P?
Does the question even matter? The process as you’ve described it doesn’t include an abort mechanism; whichever choice we make P is executed.
Ought we include such an abort mechanism? It’s not at all clear to me that we should. I can get on a roller coaster or choose not to get on it, but giving me a brake pedal on a roller coaster is kind of ridiculous.