Agreed that on EY’s view (and my own), human “fundamental values” (1) have not yet been fully articulated/extrapolated, and that we can’t say with confidence whether X is in that set.
But AFAICT, EY rejects the idea (which you seem here to claim he endorses?) that an AI with a simple utility function can be sure that maximizing that function is the right thing to do. It might believe that maximizing that function is the right thing to do, but it would be wrong. (2)
AFAICT this is precisely what RichardChappell considers implausible: the idea that unlike the AI, humans can correctly believe that maximizing their utility function is the right thing to do.
==
(1) Supposing there exist any such things, of which I am not convinced.
(2) Necessarily wrong, in fact, since on EY’s view as I understand it there’s one and only one right set of values, humans currently implement it, and the set of values humans implement is irreducibly complex and therefore cannot be captured by a simple utility function. Therefore, an AI maximizing a simple utility function is necessarily not doing the right thing on EY’s view.
Sorry, I meant to use the two-place version: it wouldn’t be what’s right; what I meant is that the completely analogous concept of “that-AI-right” would consist simply of that utility function.
To the extent that you are still talking about EY’s views, I still don’t think that’s correct… I think he would reject the idea that “that-AI-right” is analogous to right, or that “right” is a two-place predicate.
That said, given that this question has come up elsethread and I’m apparently in the minority, and given that I don’t understand what all this talk of right adds to the discussion in the first place, it becomes increasingly likely that I’ve just misunderstood something.
In any case, I suspect we all agree that the AI’s decisions are motivated by its simple utility function in a manner analogous to how human decisions are motivated by our (far more complex) utility function. What disagreement exists, if any, involves the talk of “right” that I’m happy to discard altogether.