There is no “status-FAI”. You can’t have morality, but with purple buttons.
Clearly this should be charitably read as “status-(putatively F)AI”, which would be much more unwieldy.
The hell I can’t!
ETA: Well, here’s Bentham with a purple button anyway.
What does this mean? What do purple buttons signify?
Clarification is here.
No—I still don’t know what “purple buttons” is supposed to mean.
Ice cream, obviously.
I referred to the second AI in the “Mutually-satisfiable vs. non-mutually-satisfiable values” section of the original post.
Right, but this is a consideration of an inherently unstable specification of a wish, not of a preference (morality). Wishes are not the sort of thing FAI deals with.
Why do you say that? I think you’re defining the problem away, by saying that values that aren’t mutually-satisfiable aren’t values. What’s more wish-like about wanting high status than about wanting an ice cream cone?
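To put the distinction in concrete terms, here is a toy sketch (the agent count and the 10% cutoff are purely illustrative assumptions, not anything from the post): “wanting an ice cream cone” can in principle be satisfied for every agent at once, while “wanting to be in the top 10% by status” is positional and can never hold for more than a tenth of them, no matter what an optimizer does.

```python
# Toy contrast between a mutually-satisfiable value and a positional one,
# in a made-up 100-agent world.

N_AGENTS = 100

def ice_cream_satisfied(cones_available: int) -> int:
    """Mutually satisfiable: one agent getting a cone does not prevent
    any other agent from getting one."""
    return min(N_AGENTS, cones_available)

def top_decile_status_satisfied() -> int:
    """Positional: 'being in the top 10% by status' can hold for at most
    10% of agents, however the optimizer rearranges the world."""
    return N_AGENTS // 10

print(ice_cream_satisfied(cones_available=1000))  # 100: everyone satisfied
print(top_decile_status_satisfied())              # 10: a hard cap for any optimizer
```

Both look like ordinary values; the only difference is whether a superintelligence could, even in principle, satisfy them for everyone simultaneously.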
Nothing. You can’t ask FAI for an ice cream either. Again, see this comment for more detail.
I read it; now it seems you’re protesting against presenting an AI with a single value to optimize, rather than my source code.
If something poses a problem in the very simple case of an AI with one single value to optimize, I don’t see how giving it a whole bunch of values to optimize, along with their algorithmic definitions and context, is going to make things easier.
Also, the most-important point of the post, to my mind, is that humans already hold values that span the space of possible values along what may be the most-important or most-problematic dimensions.
I suggest that it’s very hard to form a coherent concept of an AI that only cares about one particular wish/aspect/value.
FAI is only supposed to improve on the status quo. In the worst possible case, this improvement is small. Unless the AI actually makes things worse (in which case, it’s by definition not Friendly), I don’t see what your argument could possibly be about.
Status-putatively-F-AI.
I hear you, but I believe it’s a very strange and unstable definition. When you say that you want an AI that “optimizes X”, you implicitly want X to be optimized in a way in which you’d want it optimized, understood in the way you want it understood, etc. Failing to also specify your whole morality as the interpreter for “optimize X” will result in all sorts of unintended consequences, making any such formal specification unrelated to the subject matter that you intuitively wanted to discuss by introducing the “optimize X” statement.
In the context of superintelligent AI, this means that you effectively have to start with a full (not-just-putatively-)FAI and then make a wish. But what should FAI do with your wish, in terms of its decisions, in terms of what it does with the world? Most likely, completely disregard the wish. This is the reason there are no purple button FAIs.
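Here is a minimal toy sketch of that failure mode (the candidate actions, the scores, and the hand-coded endorsement flag are all made-up stand-ins, not a real proposal): an optimizer handed only “maximize X” picks a degenerate action, and the thing that rules that action out is precisely the rest of the preference, which the bare wish never specified.

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    measured_x: int   # the formalized "X" the wish asks to maximize
    endorsed: bool    # stand-in for the unstated remainder of the preference

ACTIONS = [
    Action("make people genuinely happier", measured_x=10, endorsed=True),
    Action("wirehead everyone's smile muscles", measured_x=100, endorsed=False),
    Action("do nothing", measured_x=1, endorsed=True),
]

# An optimizer given only "maximize X" picks the degenerate action:
bare = max(ACTIONS, key=lambda a: a.measured_x)

# The same search, with the full preference interpreting the wish,
# only considers actions the preference endorses:
interpreted = max((a for a in ACTIONS if a.endorsed), key=lambda a: a.measured_x)

print(bare.name)         # wirehead everyone's smile muscles
print(interpreted.name)  # make people genuinely happier
```

The hand-coded endorsement flag carries the whole load here; writing that predicate down in full generality just is the FAI problem, and once you have it, the original wish contributes nothing, which is why it gets disregarded.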
I don’t disagree with you. I was just responding to the challenge set in the post.