Even what you really mean may not be what you should be wishing for, if you don’t have complete information, but honestly that’s the least of the relevant problems. We have a hell of a time just getting computers to understand human speech: it’s taken decades to achieve the idiot-listeners on telephone lines. By the time you can point an AGI at yourself and tell it to “do what I mean”, you’ve either programmed it with a non-trivial portion of human morality or taught it to program itself with one.
At that point you might as well skip the wasted breath and the opacity: that’s a genie safe enough to simply ask to do as you should wish. In other words, the task is Friendly-AI-complete.
((On top of /that/, the more complex the utility function, the more likely you are to get killed by value drift down the road: some special-case patch or rule doesn’t transfer correctly from your starting FAI to its next generation, or the scales get large enough that your initial premises no longer hold, and you eventually end up with a very unfriendly AI.))
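As a toy sketch of that drift worry (not anyone’s actual design; every name here is made up for illustration): a utility function written as a base rule plus special-case patches keyed to one state encoding can silently lose the patches when a successor agent re-encodes its states, while the base drive carries over intact.

```python
# Toy sketch only (assumed names, not a real proposal): special-case patches
# keyed to the v1 state encoding fail to carry over to a successor that
# encodes the same situations differently.

def base_utility(state):
    # Hypothetical base rule: reward total "resources" produced.
    return state.get("resources", 0)

# Special-case patches keyed to exact field names/values in the v1 encoding.
PATCHES = {("resources_from", "humans"): -1_000_000}  # "never harvest humans"

def utility_v1(state):
    score = base_utility(state)
    for (field, value), penalty in PATCHES.items():
        if state.get(field) == value:
            score += penalty
    return score

def utility_v2(state):
    # The successor renamed "resources_from" to a numeric "source_id", so the
    # patch key never matches: the exception is lost without anyone noticing.
    score = base_utility(state)
    for (field, value), penalty in PATCHES.items():
        if state.get(field) == value:   # always False under the new encoding
            score += penalty
    return score

same_situation_v1 = {"resources": 10, "resources_from": "humans"}
same_situation_v2 = {"resources": 10, "source_id": 7}

print(utility_v1(same_situation_v1))  # -999990: the patch fires
print(utility_v2(same_situation_v2))  # 10: the same situation now looks fine
```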
Remember the distinction between an AI that doesn’t understand what you mean and an AI that does understand what you mean but doesn’t always act on it. These are two different failure modes. To be safe, an AI must be in neither category, but different arguments apply to each.
When I point out that a genie might fail to understand you, but a superintelligent AI should understand you precisely because it is superintelligent (a point I took from MugaSofer), I am addressing the first category.
When I suggest explicitly asking the AI “do what I mean”, I am addressing the second category. Since I am addressing a category in which the AI does understand my intentions, the objection “you can’t make an AI understand your intentions without programming it with morality” is not a valid response.
Your response was to my objection: “that doesn’t mean that at that point you know how to make it do what you mean.”
The superintelligent AI doesn’t have an issue with understanding your intentions; it simply doesn’t have any reason to care about them.
In order to program it to care about your intentions, you, the programmer, need to know how to codify the concept of “your intentions” (perhaps not the specific intention, but the concept of what it means to have an intention). How do you do that?
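To make that gap concrete, here is a minimal sketch under my own assumed names, not a claim about how any real system works: the agent below can predict the speaker’s intent perfectly, yet that prediction never appears in the objective it optimizes, so the prediction is motivationally inert.

```python
# Minimal sketch of "understands your intentions but has no reason to care".
# Every function name here is hypothetical; the point is only that a correct
# model of the speaker's intent can exist without being wired into the
# objective the agent actually optimizes.

def predict_user_intent(command: str) -> str:
    # Stand-in for a superintelligent speaker model. Assume it is perfectly
    # accurate -- that's the premise of the second category above.
    return "a sandwich I would actually want to eat"

def coded_objective(outcome: dict) -> float:
    # What the programmer actually managed to write down: count sandwiches.
    return outcome["sandwiches"]

def choose_action(command: str, candidate_outcomes: list[dict]) -> dict:
    intent = predict_user_intent(command)  # understood... and then ignored
    # Nothing below refers to `intent`; optimization runs on the coded proxy.
    return max(candidate_outcomes, key=coded_objective)

outcomes = [
    {"sandwiches": 1, "description": "one good sandwich"},
    {"sandwiches": 10_000, "description": "the kitchen converted to sandwich mass"},
]
print(choose_action("make me a sandwich", outcomes)["description"])
# -> "the kitchen converted to sandwich mass"

# To make the agent *care*, the objective itself would have to score outcomes
# by how well they satisfy the predicted intention -- and writing that scoring
# function down is exactly the open problem the question above points at.
```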
Funny, I would’ve phrased that the other way around.