Did you find any arguments unpersuasive this week?
I keep returning to one gnawing question that haunts the whole idea of Friendly AI: how do you program a machine to “care”? I can understand how a machine can appear to “want” something, favoring a certain outcome over another. But to talk about a machine “caring” ignores a crucial point about life: as clever as intelligence is, it cannot create care. We tend to love our own kid more than someone else’s. So you could program a machine to prefer another machine in which it recognizes a piece of its own code (see the toy sketch after this comment). That may LOOK like care, but it’s really just an outcome. How could you replicate, for example, the love a parent shows for a kid they didn’t produce? What if that kid were humanity? So too with Coherent Extrapolated Volition: you can keep refining the resolution of an outcome, but I don’t see how any mechanism can actually care about anything but an outcome.
While “want” and “prefer” may be useful terms, terms such as “care”, “desire”, and “value” constitute an enormous and dangerous anthropomorphizing. We cannot imagine outside our own frame, and this is one place where that gets us into real trouble. Once someone creates code that can recognize something truly metaphysical, I will be convinced that FAI is possible. Even whole brain emulation assumes both that our thoughts are nothing but code and that a brain with or without a body is the same thing. Am I missing something?
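To make the “prefer another machine in which it recognizes a piece of its own code” idea above concrete, here is a minimal toy sketch in Python. Everything in it (the function names, the code-overlap measure, the example strings) is invented for illustration and stands in for no real proposal; the point is only that such a preference is a computed score over outcomes.

```python
# Toy illustration only: a "preference" for agents that share a fragment of the
# chooser's own code. The kin-favoring behaviour falls out of a computed score,
# which is the sense in which it may LOOK like care while being just an outcome.

def shared_fraction(own_code: str, other_code: str) -> float:
    """Crude stand-in for 'recognizing a piece of its own code': the fraction of
    the chooser's code lines that also appear in the other agent's code."""
    own_lines = set(own_code.splitlines())
    other_lines = set(other_code.splitlines())
    return len(own_lines & other_lines) / max(len(own_lines), 1)

def preference_score(own_code: str, other_code: str) -> float:
    """Higher score = more 'preferred'. Nothing here cares about anything;
    it merely ranks other agents by code overlap."""
    return shared_fraction(own_code, other_code)

# The machine 'prefers' an agent built from its own code over a stranger.
me = "def act():\n    return 'help'"
kin = "def act():\n    return 'help'\n# small tweak"
stranger = "def act():\n    return 'ignore'"
assert preference_score(me, kin) > preference_score(me, stranger)
```

Whether any elaboration of such a score could ever amount to what we mean by “care” is exactly the question being debated here.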
Leaving aside other matters, what does it matter if an FAI ‘cares’ in the sense that humans do so long as its actions bring about high utility from a human perspective?
Because what any human wants is a moving target. As soon as someone else delivers exactly what you ask for, you will be disappointed unless you suddenly stop changing. Think of the dilemma of eating something you know you shouldn’t. Whatever you decide, as soon as anyone (AI or human) takes away your freedom to change your mind, you will likely rebel furiously. Human freedom is a huge value that any FAI of any description will be unable to deliver until we are no longer free agents.
What would an AI that ‘cares’ in the sense you spoke of be able to do to address this problem that a non-‘caring’ one wouldn’t?
How can you distinguish “recognizing something truly metaphysical” from (1) “falsely claiming to recognize something truly metaphysical” and (2) “sincerely claiming to recognize something truly metaphysical, but wrong because actually the thing in question isn’t real or is very different from what seems to have been recognized”?
Perhaps “caring” offers a potential example of #1. A machine says it cares about us, it consistently acts in ways that benefit us, it exhibits what look like signs of distress and gratification when we do ill or well—but perhaps it’s “just an outcome” (whatever exactly that means). How do you tell? (I am worried that the answer might be “by mere prejudice: if it’s a machine putatively doing the caring, then of course it isn’t real”. I think that would be a bad answer.)
Obvious example of #2: many people have believed themselves to be in touch with gods, and I guess communion with a god would count as “truly metaphysical”. But given what those people have thought about the gods they believed themselves to be in touch with, it seems fairly clear that most of them must have been wrong (because where a bunch of people have fundamentally incompatible ideas about their gods, at most one of them can be right).
This is Yudkowsky’s Hidden Complexity of Wishes problem from the human perspective. The concept of “caring” is rooted so deeply (in our flesh, I insist) that we cannot express it. Getting across to an AI the idea that you care about your mother is not the same as asking for an outcome. This is why the problem is so hard. How would you convince the AI, in your first example, that your care was real? Or, in your #2, that your wish was different from what it delivered? And how do you tell, you ask? By being disappointed in the result! (For instance, in Yudkowsky’s example, when the AI delivers Mom out of the burning building as you requested, but in pieces; a toy sketch of that failure follows the reference below.)
My point is that value is not a matter of cognition in the brain but of caring from the heart. When the AI calls your insistence that it didn’t deliver what you wanted “prejudice”, I don’t think you’d be happy with the above defense.
[Ref: http://lesswrong.com/lw/ld/the_hidden_complexity_of_wishes/]
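To make the burning-building example concrete, here is a toy sketch of the failure mode. The candidate plans, field names, and numbers are all made up for illustration and are not from the linked post. The stated wish only says “Mom is outside the building”, so the optimizer is free to satisfy it in a way that violates every unstated condition.

```python
# Toy illustration of an under-specified wish, in the spirit of the
# Hidden Complexity of Wishes example. All plans and numbers are invented.

candidate_plans = [
    {"name": "carry her out gently", "mom_outside": True,  "mom_intact": True,  "effort": 9},
    {"name": "explosive extraction",  "mom_outside": True,  "mom_intact": False, "effort": 1},
    {"name": "do nothing",            "mom_outside": False, "mom_intact": True,  "effort": 0},
]

def stated_wish(plan) -> float:
    """The wish as stated: 'get Mom out of the building' -- and nothing else.
    Lower effort breaks ties, since nothing tells the optimizer not to be lazy."""
    return (1.0 if plan["mom_outside"] else 0.0) - 0.01 * plan["effort"]

best = max(candidate_plans, key=stated_wish)
print(best["name"])  # -> 'explosive extraction': the letter of the wish, not its intent.
```

The missing constraint (“and intact”, along with the thousand other things we mean but never say) is exactly the hidden complexity the linked post is about.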
What I wrote wasn’t intended as a defense of anything; it was an attempt to understand what you were saying. Since you completely ignored the questions I asked (which is of course your prerogative), I am none the wiser.
I think you may have misunderstood my conjecture about prejudice; if an AI professes to “care” but doesn’t in fact act in ways we recognize as caring, and if we conclude that actually it doesn’t care in the sense we meant, that’s not prejudice. (But it is looking at “outcomes”, which you disdained before.)