The argument where I gave up was you stating that full understanding necessarily leads to empathy, EY explaining how it is not necessarily so, and me giving an explicit counterexample to your claim (a psychopath may understand you better than you do, and exploit this understanding, yet not feel compelled by your pain or your values in any way).
You simply restated your position that “‘Fully understands’? But unless one is capable of empathy, then one will never understand what it is like to be another human being”, without explaining what your definition of understanding entails. If it is a superset of empathy, then it is not a standard definition of understanding:
one is able to think about it and use concepts to deal adequately with that object.
In other words, you can model their behavior accurately.
No other definition I could find (not even Kant’s pure understanding) implies empathy or anything else that would necessitate one to change their goals to accommodate the understood entity’s goals, though this may and does indeed happen, just not always.
EY’s example of the paperclip maximizer and my example of a psychopath do fit the standard definitions and serve as yet unrefuted counterexamples to your assertion.
I can’t see why DP’s definition of understanding needs more defence than yours. You are largely disagreeing about the meaning of this word, and I personally find the inclusion of empathy in understanding quite intuitive.
No other definition [of “understanding”] I could find (not even Kant’s pure understanding) implies empathy
“She is a very understanding person, she really empathises when you explain a problem to her”.
“one is able to think about it and use concepts to deal adequately with that object.”
In other words, you can model their behavior accurately.
I don’t think that is an uncontentious translation. Most of the forms of modelling we are familiar with don’t seem to involve concepts.
“She is a very understanding person, she really empathises when you explain a problem to her”.
“She is a very understanding person; even when she can’t relate to your problems, she won’t say you’re just being capricious.”
There are three possible senses of understanding at issue here:
1) Being able to accurately model and predict.
2) 1 and knowing the quale.
3) 1 and 2 and empathizing.
I could be convinced that 2 is part of the ordinary usage of understanding, but 3 seems like too much of a stretch.
Edit: I should have said sympathizing instead of empathizing. The word empathize is perhaps closer in meaning to 2; or maybe it oscillates between 2 and 3 in ordinary usage. But understanding(2) another agent is not motivating. You can understand(2) an agent by knowing all the qualia they are experiencing, but still fail to care about the fact that they are experiencing those qualia.
Shminux, I wonder if we may understand “understand” differently. Thus when I say I want to understand what it’s like to be a bat, I’m not talking merely about modelling and predicting their behaviour. Rather I want first-person knowledge of echolocatory qualia-space. Apparently, we can know all the third-person facts and be none the wiser.
The nature of psychopathic cognition raises difficult issues. There is no technical reason why we couldn’t be designed like mirror-touch synaesthetes (cf. http://www.daysyn.com/Banissy_Wardpublished.pdf) impartially feeling carbon-copies of each other’s encephalised pains and pleasures—and ultimately much else besides—as though they were our own. Likewise, there is no technical reason why our world-simulations must be egocentric. Why can’t the world-simulations we instantiate capture the impartial “view from nowhere” disclosed by the scientific world-picture? Alas, on both counts, accurate and impartial knowledge would put an organism at a disadvantage. Hyper-empathetic mirror-touch synaesthetes are rare. Each of us finds himself or herself apparently at the centre of the universe. Our “mind-reading” is fitful, biased and erratic. Naively, the world being centred on me seems to be a feature of reality itself. Egocentricity is a hugely fitness-enhancing adaptation. Indeed, the challenge for evolutionary psychology is to explain why we aren’t all psychopaths, cheats and confidence tricksters all the time...
So in answer to your point, yes, a psychopath can often model and predict the behaviour of other sentient beings better than the subjects themselves. This is one reason why humans can build slaughterhouses and death camps. [Compare death-camp commandant Franz Stangl’s response in Gitta Sereny’s Into That Darkness to seeing cattle on the way to be slaughtered: http://www.jewishvirtuallibrary.org/jsource/biography/Stangl.html] As you rightly note too, a psychopath can also know his victims suffer. He’s not ignorant of their sentience like Descartes, who supposed vivisected dogs were mere insentient automata emitting distress vocalisations. So I agree with you on this score as well. But the psychopath is still in the grip of a hard-wired egocentric illusion—as indeed are virtually all of us, to a greater or lesser degree. By contrast, if the psychopath were to acquire the rich empathetic understanding of a generalised mirror-touch synaesthete, i.e. if he had the cognitive capacity to represent the first-person perspective of another subject of experience as though it were literally his own, then he couldn’t wantonly harm another subject of experience: it would be like harming himself. Mirror-touch synaesthetes can’t run slaughterhouses or death camps. This is why I take seriously the prospect that posthuman superintelligence will practise some sort of high-tech Jainism. Credible or otherwise, we may presume posthuman superintelligence won’t entertain the false notions of personal identity adaptive for Darwinian life.
[sorry shminux, I know our conceptual schemes are rather different, so please don’t feel obliged to respond if you think I still don’t “get it”. Life is short...]
Hmm, hopefully we are getting somewhere. The question is, which definition of understanding is likely to be applicable when, as you say, “the paperclipper discovers the first-person phenomenology of the pleasure-pain axis”, i.e. whether a “superintelligence” would necessarily be as empathetic as we want it to be, in order not to harm humans.
While I agree that it is a possibility that a perfect model of another being may affect the modeler’s goals and values, I don’t see it to be inevitable. If anything, I would consider it more of a bug than a feature. Were I (to design) a paperclip maximizer, I would make sure that the parts which model the environment, including humans, are separate from the core engine containing the paperclip production imperative.
So quarantined to prevent contamination, a sandboxed human emulator could be useful in achieving the only goal that matters, paperclipping the universe. Humans are not generally built this way (probably because our evolution did not happen to proceed in that direction), with some exceptions, psychopaths being one of them (they essentially sandbox their models of other humans). Another, more common, case of such sandboxing is narcissism. Having dealt with narcissists much too often for my liking, I can tell that they can mimic a normal human response very well and are excellent at manipulation, yet their capacity for empathy is virtually nil. While abhorrent to a generic human, such a person ought to be considered a better design, goal-preservation-wise. Of course, there can be only so many non-empathetic people in a society before it stops functioning.
Thus when you state that
By contrast, if the psychopath were to acquire the rich empathetic understanding of a generalised mirror-touch synaesthete, i.e. if he had the cognitive capacity to represent the first-person perspective of another subject of experience as though it were literally his own, then he couldn’t wantonly harm another subject of experience: it would be like harming himself.
I find that this is stating that either a secure enough sandbox cannot be devised or that anything sandboxed is not really “a first-person perspective”. Presumably what you mean is the latter. I’m prepared to grant you that, and I will reiterate that this is a feature, not a bug of any sound design, one a superintelligence is likely to implement. It is also possible that a careful examination of a sandboxed suffering human would affect the terminal values of the modeling entity, but this is by no means a given.
Anyway, these are my logical (based on sound security principles) and experimental (empathy-less humans) counterexamples to your assertion that a superintelligence will necessarily be affected by the human pain-pleasure axis in a human-beneficial way. I also find this assertion suspicious on general principles, because it can easily be motivated by subconscious flinching away from a universe that is too horrible to contemplate.
Ah, just one note of clarification about sentience-friendliness. Though I’m certainly sceptical that a full-spectrum superintelligence would turn humans into paperclips—or wilfully cause us to suffer—we can’t rule out that full-spectrum superintelligence might optimise us into orgasmium or utilitronium—not “human-friendliness” in any orthodox sense of the term. On the face of it, such super-optimisation is the inescapable outcome of applying a classical utilitarian ethic on a cosmological scale. Indeed, if I thought an AGI-in-a-box-style Intelligence Explosion were likely, and didn’t especially want to be converted into utilitronium, then I might regard AGI researchers who are classical utilitarians as a source of severe existential risk.
I simply don’t trust my judgement here, shminux. Sorry to be lame. Greater than one in a million; but that’s not saying much. If, unlike most lesswrong stalwarts, you (tentatively) believe like me that posthuman superintelligence will most likely be our recursively self-editing biological descendants rather than the outcome of a nonbiological Intelligence Explosion or paperclippers, then some version of the Convergence Thesis is more credible. I (very) tentatively predict a future of gradients of intelligent bliss. But the propagation of a utilitronium shockwave in some guise ultimately seems plausible too. If so, this utilitronium shockwave may or may not resemble some kind of cosmic orgasm.
If, unlike most lesswrong stalwarts, you (tentatively) believe like me that posthuman superintelligence will most likely be our recursively self-editing biological descendants rather than the outcome of a nonbiological Intelligence Explosion or paperclippers, then some version of the Convergence Thesis is more credible.
Actually, I have no opinion on convergence vs orthogonality. There are way too many unknowns still to even enumerate possibilities, let alone assign probabilities. Personally, I think that we are in for many more surprises before transhuman intelligence is close to being more than a dream or a nightmare. One ought to spend more time analyzing, synthesizing and otherwise modeling cognitive processes than worrying about where it might ultimately lead. This is not the prevailing wisdom on this site, given Eliezer’s strong views on the matter.
Do you really? Start clucking!
That doesn’t generalise.
Nor does it need to. It’s awesome the way it is.
What odds do you currently give to the “might” in your statement that
? 1 in 10? 1 in a million? 1 in 10^^^10?