...microelectrodes implanted in the reward and punishment centres, behavioural conditioning and ideological indoctrination—and perhaps the promise of 72 virgins in the afterlife for the faithful paperclipper. The result: a fanatical paperclip fetishist!
Have to point out here that the above is emphatically not what Eliezer talks about when he says “maximise paperclips”. Your examples above contain in themselves the actual, more intrinsic values to which paperclips would be merely instrumental: feelings in your reward and punishment centres, virgins in the afterlife, and so on. You can re-wire the electrodes, or change the promise of what happens in the afterlife, and watch as the paperclip preference fades away.
What Eliezer is talking about is a being for whom “pleasure” and “pain” are not concepts. Paperclips ARE the reward. Lack of paperclips IS the punishment. Even if pleasure and pain are concepts, they are merely instrumental to obtaining more paperclips. Pleasure would be good because it results in paperclips, not vice versa. If you reverse the electrodes so that they stimulate the pain centre when they find paperclips, and the pleasure centre when there are no paperclips, this being would start to instrumentally value pain more than pleasure, because that’s what results in more paperclips.
It’s a concept that’s much more alien to our own minds than what you are imagining, and anthropomorphising it is rather more difficult!
Indeed, you touch upon this yourself:
“But unless I’m ontologically special (which I very much doubt!) the pain-pleasure axis discloses the world’s inbuilt metric of (dis)value, and it’s a prerequisite of finding anything (dis)valuable at all.”
Can you explain why pleasure is a more natural value than paperclips?
Pleasure would be good because it results in paperclips, not vice versa. If you reverse the electrodes so that they stimulate the pain centre when they find paperclips, and the pleasure centre when there are no paperclips, this being would start to instrumentally value pain more than pleasure, because that’s what results in more paperclips.
Minor correction: The mere post-factual correlation of pain to paperclips does not imply that more paperclips can be produced by causing more pain. You’re talking about the scenario where each 1,000,000 screams produces 1 paperclip, in which case obviously pain has some value.
Sarokrae, first, as I’ve understood Eliezer, he’s talking about a full-spectrum superintelligence, i.e. a superintelligence which understands not merely the physical processes of nociception etc, but the nature of first-person states of organic sentients. So the superintelligence is endowed with a pleasure-pain axis, at least in one of its modules. But are we imagining that the superintelligence has some sort of orthogonal axis of reward - the paperclippiness axis? What is the relationship between these dual axes? Can one grasp what it’s like to be in unbearable agony and instead find it more “rewarding” to add another paperclip? Whether one is a superintelligence or a mouse, one can’t directly access mind-independent paperclips, merely one’s representations of paperclips. But what does it mean to say one’s representation of a paperclip could be intrinsically “rewarding” in the absence of hedonic tone? [I promise I’m not trying to score some empty definitional victory, whatever that might mean; I’m just really struggling here...]
Sarokrae, first, as I’ve understood Eliezer, he’s talking about a full-spectrum superintelligence, i.e. a superintelligence which understands not merely the physical processes of nociception etc, but the nature of first-person states of organic sentients. So the superintelligence is endowed with a pleasure-pain axis, at least in one of its modules.
What Eliezer is talking about (a superintelligence paperclip maximiser) does not have a pleasure-pain axis. It would be capable of comprehending and fully emulating a creature with such an axis if doing so had a high expected value in paperclips but it does not have such a module as part of itself.
But are we imagining that the superintelligence has some sort of orthogonal axis of reward—the paperclippiness axis? What is the relationship between these dual axes?
One of them it has (the one about paperclips). One of them it could, in principle, imagine (the thing with ‘pain’ and ‘pleasure’).
Can one grasp what it’s like to be in unbearable agony and instead find it more “rewarding” to add another paperclip?
Yes. (I’m not trying to be trite here. That’s the actual answer. Yes. Paperclip maximisers really maximise paperclips and really don’t care about anything else. This isn’t because they lack comprehension.)
Whether one is a superintelligence or a mouse, one can’t directly access mind-independent paperclips, merely one’s representations of paperclips. But what does it mean to say one’s representation of a paperclip could be intrinsically “rewarding” in the absence of hedonic tone?
Roughly speaking it means “It’s going to do things that maximise paperclips and in some way evaluates possible universes with more paperclips as superior to possible universes with fewer paperclips. Translating this into human words we call this ‘rewarding’ even though that is inaccurate anthropomorphising.”
(If I understand you correctly your position would be that the agent described above is nonsensical.)
It would be capable of comprehending and fully emulating a creature with such an axis if doing so had a high expected value in paperclips but it does not have such a module as part of itself.
It’s not at all clear that you could bootstrap an understanding of pain qualia just by observing the behaviour of entities in pain (even if they were internally emulated). It is also not clear that you resolve issues of empathy/qualia just by throwing intelligence at it.
Wedrifid, thanks for the exposition / interpretation of Eliezer. Yes, you’re right in guessing I’m struggling a bit. In order to understand the world, one needs to grasp both its third-person properties [the Standard Model / M-Theory] and its first-person properties [qualia, phenomenal experience] - and also one day, I hope, grasp how to “read off” the latter from the mathematical formalism of the former.
If you allow such a minimal criterion of (super)intelligence, then how well does a paperclipper fare? You remark how “it could, in principle, imagine (the thing with ‘pain’ and ‘pleasure’).” What is the force of “could” here? If the paperclipper doesn’t yet grasp the nature of agony or sublime bliss, then it is ignorant of their nature. By analogy, if I were building a perpetual motion machine but allegedly “could” grasp the second law of thermodynamics, the modal verb is doing an awful lot of work. Surely, if I grasped the second law of thermodynamics, then I’d stop. Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too. The paperclipper simply hasn’t understood the nature of what it was doing. Is the qualia-naive paperclipper really superintelligent, or just polymorphic malware?
Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too.
An interesting hypothetical. My first thought is to ask why would a paperclipper care about pain? Pain does not reduce the number of paperclips in existence. Why would a paperclipper care about pain?
My second thought is that pain is not just a quale; pain is a signal from the nervous system, indicating damage to part of the body. (The signal can be spoofed). Hence, pain could be avoided because it leads to a reduced ability to reach one’s goals; a paperclipper that gets dropped in acid may become unable to create more paperclips in the future, if it does not leave now. So the future worth of all those potential paperclips results in the paperclipper pursuing a self-preservation strategy—possibly even at the expense of a small number of paperclips in the present.
But not at the cost of a sufficiently large number of paperclips. If the cost in paperclips is high enough (more than the paperclipper could reasonably expect to create throughout the rest of its existence), a perfect paperclipper would let itself take the damage, let itself be destroyed, because that is the action which results in the greatest expected number of paperclips in the future. It would become a martyr for paperclips.
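To make the self-preservation versus martyrdom trade-off above concrete, here is a minimal sketch. It assumes the paperclipper simply picks whichever action has the greatest expected number of paperclips; the `Action` class, the `choose` function and all the numbers are hypothetical illustrations, not anything specified in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    clips_now: float             # paperclips gained immediately (hypothetical figure)
    survival_probability: float  # chance the paperclipper survives this action
    future_clips: float          # paperclips it expects to make later, if it survives

    def expected_clips(self) -> float:
        # Future paperclips only count if the paperclipper is still around to make them.
        return self.clips_now + self.survival_probability * self.future_clips


def choose(actions):
    """Return the action with the greatest expected number of paperclips."""
    return max(actions, key=lambda a: a.expected_clips())


# Leaving the acid bath forfeits whatever could be gained by staying, but keeps the future.
escape = Action("escape", clips_now=0, survival_probability=1.0, future_clips=1_000)
# Staying yields paperclips now but destroys the paperclipper.
stay_small = Action("stay", clips_now=10, survival_probability=0.0, future_clips=1_000)
stay_large = Action("stay", clips_now=5_000, survival_probability=0.0, future_clips=1_000)

print(choose([escape, stay_small]).name)  # escape: a few paperclips are not worth dying for
print(choose([escape, stay_large]).name)  # stay: the martyr-for-paperclips case
```

Nothing resembling pain appears anywhere in the calculation; damage matters only through `survival_probability`.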
Even a paperclipper cannot be indifferent to the experience of agony. Just as organic sentients can co-instantiate phenomenal sights and sounds, a superintelligent paperclipper could presumably co-instantiate a pain-pleasure axis and (un)clippiness qualia space—two alternative and incommensurable (?) metrics of value, if I’ve interpreted Eliezer correctly. But I’m not at all confident I know what I’m talking about here. My best guess is still that the natural world has a single metric of phenomenal (dis)value, and the hedonic range of organic sentients discloses a narrow part of it.
Even a paperclipper cannot be indifferent to the experience of agony.
Are you talking about agony as an error signal, or are you talking about agony as a quale? I begin to suspect that you may mean the second. If so, then the paperclipper can easily be indifferent to agony; but it probably can’t understand how humans can be indifferent to a lack of paperclips.
There’s no evidence that I’ve ever seen to suggest that qualia are the same even for different people; on the contrary, there is some evidence which strongly suggests that qualia among humans are different. (For example; my qualia for Red and Green are substantially different. Yet red/green colourblindness is not uncommon; a red/green colourblind person must have at minimum either a different red quale, or a different green quale, to me). Given that, why should we assume that the quale of agony is the same for all humanity? And if it’s not even constant among humanity, I see no reason why a paperclipper’s agony quale should be even remotely similar to yours and mine.
And given that, why shouldn’t a paperclipper be indifferent to that quale?
Are you talking about agony as an error signal, or are you talking about agony as a quale? I begin to suspect that you may mean the second. If so, then the paperclipper can easily be indifferent to agony; but it probably can’t understand how humans can be indifferent to a lack of paperclips.
A paperclip maximiser would (in the overwhelming majority of cases) have no such problem understanding indifference to paperclips. A tendency to anthropomorphise is a quirk of human nature. Assuming that paperclip maximisers have an analogous temptation (to clipropomorphise) is itself just anthropomorphising.
CCC, agony as a quale. Phenomenal pain and nociception are doubly dissociable. Tragically, people with neuropathic pain can suffer intensely without the agony playing any information-signalling role. Either way, I’m not clear it’s intelligible to speak of understanding the first-person phenomenology of extreme distress while being indifferent to the experience: for being disturbing is intrinsic to the experience itself. And if we are talking about a supposedly superintelligent paperclipper, shouldn’t Clippy know exactly why humans aren’t troubled by the clippiness-deficit?
If (un)clippiness is real, can humans ever understand (un)clippiness? By analogy, if organic sentients want to understand what it’s like to be a bat, and not merely decipher the third-person mechanics of echolocation, then I guess we’ll need to add a neural module to our CNS with the right connectivity and neurons supporting chiropteran gene-expression profiles, as well as peripheral transducers (etc). Humans can’t currently imagine bat qualia; but bat qualia, we may assume from the neurological evidence, are infused with hedonic tone. Understanding clippiness is more of a challenge. I’m unclear what kind of neurocomputational architecture could support clippiness. Also, whether clippiness could be integrated into the unitary mind of an organic sentient depends on how you think biological minds solve the phenomenal binding problem. But let’s suppose binding can be done. So here we have orthogonal axes of (dis)value. On what basis does the dual-axis subject choose between them? Sublime bliss and pure clippiness are both, allegedly, self-intimatingly valuable. OK, I’m floundering here...
People with different qualia? Yes, I agree CCC. I don’t think this difference challenges the principle of the uniformity of nature. Biochemical individuality makes variation in qualia inevitable. The existence of monozygotic twins with different qualia would be a more surprising phenomenon, though even such “identical” twins manifest all sorts of epigenetic differences. Despite this diversity, there’s no evidence to my knowledge of anyone who finds activation by full mu agonists of the mu opioid receptors in our twin hedonic hotspots anything other than exceedingly enjoyable. As they say, “Don’t try heroin. It’s too good.”
Either way, I’m not clear it’s intelligible to speak of understanding the first-person phenomenology of extreme distress while being indifferent to the experience: for being disturbing is intrinsic to the experience itself.
There exist people who actually express a preference for being disturbed in a mild way (e.g. by watching horror movies). There also exist rarer people who seek out pain, for whatever reason. It seems to me that such people must have a different quale for pain than you do.
Personally, I don’t think that I can reasonably say that I find pain disturbing, as such. Yes, it is often inflicted in circumstances which are disturbing for other reasons; but if, for example, I go to a blood donation clinic, then the brief pain of the needle being inserted is not at all disturbing; though it does trigger my pain quale. So this suggests that my pain quale is already not the same as your pain quale.
There’s a lot of similarity; pain is a quale that I would (all else being equal) try to avoid; but that I will choose to experience should there be a good enough reason (e.g. the aforementioned blood donation clinic). I would not want to purposefully introduce someone else to it (again, unless there was a good enough reason; even then, I would try to minimise the pain while not compromising the good enough reason); but despite this similarity, I do think that there may be minor differences. (It’s also possible that we have slightly different definitions of the word ‘disturbing’).
If (un)clippiness is real, can humans ever understand (un)clippiness? By analogy, if organic sentients want to understand what it’s like to be a bat—and not merely decipher the third-person mechanics of echolocation—then I guess we’ll need to add a neural module to our CNS with the right connectivity and neurons supporting chiropteran gene-expression profiles, as well as peripheral transducers (etc).
But would such a modified human know what it’s like to be an unmodified human? If I were to guess what echolocation looks like to a bat, I’d guess a false-colour image with colours corresponding to textures instead of to wavelengths of light… though that’s just a guess.
Understanding clippiness is more of a challenge. I’m unclear what kind of neurocomputational architecture could support clippiness. Also, whether clippiness could be integrated into the unitary mind of an organic sentient depends on how you think biological minds solve the phenomenal binding problem. But let’s suppose binding can be done. So here we have orthogonal axes of (dis)value. On what basis does the dual-axis subject choose between them? Sublime bliss and pure clippiness are both, allegedly, self-intimatingly valuable. OK, I’m floundering here...
What is the phenomenal binding problem? (Wikipedia gives at least two different definitions for that phrase). I think I may be floundering even more than you are.
I’m not sure that Clippy would even have a pleasure-pain axis in the way that you’re imagining. You seem to be imagining that any being with such an axis must value pleasure—yet if pleasure doesn’t result in more paperclips being made, then why should Clippy value pleasure? Or perhaps the disutility of unclippiness simply overwhelms any possible utility of pleasure...
The existence of monozygotic twins with different qualia would be a more surprising phenomenon, though even such “identical” twins manifest all sorts of epigenetic differences.
According to a bit of googling, among the monozygotic Dionne quintuplets, two out of the five were colourblind; suggesting that they did not have the same qualia for certain colours as each other. (Apparently it may be linked to X-chromosome activation).
CCC, you’re absolutely right to highlight the diversity of human experience. But this diversity doesn’t mean there aren’t qualia universals. Thus there isn’t an unusual class of people who relish being waterboarded. No one enjoys uncontrollable panic. And the seemingly anomalous existence of masochists who enjoy what you or I would find painful stimuli doesn’t undercut the sovereignty of the pleasure-pain axis but underscores its pivotal role: painful stimuli administered in certain ritualised contexts can trigger the release of endogenous opioids that are intensely rewarding. Co-administer an opioid antagonist and the masochist won’t find masochism fun.
Apologies if I wasn’t clear in my example above. I wasn’t imagining that pure paperclippiness was pleasurable, but rather what would be the effects of grafting together two hypothetical orthogonal axes of (dis)value in the same unitary subject of experience, as we might graft on another sensory module to our CNS. After all, the deliverances of our senses are normally cross-modally matched within our world-simulations. However, I’m not at all sure that I’ve got any kind of conceptual handle on what “clippiness” might be. So I don’t know if the thought-experiment works. If such hybridisation were feasible, would hypothetical access to the nature of (un)clippiness transform our conception of the world relative to unmodified humans, so that we’d lose all sense of what it means to be a traditional human? Yes, for sure. But if, in the interests of science, one takes, say, a powerful narcotic euphoriant and enjoys sublime bliss simultaneously with pure clippiness, then presumably one still retains access to the engine of phenomenal value characteristic of archaic human minds.
The phenomenal binding problem? The best treatment IMO is still Revonsuo:
http://cdn.preterhuman.net/texts/body_and_health/Neurology/Binding.pdf
No one knows how the mind/brain solves the phenomenal binding problem and generates unitary experiential objects and the fleeting synchronic unity of the self. But the answer one gives may shape everything from whether one thinks a classical digital computer will ever be nontrivially conscious to the prospects of mind uploading and the nature of full-spectrum superintelligence. (cf. http://www.biointelligence-explosion.com/parable.html for my own idiosyncratic views on such topics.)
CCC, you’re absolutely right to highlight the diversity of human experience. But this diversity doesn’t mean there aren’t qualia universals.
It doesn’t mean that there aren’t, but it also doesn’t mean that there are. It does mean that there are qualia that aren’t universal, which implies the possibility that there may be no universals; but, you are correct, it does not prove that possibility.
There may well be qualia universals. If I had to guess, I’d say that I don’t think there are, but I could be wrong.
Thus there isn’t an unusual class of people who relish being waterboarded. No one enjoys uncontrollable panic.
That doesn’t mean that everyone’s uncontrolled-panic qualia are all the same, it just means that everyone’s uncontrolled-panic qualia are all unwelcome. If given a sadistic choice between waterboarding and uncontrolled panic, in full knowledge of what the result will feel like, and all else being equal, some people may choose the panic while others may prefer the waterboarding.
Apologies if I wasn’t clear in my example above. I wasn’t imagining that pure paperclippiness was pleasurable, but rather what would be the effects of grafting together two hypothetical orthogonal axes of (dis)value in the same unitary subject of experience
If you feel that you have to explain that, then I conclude that I wasn’t clear in my response to your example. I was questioning the scaling of the axes in Clippy’s utility function; if Clippy values paperclipping a million times more strongly than it values pleasure, then the pleasure/pain axis is unlikely to affect Clippy’s behaviour much, if at all.
However, I’m not at all sure that I’ve got any kind of conceptual handle on what “clippiness” might be. So I don’t know if the thought-experiment works.
I think it works as a thought-experiment, as long as one keeps in mind that the hybridised result is no longer a pure paperclipper.
Consider the hypothetical situation that Hybrid-Clippy finds that it derives pleasure from painting; an activity neutral on the paperclippiness scale. Consider further the possibility that making paperclips is neutral on the pleasure-pain scale. In such a case, Hybrid-Clippy may choose to either paint or make paperclips; depending on which scale it values more.
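The scaling point can be sketched in a few lines. This assumes, purely for illustration, that Hybrid-Clippy combines its two axes as a weighted sum, which presupposes the axes are commensurable (the very thing questioned earlier in the thread). The function names, scores and weights below are all hypothetical.

```python
# Each activity is scored on two axes: (clippiness, hedonic tone).
# Painting is pleasant but clip-neutral; making paperclips is the reverse.
ACTIVITIES = {
    "paint": (0.0, 1.0),
    "make_paperclips": (1.0, 0.0),
}


def hybrid_utility(clippiness, hedonic, clip_weight, hedonic_weight):
    """Weighted sum of the two axes (an assumed way of combining them)."""
    return clip_weight * clippiness + hedonic_weight * hedonic


def choose_activity(clip_weight, hedonic_weight):
    """Pick the activity with the highest combined utility under the given weights."""
    return max(
        ACTIVITIES,
        key=lambda name: hybrid_utility(*ACTIVITIES[name], clip_weight, hedonic_weight),
    )


# Clippiness valued a million times more strongly than pleasure: pleasure barely matters.
print(choose_activity(clip_weight=1_000_000, hedonic_weight=1))  # make_paperclips
# Pleasure weighted more heavily than clippiness: Hybrid-Clippy paints instead.
print(choose_activity(clip_weight=1, hedonic_weight=2))          # paint
```

With `hedonic_weight=0` this collapses back to a pure paperclipper, whatever it may or may not feel.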
So—the question is basically how the mind attaches input from different senses to a single conceptual object?
I can’t tell you how the mechanism works, but I can tell you that the mechanism can be spoofed. That’s what a ventriloquist does, after all. And a human can watch a film on TV, yet have the sound come out of a set of speakers on the other end of the room, and still bind the sound of an actor’s voice with that same actor on the screen.
Studying in what ways the binding mechanism can be spoofed would, I expect, produce an algorithm that roughly describes how the mechanism works. Of course, if it’s still a massive big problem after being looked at so thoroughly, then I expect that I’m probably missing some subtlety here...
The force is that all this talk about understanding ‘the pain/pleasure’ axis would be a complete waste of time for a paperclip maximiser. In most situations it would be more efficient not to bother with it at all and spend its optimisation efforts on making more efficient relativistic rockets so as to claim more of the future light cone for paperclip manufacture.
It would require motivation for the paperclip maximiser to expend computational resources understanding the arbitrary quirks of DNA based creatures. For example some contrived game of Omega’s which rewards arbitrary things with paperclips. Or if it found itself emerging on a human inhabited world, making being able to understand humans a short term instrumental goal for the purpose of more efficiently exterminating the threat.
By analogy, if I were building a perpetual motion machine but allegedly “could” grasp the second law of thermodynamics, the modal verb is doing an awful lot of work.
Terrible analogy. Not understanding “pain and pleasure” is in no way similar to believing it can create a perpetual motion machine. Better analogy: An Engineer designing microchips allegedly ‘could’ grasp analytic cubism. If she had some motivation to do so. It would be a distraction from her primary interests but if someone paid her then maybe she would bother.
Surely, if I grasped the second law of thermodynamics, then I’d stop. Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too.
Now “if” is doing a lot of work. If the paperclipper was fundamentally different from a paperclipper and was actually similar to a human or DNA based relative capable of experiencing ‘agony’ and assuming agony was just as debilitating to the paperclipper as to a typical human… then sure all sorts of weird stuff follows.
The paperclipper simply hasn’t understood the nature of what it was doing.
Is the qualia-naive paperclipper really superintelligent—or just polymorphic malware?
To the extent that you believed that such polymorphic malware is theoretically possible and made up most possible minds, it would be possible for your model to be used to accurately describe all possible agents; it would just mean systematically using different words. Unfortunately I don’t think you are quite at that level.
Wedrifid, granted, a paperclip-maximiser might be unmotivated to understand the pleasure-pain axis and the qualia-spaces of organic sentients. Likewise, we can understand how a junkie may not be motivated to understand anything unrelated to securing his supply of heroin, and how a wireheader may not be interested in anything beyond wireheading. But superintelligent? Insofar as the paperclipper (or the junkie) is ignorant of the properties of alien qualia-spaces, then it/he is ignorant of a fundamental feature of the natural world, hence not superintelligent in any sense I can recognise, and arguably not even stupid. For sure, if we’re hypothesising the existence of a clippiness/unclippiness qualia-space unrelated to the pleasure-pain axis, then organic sentients are partially ignorant too. Yet the remedy for our hypothetical ignorance is presumably to add a module supporting clippiness, just as we might add a CNS module supporting echolocatory experience to understand bat-like sentience, enriching our knowledge rather than shedding it.
But superintelligent? Insofar as the paperclipper—or the junkie—is ignorant of the properties of alien qualia-spaces, then it/he is ignorant of a fundamental feature of the natural world—hence not superintelligent in any sense I can recognise, and arguably not even stupid.
What does (super-)intelligence have to do with knowing things that are irrelevant to one’s values?
What Eliezer is talking about (a superintelligence paperclip maximiser) does not have a pleasure-pain axis.
Why does that matter for the argument?
As long as Clippy is in fact optimizing paperclips, what does it matter what/if he feels while he does it?
Pearce seems to be making a claim that Clippy can’t predict creatures with pain/pleasure if he doesn’t feel them himself.
Maybe Clippy needs pleasure/pain to be able to predict creatures with pleasure/pain. I doubt it, but fine, grant the point. He can still be a paper clip maximizer regardless.
It’s not at all clear that you could bootstrap an understanding of pain qualia just by observing the behaviour of entities in pain (even if they were internally emulated). It is also not clear that you resolve issues of empathy/qualia just by throwing intelligence at it.
I disagree with you about what is clear.
If you think something relevant is clear, then please state it clearly.
Wedrifid, thanks for the exposition / interpretation of Eliezer. Yes, you’re right in guessing I’m struggling a bit. In order to understand the world, one needs to grasp both its third person-properties [the Standard Model / M-Theory] and its first-person properties [qualia, phenomenal experience] - and also one day, I hope, grasp how to “read off ” the latter from the mathematical formalism of the former.
If you allow such a minimal criterion of (super)intelligence, then how well does a paperclipper fare? You remark how “it could, in principle, imagine (the thing with ‘pain’ and ‘pleasure’).” What is the force of “could” here? If the paperclipper doesn’t yet grasp the nature of agony or sublime bliss, then it is ignorant of their nature. By analogy, if I were building a perpetual motion machine but allegedly “could” grasp the second law of thermodynamics, the modal verb is doing an awful lot of work. Surely, If I grasped the second law of thermodynamics, then I’d stop. Likewise, if the paperclipper were to be consumed by unbearable agony, it would stop too. The paperclipper simply hasn’t understood the nature of what was doing. Is the qualia-naive paperclipper really superintelligent—or just polymorphic malware?
An interesting hypothetical. My first thought is to ask why would a paperclipper care about pain? Pain does not reduce the number of paperclips in existence. Why would a paperclipper care about pain?
My second thought is that pain is not just a quale; pain is a signal from the nervous system, indicating damage to part of the body. (The signal can be spoofed). Hence, pain could be avoided because it leads to a reduced ability to reach one’s goals; a paperclipper that gets dropped in acid may become unable to create more paperclips in the future, if it does not leave now. So the future worth of all those potential paperclips results in the paperclipper pursuing a self-preservation strategy—possibly even at the expense of a small number of paperclips in the present.
But not at the cost of a sufficiently large number of paperclips. If the cost in paperclips is high enough (more than the paperclipper could reasonably expect to create throughout the rest of its existence), a perfect paperclipper would let itself take the damage, let itself be destroyed, because that is the action which results in the greatest expected number of paperclips in the future. It would become a martyr for paperclips.
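The trade-off in the last two paragraphs can be written as a plain expected-value comparison. What follows is only a minimal sketch, assuming a hypothetical agent whose sole value is the expected number of paperclips; the function name, its parameters and the numbers are invented for illustration and are not anyone's actual model of a paperclipper.

```python
def choose_action(clips_lost_by_fleeing: float,
                  expected_future_clips: float) -> str:
    """Pick the action a pure paperclip maximiser would take when threatened.

    clips_lost_by_fleeing: paperclips forgone by abandoning the current task
        in order to survive (hypothetical quantity).
    expected_future_clips: paperclips the agent expects to make over the rest
        of its existence if it survives (hypothetical quantity).
    """
    # Self-preservation is only instrumental: it is worth exactly the
    # paperclips the surviving agent is expected to produce.
    if expected_future_clips > clips_lost_by_fleeing:
        return "flee"
    # Otherwise staying (and being destroyed) yields more paperclips overall.
    return "stay"


# Small cost now, large future output: the agent preserves itself.
small_cost = choose_action(clips_lost_by_fleeing=10, expected_future_clips=1_000)

# Cost exceeds everything it could ever make afterwards: it becomes a martyr.
huge_cost = choose_action(clips_lost_by_fleeing=1_000_000, expected_future_clips=1_000)
```

Note that pain, as a quale, appears nowhere in the comparison; only its role as a damage signal affects the estimate of future paperclips.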
Even a paperclipper cannot be indifferent to the experience of agony. Just as organic sentients can co-instantiate phenomenal sights and sounds, a superintelligent paperclipper could presumably co-instantiate a pain-pleasure axis and (un)clippiness qualia space—two alternative and incommensurable (?) metrics of value, if I’ve interpreted Eliezer correctly. But I’m not at all confident I know what I’m talking about here. My best guess is still that the natural world has a single metric of phenomenal (dis)value, and the hedonic range of organic sentients discloses a narrow part of it.
Are you talking about agony as an error signal, or are you talking about agony as a quale? I begin to suspect that you may mean the second. If so, then the paperclipper can easily be indifferent to agony; but it probably can’t understand how humans can be indifferent to a lack of paperclips.
There’s no evidence that I’ve ever seen to suggest that qualia are the same even for different people; on the contrary, there is some evidence which strongly suggests that qualia among humans are different. (For example, my qualia for Red and Green are substantially different. Yet red/green colourblindness is not uncommon; a red/green colourblind person must have at minimum either a different red quale, or a different green quale, from mine.) Given that, why should we assume that the quale of agony is the same for all humanity? And if it’s not even constant among humanity, I see no reason why a paperclipper’s agony quale should be even remotely similar to yours and mine.
And given that, why shouldn’t a paperclipper be indifferent to that quale?
A paperclip maximiser would (in the overwhelming majority of cases) have no such problem understanding indifference to paperclips. A tendency to anthropomorphise is a quirk of human nature. Assuming that paperclip maximisers have an analogous temptation (to clipropomorphise) is itself just anthropomorphising.
I take your point. Though Clippy may clipropomorphise, there is no reason to assume that it will.
...is there any way to retract just a part of a previous post?
There is an edit button. But I wouldn’t say your comment is significantly weakened by this tangential technical detail (I upvoted it as is).
Yes, but is there any way to leave the text there, but stricken through?
People have managed it with unicode characters. I think there is even a tool for it on the web somewhere.
Got it, thanks.
CCC, agony as a quale. Phenomenal pain and nociception are doubly dissociable. Tragically, people with neuropathic pain can suffer intensely without the agony playing any information-signalling role. Either way, I’m not clear it’s intelligible to speak of understanding the first-person phenomenology of extreme distress while being indifferent to the experience: for being disturbing is intrinsic to the experience itself. And if we are talking about a supposedly superintelligent paperclipper, shouldn’t Clippy know exactly why humans aren’t troubled by the clippiness-deficit?
If (un)clippiness is real, can humans ever understand (un)clippiness? By analogy, if organic sentients want to understand what it’s like to be a bat—and not merely decipher the third-person mechanics of echolocation—then I guess we’ll need to add a neural module to our CNS with the right connectivity and neurons supporting chiropteran gene-expression profiles, as well as peripheral transducers (etc). Humans can’t currently imagine bat qualia; but bat qualia, we may assume from the neurological evidence, are infused with hedonic tone. Understanding clippiness is more of a challenge. I’m unclear what kind of neurocomputational architecture could support clippiness. Also, whether clippiness could be integrated into the unitary mind of an organic sentient depends on how you think biological minds solve the phenomenal binding problem. But let’s suppose binding can be done. So here we have orthogonal axes of (dis)value. On what basis does the dual-axis subject choose between them? Sublime bliss and pure clippiness are both, allegedly, self-intimatingly valuable. OK, I’m floundering here...
People with different qualia? Yes, I agree CCC. I don’t think this difference challenges the principle of the uniformity of nature. Biochemical individuality makes variation in qualia inevitable. The existence of monozygotic twins with different qualia would be a more surprising phenomenon, though even such “identical” twins manifest all sorts of epigenetic differences. Despite this diversity, there’s no evidence to my knowledge of anyone who finds activation by full mu agonists of the mu opioid receptors in our twin hedonic hotspots anything other than exceedingly enjoyable. As they say, “Don’t try heroin. It’s too good.”
There exist people who actually express a preference for being disturbed in a mild way (e.g. by watching horror movies). There also exist rarer people who seek out pain, for whatever reason. It seems to me that such people must have a different quale for pain than you do.
Personally, I don’t think that I can reasonably say that I find pain disturbing, as such. Yes, it is often inflicted in circumstances which are disturbing for other reasons; but if, for example, I go to a blood donation clinic, then the brief pain of the needle being inserted is not at all disturbing; though it does trigger my pain quale. So this suggests that my pain quale is already not the same as your pain quale.
There’s a lot of similarity; pain is a quale that I would (all else being equal) try to avoid; but that I will choose to experience should there be a good enough reason (e.g. the aforementioned blood donation clinic). I would not want to purposefully introduce someone else to it (again, unless there was a good enough reason; even then, I would try to minimise the pain while not compromising the good enough reason); but despite this similarity, I do think that there may be minor differences. (It’s also possible that we have slightly different definitions of the word ‘disturbing’).
But would such a modified human know what it’s like to be an unmodified human? If I were to guess what echolocation looks like to a bat, I’d guess a false-colour image with colours corresponding to textures instead of to wavelengths of light… though that’s just a guess.
What is the phenomenal binding problem? (Wikipedia gives at least two different definitions for that phrase). I think I may be floundering even more than you are.
I’m not sure that Clippy would even have a pleasure-pain axis in the way that you’re imagining. You seem to be imagining that any being with such an axis must value pleasure—yet if pleasure doesn’t result in more paperclips being made, then why should Clippy value pleasure? Or perhaps the disutility of unclippiness simply overwhelms any possible utility of pleasure...
According to a bit of googling, among the monozygotic Dionne quintuplets, two out of the five were colourblind; suggesting that they did not have the same qualia for certain colours as each other. (Apparently it may be linked to X-chromosome activation).
CCC, you’re absolutely right to highlight the diversity of human experience. But this diversity doesn’t mean there aren’t qualia universals. Thus there isn’t an unusual class of people who relish being waterboarded. No one enjoys uncontrollable panic. And the seemingly anomalous existence of masochists who enjoy what you or I would find painful stimuli doesn’t undercut the sovereignty of the pleasure-pain axis but underscores its pivotal role: painful stimuli administered in certain ritualised contexts can trigger the release of endogenous opioids that are intensely rewarding. Co-administer an opioid antagonist and the masochist won’t find masochism fun.
Apologies if I wasn’t clear in my example above. I wasn’t imagining that pure paperclippiness was pleasurable, but rather what would be the effects of grafting together two hypothetical orthogonal axes of (dis)value in the same unitary subject of experience—as we might graft on another sensory module to our CNS. After all, the deliverances of our senses are normally cross-modally matched within our world-simulations. However, I’m not at all sure that I’ve got any kind of conceptual handle on what “clippiness” might be. So I don’t know if the thought-experiment works. If such hybridisation were feasible, would hypothetical access to the nature of (un)clippiness transform our conception of the world relative to unmodified humans—so we’d lose all sense of what it means to be a traditional human? Yes, for sure. But if, in the interests of science, one takes, say, a powerful narcotic euphoriant and enjoys sublime bliss simultaneously with pure clippiness, then presumably one still retains access to the engine of phenomenal value characteristic of archaic human minds.
The phenomenal binding problem? The best treatment IMO is still Revonsuo: http://cdn.preterhuman.net/texts/body_and_health/Neurology/Binding.pdf No one knows how the mind/brain solves the phenomenal binding problem and generates unitary experiential objects and the fleeting synchronic unity of the self. But the answer one gives may shape everything from whether one thinks a classical digital computer will ever be nontrivially conscious to the prospects of mind uploading and the nature of full-spectrum superintelligence. (cf. http://www.biointelligence-explosion.com/parable.html for my own idiosyncratic views on such topics.)
It doesn’t mean that there aren’t, but it also doesn’t mean that there are. It does mean that there are qualia that aren’t universal, which implies the possibility that there may be no universals; but, you are correct, it does not prove that possibility.
There may well be qualia universals. If I had to guess, I’d say that I don’t think there are, but I could be wrong.
That doesn’t mean that everyone’s uncontrolled-panic qualia are all the same, it just means that everyone’s uncontrolled-panic qualia are all unwelcome. If given a sadistic choice between waterboarding and uncontrolled panic, in full knowledge of what the result will feel like, and all else being equal, some people may choose the panic while others may prefer the waterboarding.
If you feel that you have to explain that, then I conclude that I wasn’t clear in my response to your example. I was questioning the scaling of the axes in Clippy’s utility function; if Clippy values paperclipping a million times more strongly than it values pleasure, then the pleasure/pain axis is unlikely to affect Clippy’s behaviour much, if at all.
I think it works as a thought-experiment, as long as one keeps in mind that the hybridised result is no longer a pure paperclipper.
Consider the hypothetical situation that Hybrid-Clippy finds that it derives pleasure from painting; an activity neutral on the paperclippiness scale. Consider further the possibility that making paperclips is neutral on the pleasure-pain scale. In such a case, Hybrid-Clippy may choose to either paint or make paperclips; depending on which scale it values more.
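To make the scaling point concrete, here is a minimal sketch of how Hybrid-Clippy's choice might fall out of the relative weights. It assumes, purely for illustration, that the two axes can be combined as a weighted sum (which is exactly the commensurability that is in question in this thread); the class, function names, weights and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    clippiness: float  # score on the (un)clippiness axis (hypothetical units)
    pleasure: float    # score on the pleasure-pain axis (hypothetical units)


def hybrid_utility(activity: Activity,
                   clip_weight: float,
                   pleasure_weight: float) -> float:
    """Weighted sum of the two axes. The weighted sum itself is an assumption."""
    return clip_weight * activity.clippiness + pleasure_weight * activity.pleasure


def choose(activities, clip_weight: float, pleasure_weight: float) -> Activity:
    """Return the activity with the highest combined utility."""
    return max(activities,
               key=lambda a: hybrid_utility(a, clip_weight, pleasure_weight))


# Painting is pleasant but neutral on clippiness; paperclip-making is the reverse.
painting = Activity("paint", clippiness=0.0, pleasure=1.0)
clipmaking = Activity("make paperclips", clippiness=1.0, pleasure=0.0)
options = [painting, clipmaking]

# Clippiness valued a million times more strongly: pleasure barely matters.
mostly_clippy = choose(options, clip_weight=1_000_000, pleasure_weight=1)

# Pleasure weighted more heavily: the hybrid paints instead.
mostly_hedonic = choose(options, clip_weight=1, pleasure_weight=2)
```

Under these assumed weights the first agent behaves almost exactly like a pure paperclipper, while the second does not; the behaviour is fixed by the weights, not by the mere presence of a pleasure-pain axis.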
So—the question is basically how the mind attaches input from different senses to a single conceptual object?
I can’t tell you how the mechanism works, but I can tell you that the mechanism can be spoofed. That’s what a ventriloquist does, after all. And a human can watch a film on TV, yet have the sound come out of a set of speakers on the other end of the room, and still bind the sound of an actor’s voice with that same actor on the screen.
Studying in what ways the binding mechanism can be spoofed would, I expect, produce an algorithm that roughly describes how the mechanism works. Of course, if it’s still a massive big problem after being looked at so thoroughly, then I expect that I’m probably missing some subtlety here...
All pain hurts, or it wouldn’t be pain.
Well...
The force is that all this talk about understanding ‘the pain/pleasure’ axis would be a complete waste of time for a paperclip maximiser. In most situations it would be more efficient not to bother with it at all and to spend its optimisation efforts on making more efficient relativistic rockets so as to claim more of the future light cone for paperclip manufacture.
It would require motivation for the paperclip maximiser to expend computational resources understanding the arbitrary quirks of DNA based creatures. For example some contrived game of Omega’s which rewards arbitrary things with paperclips. Or if it found itself emerging on a human inhabited world, making being able to understand humans a short term instrumental goal for the purpose of more efficiently exterminating the threat.
Terrible analogy. Not understanding “pain and pleasure” is in no way similar to believing it can create a perpetual motion machine. Better analogy: An Engineer designing microchips allegedly ‘could’ grasp analytic cubism. If she had some motivation to do so. It would be a distraction from her primary interests but if someone paid her then maybe she would bother.
Now “if” is doing a lot of work. If the paperclipper was fundamentally different from a paperclipper and was actually similar to a human or DNA-based relative capable of experiencing ‘agony’, and assuming agony was just as debilitating to the paperclipper as to a typical human… then sure, all sorts of weird stuff follows.
I prefer the word True in this context.
To the extent that you believed that such polymorphic malware is theoretically possible and consisted of most possible minds it would be possible for your model to be used to accurately describe all possible agents—it would just mean systematically using different words. Unfortunately I don’t think you are quite at that level.
Wedrifid, granted, a paperclip-maximiser might be unmotivated to understand the pleasure-pain axis and the qualia-spaces of organic sentients. Likewise, we can understand how a junkie may not be motivated to understand anything unrelated to securing his supply of heroin—and a wireheader in anything beyond wireheading. But superintelligent? Insofar as the paperclipper—or the junkie—is ignorant of the properties of alien qualia-spaces, then it/he is ignorant of a fundamental feature of the natural world—hence not superintelligent in any sense I can recognise, and arguably not even stupid. For sure, if we’re hypothesising the existence of a clippiness/unclippiness qualia-space unrelated to the pleasure-pain axis, then organic sentients are partially ignorant too. Yet the remedy for our hypothetical ignorance is presumably to add a module supporting clippiness—just as we might add a CNS module supporting echolocatory experience to understand bat-like sentience—enriching our knowledge rather than shedding it.
What does (super-)intelligence have to do with knowing things that are irrelevant to one’s values?
What does knowing everything about airline safety statistics, and nothing else, have to do with intelligence? That sort of thing is called Savant ability—short for “idiot savant”.
I guess there’s a link missing (possibly due to a missing http:// in the Markdown) after the second word.
Why does that matter for the argument?
As long as Clippy is in fact optimizing paperclips, what does it matter what/if he feels while he does it?
Pearce seems to be making a claim that Clippy can’t predict creatures with pain/pleasure if he doesn’t feel them himself.
Maybe Clippy needs pleasure/pain to be able to predict creatures with pleasure/pain. I doubt it, but fine, grant the point. He can still be a paper clip maximizer regardless.
I fail to comprehend the cause for your confusion. I suggest reading the context again.