Eliezer Yudkowsky comments on By Which It May Be Judged

Eliezer Yudkowsky 10 Dec 2012 7:42 UTC
9 points
Assume the subject of reprogramming is an existing human being, otherwise minimally altered by this reprogramming, i.e., we don’t do anything that isn’t necessary to switch their motivation to paperclips. So unless you do something gratuitiously non-minimal like moving the whole decision-action system out of the range of introspective modeling, or cutting way down on the detail level of introspective modeling, or changing the empathic architecture for modeling hypothetical selves, the new person will experience themselves as having ineffable ‘qualia’ associated with the motivation to produce paperclips.

The only way to make it seem to them like their motivational quales hadn’t changed over time would be to mess with the encoding of their previous memories of motivation, presumably in a structure-destroying way since the stored data and their introspectively exposed surfaces will not be naturally isomorphic. If you carry out the change to paperclip-motivation in the obvious way, cognitive comparisions of the retrieved memories to current thoughts will return ‘unequal ineffable quales’, and if the memories are visualized in different modalities from current thoughts, ‘incomparable ineffable quales’.

Doing-what-leads-to-paperclips will also be a much simpler ‘quale’, both from the outside perspective looking at the complexity of cognitive data, and in terms of the internal experience of complexity—unless you pack an awful lot of detail into the question of what constitutes a more preferred paperclip. Otherwise, compared to the old days when you thought about justice and fairness, introspection will show that less questioning and uncertainty is involved, and that there are fewer points of variation among the motivational thought-quales being considered.

I suppose you could put in some extra work to make the previous motivations map in cognitively comparable ways along as many joints as possible, and try to edit previous memories without destroying their structure so that they can be visualized in a least common modality with current experiences. But even if you did, memories of the previous quales for rightness-motivation would appear as different in retrospect when compared to current quales for paperclip-motivation as a memory of a 3D greyscale forest landscape vs. a current experience of a 2D red-and-green fractal, even if they’re both articulated in the visual sensory modality and your modal workspace allows you to search for, focus on, and compare commonly ‘experienced’ shapes between them.
- Oligopsony 10 Dec 2012 16:45 UTC
  35 points
  Parent
  I think you and Alicorn may be talking past each other somewhat.
  
  Throughout my life, it seems that what I morally value has varied more than what rightness feels like—just as it seems that what I consider status-raising has changed more than what rising in status feels like, and what I find physically pleasurable has changed more than what physical pleasures feel like. It’s possible that the things my whole person is optimizing for have not changed at all, that my subjective feelings are a direct reflection of this, and that my evaluation of a change of content is merely a change in my causal model of the production of the desiderata (I thought voting for Smith would lower unemployment, but now I think voting for Jones would, etc.) But it seems more plausible to me that
  
  1) the whole me is optimizing for various things, and these things change over time,
  2) and that the conscious me is getting information inputs which it can group together by family resemblance, and which can reinforce or disincentivize its behavior.
  
  Imagine a ship which is governed by an anarchic assembly beneath board and captained by an employee of theirs whom they motivate through in-kind bonuses. So the assembly at one moment might be looking for buried treasure, which they think is in such-and-such a place, and so they send her baskets of fresh apples when she’s steering in that direction and baskets of stinky rotten apples when she’s steering in the wrong. For other goals (refueling, not crashing into reefs) they send her excellent or tedious movies and gorgeous or ugly cabana boys. The captain doesn’t even have direct access to what the apples or whatever are motivating her to do; although she can piece it together. She might even start thinking of apples as irreducibly connected to treasure. But if the assembly decided that they wanted to look for ports of call instead of treasure, I don’t see why in principle they couldn’t start sending her apples in order to do so. And if they did, I think her first response would be, if she was verbally asked, that the treasure—or whatever the dubloons constituting the treasure ultimately represent in terms of the desiderata of the assembly—had moved to the ports of call. This might be a correct inference—perhaps the assembly wants the treasure for money and now they think that comes better from heading to ports of call—but it hardly seems to be a necessarily correct one.
  
  If I met two vampires, and one said his desire to drink blood was mediated through hunger (and that he no longer felt hunger for food, or lust) and another said her desire to drink blood was mediated through lust (and that she no longer felt lust for sex, or hunger) then I do think—presuming they were both once human, experiencing lust and hunger like me—they’ve told me something that allows me to distinguish their experiences from one another, even though they both desire blood and not food or sex.
  
  They may or may not be able to explain to what it is like to be a bat.
  
  Unless I’m inserting a further layer of misunderstanding your position seems to be curiously disjunctivist. I or you or Alicorn or all of us may be making bad inferences in taking “feels like” to mean “reminds one of the sort of experience that brings to mind...” (“I feel like I got mauled by a bear,” says someone not just and maybe never mauled by a bear) or “constituting an experience of” (“what an algorithm feels like from the inside”) when the other is intended. This seems to be a pretty easy elision to make—consider all the philosophers who say things like “well, it feels like we have libertarian free will...”
- Alicorn 10 Dec 2012 7:47 UTC
  19 points
  Parent
  This comment expands how you’d go about reprogramming someone in this way with another layer of granularity, which is certainly interesting on its own merits, but it doesn’t strongly support your assertion about what it would feel like to be that someone. What makes you think this is how qualia work? Have you been performing sinister experiments in your basement? Do you have magic counterfactual-luminosity-powers?
  - Rob Bensinger 10 Dec 2012 19:17 UTC
    26 points
    Parent
    I think Eliezer is simply suggesting that qualia don’t in fact exist in a vacuum. Green feels the way it does partly because it’s the color of chlorophyll. In a universe where plants had picked a different color for chlorophyll (melanophyll, say), with everything else (per impossibile) held constant, we would associate an at least slightly different quale with green and with black, because part of how colors feel is that they subtly remind us of the things that are most often colored that way. Similarly, part of how ‘goodness’ feels is that it imperceptibly reminds us of the extension of good; if that extension were dramatically different, then the feeling would (barring any radical redesigns of how associative thought works) be different too. In a universe where the smallest birds were ten feet tall, thinking about ‘birdiness’ would involve a different quale for the same reason.
  - khafra 10 Dec 2012 15:53 UTC
    7 points
    Parent
    It sounds to me like you don’t think the answer had anything to do with the question. But to think that, you’d pretty much have to discard both the functionalist and physicalist theories of mind, and go full dualist/neutral monist; wouldn’t you?
    - Eliezer Yudkowsky 10 Dec 2012 19:05 UTC
      2 points
      Parent
      I think I’ll go with this as my reply—“Well, imagine that you lived in a monist universe—things would pretty much have to work that way, wouldn’t they?”
  - Nick_Tarleton 10 Dec 2012 18:40 UTC
    1 point
    Parent
    Possibly (this is total speculation) Eliezer is talking about the feeling of one’s entire motivational system (or some large part of it), while you’re talking about the feeling of some much narrower system that you identify as computing morality; so his conception of a Clippified human wouldn’t share your terminal-ish drives to eat tasty food, be near friends, etc., and the qualia that correspond to wanting those things.
    - Eliezer Yudkowsky 10 Dec 2012 19:23 UTC
      9 points
      Parent
      The Clippified human categorizes foods into a similar metric of similarity—still believes that fish tastes more like steak than like chocolate—but of course is not motivated to eat except insofar as staying alive helps to make more paperclips. They have taste, but not tastiness. Actually that might make a surprisingly good metaphor for a lot of the difficulty that some people have with comprehending how Clippy can understand your pain and not care—maybe I’ll try it on the other end of that Facebook conversation.
      - DaFranker 10 Dec 2012 19:44 UTC
        8 points
        Parent
        The metaphor seems like it could lose most of its effectiveness on people who have never applied the outside view to how taste and tastiness feel from inside—they’ve never realized that chocolate tastes good because their brain fires “good taste” when it perceives the experience “chocolate taste”. The obvious resulting cognitive dissonance (from “tastes bad for others”) predictions match my observations, so I suspect this would be common among non-rationalists. If the Facebook conversation you mention is with people who haven’t crossed that inferential gap yet, it might prove not that useful.
- Vaniver 10 Dec 2012 20:51 UTC
  16 points
  Parent
  Consider Bob. Bob, like most unreflective people, settles many moral questions by “am I disgusted by it?” Bob is disgusted by, among other things, feces, rotten fruit, corpses, maggots, and men kissing men. Internally, it feels to Bob like the disgust he feels at one of those stimuli is the same as the disgust he feels at the other stimuli, and brain scans show that they all activate the insula in basically the same way.
  
  Bob goes through aversion therapy (or some other method) and eventually his insula no longer activates when he sees men kissing men.
  
  When Bob remembers his previous reaction to that stimuli, I imagine he would remember being disgusted, but not be disgusted when he remembers the stimuli. His positions on, say, same-sex marriage or the acceptability of gay relationships have changed, and he is aware that they have changed.
  
  Do you think this example agrees with your account? If/where it disagrees, why do you prefer your account?
  - Rob Bensinger 10 Dec 2012 21:06 UTC
    11 points
    Parent
    I think this is really a sorites problem. If you change what’s delicious only slightly, then deliciousness itself seems to be unaltered. But if you change it radically — say, if circuits similar to your old gustatory ones now trigger when and only when you see a bright light — then it seems plausible that the experience itself will be at least somewhat changed, because ‘how things feel’ is affected by our whole web of perceptual and conceptual associations. There isn’t necessarily any sharp line where a change in deliciousness itself suddenly becomes perceptible; but it’s nevertheless the case that the overall extension of ‘delicious’ (like ‘disgusting’ and ‘moral’) has some effect on how we experience deliciousness. E.g., deliciousness feels more foodish than lightish.
    - Vaniver 10 Dec 2012 21:21 UTC
      9 points
      Parent
      
      it seems plausible that the experience itself will be at least somewhat changed, because ‘how things feel’ is affected by our whole web of perceptual and conceptual associations.
      
      When I look at the problem introspectively, I can see that as a sensible guess. It doesn’t seem like a sensible guess when I look at it from a neurological perspective. If the activation of the insula is disgust, then the claim that outputs of the insula will have a different introspective flavor when you rewire the inputs of the insula seems doubtful. Sure, it could be the case, but why?
      
      When we hypnotize people to make them disgusted by benign things, I haven’t seen any mention that the disgust has a different introspective flavor, and people seem to reason about that disgust in the exact same way that they reason about the disgust they had before.
      
      This seems like the claim that rewiring yourself leads to something like synesthesia, and that just seems like an odd and unsupported claim to me.
      - Rob Bensinger 10 Dec 2012 21:56 UTC
        4 points
        Parent
        
        If the activation of the insula is disgust
        
        Certain patterns of behavior at the insula correlate with disgust. But we don’t know whether they’re sufficient for disgust, nor do we know which modifications within or outside of the insula change the conscious character of disgust. There are lots of problems with identity claims at this stage, so I’ll just raise one: For all we know, activation patterns in a given brain region correlate with disgust because disgust is experienced when that brain region inhibits another part of the brain; an experience could consist, in context, in the absence of a certain kind of brain activity.
        
        When we hypnotize people to make them disgusted by benign things, I haven’t seen any mention that the disgust has a different introspective flavor
        
        Hypnosis data is especially difficult to evaluate, because it isn’t clear (a) how reliable people’s self-reports about introspection are while under hypnosis; nor (b) how reliable people’s memories-of-hypnosis are afterward. Some ‘dissociative’ people even give contradictory phenomenological reports while under hypnosis.
        
        That said, if you know of any studies suggesting that the disgust doesn’t have at all a different character, I’d be very interested to see them!
        
        If you think my claim isn’t modest and fairly obvious, then it might be that you aren’t understanding my claim. Redness feels at least a little bit bloodish. Greenness feels at least a little bit foresty. If we made a clone who sees evergreen forests as everred and blood as green, then their experience of greenness and redness would be partly the same, but it wouldn’t be completely the same, because that overtone of bloodiness would remain in the background of a variety of green experiences, and that woodsy overtone would remain in the background of a variety of red experiences.
        Vaniver 10 Dec 2012 23:09 UTC
        1 point
        Parent
        
        If you think my claim isn’t modest and fairly obvious, then it might be that you aren’t understanding my claim.
        
        I’m differentiating between “red evokes blood” and “red feels bloody,” because those seem like different things to me. The former deals with memory and association, and the second deals with introspection, and so I agree that the same introspective sensation could evoke very different memories.
        
        The dynamics of introspective sensations could plausibly vary between people, and so I’m reluctant to discuss it extensively except in the context of object-level comparisons.
        Rob Bensinger 11 Dec 2012 0:55 UTC
        1 point
        Parent
        I’m not sure exactly what you mean by “red evokes blood.” I agree that “red feels bloody” is intuitively distinct from “I tend to think explicitly about blood when I start thinking about redness,” though the two are causally related. Certain shades of green to me feel fresh, clean, ‘naturey;’ certain shades of red to me feel violent, hot, glaring; certain shades of blue feel cool; etc. My suggestion is that these qualia, which are part of the feeling of the colors themselves for most humans, would be experientially different even when decontextualized if we’d gone through life perceiving forests as blue, oceans as red, campfires as green, etc. By analogy, the feeling of ‘virtue’ may be partly independent of which things we think of under the concept ‘virtuous;’ but it isn’t completely independent of those things.
        Vaniver 11 Dec 2012 1:33 UTC
        2 points
        Parent
        
        Certain shades of green to me feel fresh, clean, ‘naturey;’ certain shades of red to me feel violent, hot, glaring; certain shades of blue feel cool; etc.
        
        I am aware that many humans have this sort of classification of colors, and have learned it because of its value in communication, but as far as I can tell this isn’t a significant part of my mental experience. A dark green might make it easier for me to think of leaves or forests, but I don’t have any experiences that I would describe as feeling ‘naturey’. If oceans and forests swapped colors, I imagine that seeing the same dark green would make it easier for me to think of waves and water, but I think my introspective experience would be the same.
        
        If I can simplify your claim a bit, it sounds like if both oceans and forests were dark green, then seeing dark green would make you think of leaves and waves / feel associated feelings, and that this ensemble would be different from your current sensation of ocean blue or forest green. It seems sensible to me that the ensembles are different because they have different elements.
        
        I’m happier with modeling that as perceptual bleedover- because forests and green are heavily linked, even forests that aren’t green are linked to green, and greens that aren’t on leaves are linked with forests- than I am modeling that as an atom of consciousness- the sensation of foresty greens- but if your purposes are different, a different model may be more suitable.
        Rob Bensinger 13 Dec 2012 4:55 UTC
        1 point
        Parent
        Part of the problem may be that I’m not so sure I have a distinct, empirically robust idea of an ‘atom of consciousness.’ I took for granted your distinction between ‘evoking blood’ and ‘feeling bloody,’ but in practice these two ideas blend together a great deal. Some ideas—phonological and musical ones, for example—are instantiated in memory by certain temporal sequences and patterns of association. From my armchair, I’m not sure how much my idea of green (or goodness, or clippiness) is what it is in virtue of its temporal and associative dispositions, too. And I don’t know if Eliezer is any less confused than I.
      - NancyLebovitz 12 Dec 2012 8:10 UTC
        0 points
        Parent
        It wouldn’t surprise me if the sensation of disgust has some variation from one person to another, and even for the same person, from one object to another.
  - adamisom 11 Dec 2012 18:32 UTC
    4 points
    Parent
    I just wanted to tell everyone that it is great fun to read this in the voice of that voice actor for the Enzyte commercial :)
  - FeepingCreature 15 Dec 2012 1:41 UTC
    1 point
    Parent
    I think this is easier because disgust is relatively arbitrary to begin with, in that it seems to implement a function over the world-you relation (roughly, things that are bad for you to eat/be near). We wouldn’t expect that relation to have much coherence to begin with, so there’d be not much loss of coherence from modifying it—though, arguably, the same thing could be said for most qualia—elegance is kind of the odd one out.
- Armok_GoB 10 Dec 2012 18:40 UTC
  5 points
  Parent
  I wouldn’t be all that suprised if the easiest way to get a human maximizing papperclips was to make it believe paperclips had epiphenomenal consciousnesses experiencing astronomical amounts of pleasure.
  
  edit: or you could just give them a false memory of god telling them to do it.
  - FeepingCreature 15 Dec 2012 1:48 UTC
    3 points
    Parent
    
    I wouldn’t be all that suprised if the easiest way to get a human maximizing papperclips was to make it believe paperclips had epiphenomenal consciousnesses
    
    The Enrichment Center would like to remind you that the Paperclip cannot speak. In the event that the Paperclip does speak, the Enrichment Center urges you to disregard its advice.
- MugaSofer 10 Dec 2012 16:59 UTC
  2 points
  Parent
  Wouldn’t it be easier to have the programee remember themself as misunderstanding morality—like a reformed racist who previously preferred options that harmed minorities. I know when I gain more insight into my ethics I remember making decisions that, in retrospect, are incomprehensible (unless I deliberately keep in mind how I thought I should act.)
  - Eugine_Nier 11 Dec 2012 2:14 UTC
    0 points
    Parent
    
    Wouldn’t it be easier to have the programee remember themself as misunderstanding morality
    
    That depends on the details of how the human brain stores goals and memories.
    - MugaSofer 11 Dec 2012 9:09 UTC
      2 points
      Parent
      Cached thoughts regularly supersede actual moral thinking, like all forms of thinking, and I am capable of remembering this experience. Am I misunderstanding your comment?
      - Eugine_Nier 13 Dec 2012 4:42 UTC
        0 points
        Parent
        My point is that in order to “fully reprogram” someone it is also necessary to clear their “moral cache” at the very least.
        MugaSofer 13 Dec 2012 9:06 UTC
        1 point
        Parent
        Well … is it? Would you notice if your morals changed when you weren’t looking?
        Eugine_Nier 14 Dec 2012 3:05 UTC
        0 points
        Parent
        I probably would, but then again I’m in the habit of comparing the out of my moral intuitions with stored earlier versions of that output.
        MugaSofer 14 Dec 2012 11:02 UTC
        0 points
        Parent
        I guess it depends on how much you rely on cached thoughts in your moral reasoning.
        
        Of course, it can be hard to tell how much you’re using ’em. Hmm...
- JoachimSchipper 13 Dec 2012 7:56 UTC
  1 point
  Parent
  I have no problem with this passage. But it does not seem obviously impossible to create a device that stimulates that-which-feels-rightness proportionally to (its estimate of) the clippiness of the universe—it’s just a very peculiar kind of wireheading.
  
  As you point out, it’d be obvious, on reflection, that one’s sense of rightness has changed; but that doesn’t necessarily make it a different qualia, any more than having your eyes opened to the suffering of (group) changes your experience of (in)justice qua (in)justice.
- Gust 3 Jan 2013 14:14 UTC
  0 points
  Parent
  Although I think your point here is plausible, I don’t think it fits in a post where you are talking about the logicalness of morality. This qualia problem is physical; whether your feeling changes when the structure of some part of your decision system changes depends on your implementation.
  
  Maybe your background understanding of neurology is enough for you to be somewhat confident stating this feeling/logical-function relation for humans. But mine is not and, although I could separate your metaethical explanations from your physical claims when reading the post, I think it would be better off without the latter.