If a single agent has conflicting desires (each of which it values equally), then it should work to alter its desires, choosing consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
Figuring out morality isn’t going to give you the power to talk Clippy down from killing you for more paperclips. You aren’t going to show that human ‘morality’, which actualises what humans prefer, is any more ‘preferable’ than ‘Clippy’ ethics. He is just going to kill you.
Well, if you could persuade him our morality is “better” by his standards—results in more paperclips—then it could work. But obviously arguing that Murder Is Wrong is about as smart as Clippy telling you that killing it would be Wrong because it results in fewer paperclips.
So, let’s now figure out exactly what we want most (as if we had our own CEV), and then go out and do it. Nobody else is gonna do it for us.
Indeed. (Although “us” here includes an FAI, obviously.)
If a single agent has conflicting desires (each of which it values equally), then it should work to alter its desires, choosing consistent desires that are most likely to be fulfilled.
Hahaha no. If it doesn’t desire these other desires, then they are less likely to be fulfilled.
I don’t understand… I said it has two equally valued desires? So it doesn’t desire one over the other. Say it desires x, y and z equally, except that x → (¬y ∨ ¬z), while y or z (or both) implies ¬x. Then, even though it desires x, it would be optimal to alter its desires so as not to desire x. After that it can always be happy fulfilling y and z, rather than remaining dissatisfied.
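To make that concrete, here’s a toy sketch (the brute-force enumeration and the ‘count the unmet desires’ scoring are purely illustrative assumptions; the two implications above collapse to x → (¬y ∧ ¬z)):

```python
from itertools import product

# Toy model: three equally valued desires x, y, z.
# Illustrative constraint (the two implications above combined):
# x can only be fulfilled in worlds where neither y nor z is.
def consistent(x, y, z):
    return (not x) or (not y and not z)

def min_unmet(desires):
    """Fewest desires left unfulfilled in the best consistent world available."""
    best = sum(desires)  # worst case: nothing it wants gets fulfilled
    for world in product([False, True], repeat=3):
        if consistent(*world):
            unmet = sum(want and not have for want, have in zip(desires, world))
            best = min(best, unmet)
    return best

print(min_unmet((True, True, True)))   # still desires x -> 1: something always goes unmet
print(min_unmet((False, True, True)))  # desire for x dropped -> 0: fully satisfied
```

Under these toy assumptions, keeping the desire for x guarantees at least one unmet desire, while dropping it leaves a desire set that can be fully satisfied.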
I was saying this in response to dspeyer saying he had two axiomatisations of morality (I took that to mean two desires, or sets of desires) which were in conflict. I was saying that there is no universal maxim against which he could measure the two—he just needs to figure out which ones will be optimal in the long term, and (attempt to) discard the rest.
Edit: Oh, I now realise I originally added the word ‘one’ to the first sentence of the earlier post you were quoting. If this was somehow the cause of confusion, my apologies.
“I value both saving orphans from fires and eating chocolate. I’m a horrible person, so I can’t choose whether to abandon my chocolate and save the orphanage.”
Should I self-modify to ignore the orphans? Hell no. If future-me doesn’t want to save orphans then he never will, even if it would cost no chocolate.
That’s a very big counterfactual hypothesis, that there exists someone who assigns equal moral weight to the statements ‘I am saving orphans from fires’ and ‘I am eating chocolate’. It would certainly show a lack of empathy—or a near self-destructive need for chocolate! In fact, the best choice for someone with those qualities in our society (if it would still be ‘human’) would be to keep the desire to save orphans, so as to retain a modicum of humanity. The only reason I suggest it would want such a modicum is to survive in the human society it finds itself in (assuming it wishes to stay alive, so as to continue fulfilling desires).
Of course, this whole counter-example assumes that the two desires are equally desired and at odds, which is quite difficult even to imagine.
But I still think the earlier idea, that there would be no universal moral standard against which it could compare its decision, remains. From my point of view it is certainly wrong, and evil, to choose the chocolate, but I am, alas, only human.
And I will do everything in my power to encourage the sorts of behaviour that make agents prefer saving orphans from fires to eating chocolate!!!
Hey, it doesn’t have to be orphans. Or it could be two different kinds of orphan—boys and girls, say. The boys’ orphanage is on fire! So is the nearby girls’ orphanage! Which one do you save?
Protip: The correct response is not “I self-modify to only care about one sex.”
EDIT: Also, aren’t you kind of fighting the counterfactual?
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, you will act to fulfill the stronger of the two. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’, then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will always mean disutility.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
Try the search bar. It’s a pretty common concept here, although I don’t recall where it originated.
I was just talking about sets of desires that clash in principle. When you have two desires that clash over one thing, you will act to fulfill the stronger of the two. But, as I’ve tried to make clear, if one desire is to ‘kill all humans’ and another is to ‘save all humans’, then the best idea is to (attempt to) self-modify to have only the desire that will produce the most utility. Having both will always mean disutility.
Well, that disutility is only lower according to my new preferences; my old ones remain sadly unfulfilled.
More specifically, if I value both freedom and safety (for everyone), should I self-modify not to hate reprogramming others? Or not to care that people will decide to kill each other sometimes?
Hmm… I don’t think my point necessarily helps here. I meant that you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
However, in the case you offered (and probably in most cases) it’s not a good idea to self-modify, as the desires don’t always clash in principle. As with the chocolate and saving-kids one, you just have to perform the utility calculation to see which way to go (in that one, saving the kids).
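For instance, a back-of-the-envelope version of that calculation, with utility numbers that are entirely made up for illustration:

```python
# Entirely made-up utilities, just to illustrate the comparison.
utilities = {
    "save the orphans": 1000.0,  # hypothetical value placed on the kids
    "eat the chocolate": 1.0,    # hypothetical value placed on the chocolate
}

# No self-modification needed: just take whichever action scores higher.
best_action = max(utilities, key=utilities.get)
print(best_action)  # -> "save the orphans"
```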
you will always get disutility when you have two desires that always clash (x and not x); whichever way you choose, the other desire won’t be fulfilled.
Yup. And if you stop caring about one of those values, then modified!you will be happier. But you don’t care about what modified!you wants, you care about x and not-x.
I’m sorry, I don’t understand what you mean when you say ‘fighting the counterfactual’.
“Fighting the counterfactual” presumably means “fighting the hypo[thetical]”.
Thanks.
...then you have a conflict. The best idea is not to cut off one of those desires, but to find out where the conflict comes from and what higher goals are giving rise to these as instrumental subgoals.
If you can’t, then:
You have failed.
Sucks to be you.
If you’re screwed enough, you’re screwed.
(For the record, I meant terminal values.)
But how do you know something is a terminal value? They don’t come conveniently labelled. Someone else just claimed that not killing people is a terminal value for all “neurotypical” people, but unless they’re going to classify every soldier, everyone exonerated at an inquest by reason of self-defence, and every doctor who has acceded to a terminal patient’s desire for an easy exit as non-“neurotypical”, “not killing people” bears about as much resemblance to a terminal value as a D&D character sheet does to an actual person.
I was oversimplifying things. Updated now, thanks.