I don’t like the idea of becoming the kind of entity that consistently decides not to do the fun frivolous thing.
If a general decides that it’s more strategically important to defend a particular bridge than a particular city, he could self-modify so that he no longer cares about the city, so that those desires don’t get in the way of defending the bridge. He can only do that so many times before he no longer cares about the nation, just ‘defending’ it.
I don’t like the idea of becoming the kind of entity that consistently decides not to do the fun frivolous thing.
Then don’t.
If a general decides that it’s more strategically important to defend a particular bridge than a particular city, he could self-modify so that he no longer cares about the city, so that those desires don’t get in the way of defending the bridge. He can only do that so many times before he no longer cares about the nation, just ‘defending’ it.
In general, this is subgoal stomp. The hypothetical general’s error is in discarding what he starts out wanting (supporting his nation, I assume) in favor of wanting what he initially thinks will help achieve it (military defense of specific positions).
(I’m pretty sure humans do this already, even without general self-modification.)
If you don’t want to lose what you care about, don’t change what you care about.
Well, I’m also concerned about other people here doing that.
If you don’t want to lose what you care about, don’t change what you care about.
Isn’t that what anti-akrasia does, though? If I like coffee but dislike some effect of coffee, and I self-modify into someone who at least doesn’t drink coffee, and maybe doesn’t like it anymore, then I think I’ve lost something for something else, but not things in an easy-to-parse subgoal-supergoal relationship.
A general choosing between two cities might have been a better example.
In other words, you should switch the train to the track with one person instead of five, but you shouldn’t self-modify so that it is an easy thing to do.
Isn’t that what anti-akrasia does, though? If I like coffee but dislike some effect of coffee, and I self-modify into someone who at least doesn’t drink coffee, and maybe doesn’t like it anymore, then I think I’ve lost something for something else, but not things in an easy-to-parse subgoal-supergoal relationship.
The wonderful thing about the brain is that if what you get out of something is actually important to you, you probably won’t succeed in getting rid of it for long, or will find some other way to get whatever you got out of it before.
(That’s also the really terrible thing about the brain, since that same principle is also where akrasia and “meta” akrasia come from!)
If I like coffee but dislike some effect of coffee, and I self-modify into someone who at least doesn’t drink coffee, and maybe doesn’t like it anymore, then I think I’ve lost something for something else
There’s a big qualitative difference between not doing something and not liking it. This is based on personal experience, so YMMV, but if you can self-modify to the point of not liking something that you think you should avoid (coffee, sweets, etc.), you actually experience positive utility from disliking it. Merely avoiding it may cause some difficulty, but actually self-modifying to dislike a perceived bad thing feels good.
The counterfactual is hard to compare against, but if it’s something you were feeling guilty about, you’re probably better off.
But how far can you go down that road before you’re not human any more (in a bad way)? Coffee isn’t a bad thing; it’s a good thing with some side effects people don’t like. Self-modifying to no longer like a good thing may be good, but it seems unstable.
What if we happened to live in a world where everything that tasted good was bad for us? Would we be better off eradicating taste? What if the things we don’t yet know about the universe have similar features?
Yes, good point. This reminds me of a question once brought up at the meetup:
“If you could modify yourself so that you really liked working hard to improve the world, over your current life enjoyments, would you?”
The idea being—would you modify yourself so you valued, say, working for charities, rather than playing computer games and listening to music (or whatever it is you do now instead of working at the local soup kitchen every weekend)?
If you did it, you know that you would actually be made happy by doing the kinds of things you currently think you “should do more of but don’t get around to very often”.
My gut reaction is to flinch away from doing this… I’d be interested in exploring why that is, but I really have no idea where to start.
Well, a somewhat obvious answer is that you might fear that valuing doing the things you think you should do more of will leave you worse off by your current standards.
Hmm, not sure if that’s it. According to my current set of “wouldn’t it be awesome if” standards, I’d actually be much better off. I do get a feeling that I’d be less “real”, though: programmed to be a task-loving machine instead of a person with real, human desires.
However, I certainly see the potential for “ick” scenarios where I would consider myself to be worse off (in a “poor sad person” kind of way), e.g. programming myself to love housework, and ending up loving it so much that I turn myself into a slave/maid for other people who take advantage of it… a poor example perhaps, but hopefully you get the drift.