We can model “wanting” as a motivational thing: that is, if there were an agent that knew itself perfectly (unlike humans), it could predict in advance what it would do, and this prediction would be what it wanted to do. If we model humans as similar to this self-knowing agent, then “wanting” is basically “what we would do in a hypothetical situation.” For example, I want to eat a candy bar, so if I had a candy bar I would eat it. And some people want to wirehead, which is the same as saying that if they could they would.
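To make that concrete, here is a toy sketch (the policy is made up purely for illustration; nothing here is a claim about how real brains implement this):

```python
# Toy model: "wanting X" = the self-model predicts doing X in the relevant
# hypothetical situation.

def would_do(policy, hypothetical_situation):
    """What the agent predicts it would do if the situation arose."""
    return policy(hypothetical_situation)

def wants(policy, situation, action):
    """On this model, wanting an action just means predicting you'd take it."""
    return would_do(policy, situation) == action

def my_policy(situation):
    # Made-up stand-in for a perfect self-model.
    return "eat it" if situation == "holding a candy bar" else "do nothing"

print(wants(my_policy, "holding a candy bar", "eat it"))         # True: I want to eat the candy bar
print(wants(my_policy, "offered a wirehead switch", "flip it"))  # False for this particular policy
```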
Although “wanting” and “liking” aren’t the same, they are correlated, so you could make an argument for wireheading that went something like this: “wireheading would make my pleasure centers light up, therefore it’s much more likely to be what I want than something that doesn’t make my pleasure centers light up, like many parts of ordinary life.” But the trouble with this argument is that it doesn’t take into account other sorts of evidence, with the most notable being the output of our self-modeling processes. If I could, I wouldn’t.
How familiar are you with expected utility maximizers? Do you know about the difference between motivation and reward (or “wanting” and “liking”) in the brain?
I think I’m familiar with that and understand the difference. I don’t see its relevance. Assuming “wanting” is basically the dopamine version of “liking” seems more plausible and strictly simpler than assuming there’s a really complex hypothetical calculation based on states of the world being performed.
Also, I suspect you are understanding wireheading too narrowly here. It’s not just the pleasure center (or even just some part of it, like in “inducing permanent orgasms”), but it would take care of all desirable sensations, including the sensation of having one’s wants fulfilled. The intuition “I get wireheaded and still feel like I want something else” is false, which is why I used “rewards” instead of “pleasure”. (And it doesn’t require rewiring one’s preferences.)
> But the trouble with this argument is that it doesn’t take into account other sorts of evidence, with the most notable being the output of our self-modeling processes. If I could, I wouldn’t.
Confabulation and really bad introspective access seem much more plausible to me. If you modify details in thought experiments that shouldn’t affect wireheading results (like reversing Nozick’s experience machine), people do actually change their answers, even though they previously claimed to have based their decisions on criteria that clearly can’t have mattered.
I’d much rather side with revealed preferences, which show that plenty of people are interested in crude wireheading (heroin, WoW and FarmVille come to mind) and the better those options get, the more people choose them.
> Assuming “wanting” is basically the dopamine version of “liking” seems more plausible and strictly simpler
Why assume? It’s there in the brain. It’s okay to model reality with simpler stuff sometimes, but to look at reality and say “not simple enough” is bad. The model that says “it would be rewarding, therefore I must want it” is too simple.
> than assuming there’s a really complex hypothetical calculation based on states of the world being performed.
Except the brain is a computer that processes data from sensory organs and outputs commands; it’s not like we’re assuming this from nothing, it’s an experimental result. I’m including all sorts of things in “the world” here (maybe more than you intended), but that’s as it should be. And ever since mastering the art of peek-a-boo, I’ve had this concept of a real world, and I (i.e. me, my brain) use it in computation all the time.
> Also, I suspect you are understanding wireheading too narrowly here. It’s not just the pleasure center [...] The intuition “I get wireheaded and still feel like I want something else” is false, which is why I used “rewards” instead of “pleasure”.
This is part of why I referenced expected utility maximizers. Expected utility maximizers don’t choose what just makes them feel like they’ve done something. They evaluate the possibilities with their current utility function. The goal (for an agent who does this) truly isn’t to make the utility meter read a big number, but to do things that would make their current utility function read a big number. An expected utility maximizer leading a worthwhile life will always turn down the offer to be overwritten with orgasmium (as long as one of their goals isn’t something internal like “get overwritten with orgasmium”).
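Here is a minimal sketch of what I mean, with made-up numbers and a hypothetical utility function; the point is only that the felt reward never enters the evaluation:

```python
# An expected utility maximizer scores options with its *current* utility
# function over predicted world-states. The reward meter it would have
# afterwards is not part of the calculation. Numbers are toy values.

def current_utility(world_state):
    # Hypothetical terms that point at external reality.
    return 10 * world_state["friends_are_well"] + 5 * world_state["science_gets_done"]

options = {
    "ordinary worthwhile life":
        {"friends_are_well": 1, "science_gets_done": 1, "felt_reward": 6},
    "be overwritten with orgasmium":
        {"friends_are_well": 0, "science_gets_done": 0, "felt_reward": 1000},
}

best = max(options, key=lambda name: current_utility(options[name]))
print(best)  # "ordinary worthwhile life"; the huge felt_reward is simply ignored
```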
> I’d much rather side with revealed preferences, which show that plenty of people are interested in crude wireheading
And plenty of people aren’t, or will play tetris but won’t do heroin. And of course there are people who will lay down their lives for another—to call wireheading a revealed preference of humans is flat wrong.
> This is part of why I referenced expected utility maximizers. Expected utility maximizers don’t choose what just makes them feel like they’ve done something.
Correct, and I don’t disagree with this. An actual expected utility maximizer (or an approximation of one) would have no interest in wireheading. Why do you think humans are best understood as such utility maximizers? If we were, shouldn’t everyone have an aversion, or rather, indifference to wireheading? After all, if you offered an expected paperclip maximizer the option of wireheading, it would simply reject it as if you had offered to build a bunch of staples. It would have no strong reaction either way. That isn’t what’s happening with humans.
I’m trying to think of a realistic complex utility function that would predict such behavior, but can’t think of anything.
> And plenty of people aren’t, or will play tetris but won’t do heroin. And of course there are people who will lay down their lives for another—to call wireheading a revealed preference of humans is flat wrong.
True, there isn’t anything like a universally compelling wirehead option available. Each option is, so far, preferred only by minorities, although in total, they are still fairly widespread and their market share is rising. I did express this too sloppily.
> Why do you think humans are best understood as such utility maximizers? If we were, shouldn’t everyone have an aversion, or rather, indifference to wireheading? After all, if you offered an expected paperclip maximizer the option of wireheading, it would simply reject it as if you had offered to build a bunch of staples. It would have no strong reaction either way. That isn’t what’s happening with humans.
> I’m trying to think of a realistic complex utility function that would predict such behavior, but can’t think of anything.
Yeah, true. For humans, pleasure is at least a consideration. I guess I see it as part of our brain structure used in learning, a part that has acquired its own purpose because we’re adaptation-executers, not fitness maximizers. But then, so is liking science, so it’s not like I’m dismissing it. If I had a utility function, pleasure would definitely be in there.
So how do you like something without having it be all-consuming? First, care about other things too—I have terms in my hypothetical utility function that refer to external reality. Second, have there be a maximum possible effect—either because there is a maximum amount of reward we can feel, or because what registers in the brain as “reward” quickly decreases in value as you get more of it. Third, have the other stuff you care about outweigh just pursuing the one term to its maximum.
I actually wrote a comment about this recently, which is an interesting coincidence :D I’ve become more and more convinced that a bounded utility function is most human-like. The question is then whether the maximum possible utility from internal reward outweighs everyday values of everything else or not.
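With toy numbers, the comparison I have in mind looks something like this (tanh is only a stand-in for whatever saturating curve reward actually follows; none of the magnitudes are meant seriously):

```python
import math

# Bounded internal-reward term plus a term about external reality.

def utility(internal_reward, external_value):
    bounded_reward = math.tanh(internal_reward)  # never exceeds 1, however hard you push it
    return bounded_reward + external_value

print(utility(internal_reward=10**6, external_value=0))  # ~1.0: wireheading, world abandoned
print(utility(internal_reward=2, external_value=3))      # ~3.96: ordinary life, modest reward
```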
I agree with you on the bounded utility function.

I still need to think more about whether expected utility maximizers are a good human model. My main problem is that I can’t see realistic implementations in the brain (and pathways for evolution to get them there). I’ll focus my study more on that; I think I dismissed them too easily.