That depends on the exact implementation. The paperclipper might be purely feedback-driven, essentially a paperclip-thermostat. In that case, it will simulate setting its internal variables to BB(1000), which will create huge positive feedback, and it will happily wirehead itself. Or it might simulate the state of the world, count the paperclips and then rate the result, in which case it won’t wirehead itself.
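To make the contrast concrete, here is a toy sketch of the two designs (purely illustrative; all the names are made up):

```python
# Toy sketch of the two paperclipper designs (purely illustrative).

def feedback_driven_value(internal_state):
    """Paperclip-thermostat: rates its own internal counter. Any action
    that writes a huge number into that counter (e.g. setting it to
    BB(1000)) maximizes this value, so the agent wireheads."""
    return internal_state["paperclip_counter"]

def world_model_value(world_model):
    """Model-based evaluator: counts paperclips in its simulation of the
    world. Editing an internal counter changes nothing here, so
    wireheading gains nothing."""
    return sum(1 for obj in world_model if obj == "paperclip")

wireheaded_state = {"paperclip_counter": 10**100}  # stands in for BB(1000)
modeled_world = ["paperclip", "dog", "paperclip"]

print(feedback_driven_value(wireheaded_state))  # astronomically high
print(world_model_value(modeled_world))         # 2
```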
The former is incredibly stupid: an agent that consistently gets its imagination confused with reality, and cannot even in principle separate the two, would be utterly incapable of abstract thought.
‘Expected Paper-clips’ is completely different from paper-clips. If an agent can’t tell the difference between them, it may as well not be able to tell houses from dogs. The fact that I can even understand the difference suggests that I am not that stupid.
I just don’t see any evidence to conclude humans are like that.
Really? You can’t see any Bayesian evidence at all!
How about the fact that I claim not to want to wire-head? My beliefs about my desires are surely correlated with my desires. How about all the other people who agree with me, including a lot of commenters on this site and most of humanity in general? Are our beliefs so astonishingly inaccurate that we are not even a tiny bit more likely to be right than wrong?
What about the many cases of people strongly wanting things that did not make them happy and acting on those desires, or vice versa?
You are privileging the hypothesis. Your view has a low prior (most of the matter in the universe is not part of my mind, so given that I might care about anything, it is not very likely that I will care about one specific lump of meat). You don’t present any evidence of your own, and yet you demand that I present mine.
The former is incredibly stupid: an agent that consistently gets its imagination confused with reality, and cannot even in principle separate the two, would be utterly incapable of abstract thought.
Welcome to evolution. Have you looked at humanity lately?
(Ok, enough snide remarks. I do agree that this is a fairly stupid design, but it would still work in many cases. The fact that it can’t handle advanced neuroscience is unfortunate, but it worked really well on the savannah.)
How about the fact that I claim not to want to wire-head? My beliefs about my desires are surely correlated with my desires.
(I strongly disagree that “most of humanity” is against wireheading. The only evidence for that consists of very flawed intuition pumps that can easily be reversed.)
However, I do take your disagreement (and that of others here) seriously. It is a major reason why I don’t just go and endorse wireheading, and why I wrote the post in the first place. Believe me, I’m listening. I’m sorry if I gave the impression that I just discard your opinion as confused.
You are privileging the hypothesis. Your view has a low prior (most of the matter in the universe is not part of my mind, so given that I might care about anything, it is not very likely that I will care about one specific lump of meat).
It would have a low prior if human minds were pulled out of mind space at random. They aren’t. We do know that they are reinforcement-based, and we have good evolutionary accounts of how complex minds built on reinforcement would arise. Reinforcement-based minds, however, are exactly like the first kind of mind I described and, it seems to me, should always wirehead if they can.
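To spell out why (a toy sketch, not a claim about any real architecture; the action names and numbers are made up):

```python
# Toy sketch: a pure reward-maximizer that can act on its own reward
# channel. Action names and numbers are made up for illustration.

def expected_reward(action):
    # The agent scores actions only by the reward signal they produce;
    # there is no separate term for the state of the world.
    rewards = {
        "make_paperclips": 10.0,   # ordinary action, ordinary signal
        "wirehead": float("inf"),  # write directly to the reward channel
    }
    return rewards[action]

best_action = max(["make_paperclips", "wirehead"], key=expected_reward)
print(best_action)  # "wirehead": nothing in the evaluation penalizes it
```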
As such, assuming no more, we should have no problem with wireheading. The fact that we do needs to be explained. Assuming there’s an additional complex utility calculation would answer the question, but that’s a fairly expensive hypothesis, which is why I asked for evidence. On the other hand, assuming (unconscious) signaling, mistaken introspection and so on relies only on mechanisms we already know exist and works equally well, but favors wireheading.
Economic models that do assume complex calculations like that, if I understand it correctly, work badly, while simpler models (PCT, i.e. Perceptual Control Theory, and behavioral economics in general) work much better.
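For reference, the core idea of PCT is a negative-feedback loop in which behavior exists to keep a perception near a reference value. A minimal sketch, with toy dynamics of my own invention:

```python
# Minimal sketch of a perceptual control loop, the core idea behind PCT
# (toy dynamics, not a faithful model of any real system).

reference = 22.0    # the perception the agent "wants" (e.g. warmth)
perception = 15.0   # the current perceived value
gain = 0.5          # how strongly the agent acts on the error

for _ in range(30):
    error = reference - perception  # behavior exists to cancel this
    perception += gain * error      # acting changes what is perceived

print(round(perception, 2))  # ~22.0: the loop controls its perception
```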
You don’t present any evidence of your own, and yet you demand that I present mine.
You are correct that I have not presented any evidence in favor of wireheading. I’m not endorsing wireheading, and even though I think there are good arguments for it, I deliberately left them out. I’m not interested in “my pet theory about values is better than your pet theory and I’m gonna convince you of that”. Looking at models of human behavior and inferred values, however, wireheading seems like a fairly obvious choice. The fact that you (and others) disagree makes me think I’m missing something.
The fact that it can’t handle advanced neuroscience is unfortunate, but it worked really well on the savannah.
What do you mean it can’t handle advanced neuroscience? Who do you think invented neuroscience!
One of the points I was trying to make was that humans can, in principle, separate out the two concepts; if they couldn’t, then we wouldn’t even be having this conversation.
Since we can separate these concepts, it seems like our final reflective equilibrium, whatever that looks like, is perfectly capable of treating them differently. I think that wire-heading is a mistake that arose from the earlier mistake of failing to preserve the use-mention distinction. Defending one mistake once we have already overcome its source is like trying to defend the content of Leviticus after admitting that God doesn’t exist.
I’m sorry if I gave the impression that I just discard your opinion as confused.
I didn’t actually think you were ignoring my opinion; I was just using a little bit of hyperbole, because people saying “I see no evidence” when there clearly is some evidence is a pet peeve of mine.
On the other hand, assuming (unconscious) signaling
This point interests me. Let’s look a little deeper into this signalling hypothesis. Am I correct that you are claiming that while my conscious mind utters sentences like “I don’t want to be a wire-head”, subconsciously I actually do want to be a wire-head?
If this is the case, then the situation we have is two separate mental agents with conflicting preferences; you appear to be siding with Subconscious!Ben rather than Conscious!Ben on the grounds that he is the ‘real Ben’.
But in what sense is he more real? Both of them exist, as shown by their causal effects on the world. I may be biased on this issue, but I would suggest you side with Conscious!Ben; he is the one with qualia, after all.
Do you, in all honesty, want to be wire-headed? For the moment I’m not asking what you think you should want, what you want to want, or what you think you would want in reflective equilibrium, just what you actually want. Does the prospect of being reduced to orgasmium, if you were offered it right now, seem more desirable than the prospect of a complicated universe filled with diverse beings pursuing interesting goals and having fun?
What do you mean it can’t handle advanced neuroscience? Who do you think invented neuroscience!
Not that I wanna beat a dead horse here, but it took us ages. We can’t even do basic arithmetic right without tons of tools. I’m always astonished to read history books and see how many really fundamental things weren’t discovered for hundreds, if not thousands, of years. So I’m fairly underwhelmed by the intellectual capacities of humans. But I see your point.
Since we can separate these concepts, it seems like our final reflective equilibrium, whatever that looks like, is perfectly capable of treating them differently.
Capable, sure. That seems like an overly general argument. The ability to distinguish things doesn’t mean the distinction appears in the supposed utility function. I can tell apart hundreds of monospace fonts (don’t ask), but I don’t expect monospace fonts to appear in my actual utility function as terminal values. I’m not sure how this helps either way.
Am I correct that you are claiming that while my conscious mind utters sentences like “I don’t want to be a wire-head”, subconsciously I actually do want to be a wire-head?
Not exactly like this. I don’t think the unconscious part of the brain is conspiring against the conscious one.
I don’t think it’s useful to clearly separate “conscious” and “unconscious” into two distinct agents. They are the same agent, only with conscious awareness shifting around, metaphorically like handing around a microphone in a crowd such that only one part can make itself heard for a while and then has to resort to affecting only its direct neighbors or screaming really loud.
I don’t think there’s a direct conflict between agents here. Rather, the (current) conscious part encounters intentions and reactions it doesn’t understand, doesn’t know the origin or history of, and then tries to make sense of them, so it often starts confabulating. This is most easily seen in split-brain patients.
I can clearly observe this by watching my own intentions and my reactions to them moment-to-moment. Intentions come out of nowhere, and then directly afterwards (if I investigate) a reason is made up for why I wanted this all along. Sometimes this reason might be correct, but it’s clearly a later interpolation. That’s why I generally tend to ignore any verbal reasons for actions.
So maybe hypocrisy is a bit of a misleading term here. I’d say that there are many agents that don’t always have privileged access (and aren’t always conscious), so they get somewhat ignored, which screws up complex decision making, which causes akrasia. Like, “I’m not getting my needs fulfilled and can’t change that myself right now, so I’m going to veto everything!” On the other hand, the conscious part is now stuck with actions that don’t make sense, so it makes up a story. It signals, “oh, I would’ve studied all day, but I somehow couldn’t get myself to stop watching cat videos, even though I hated it”. Really, it just avoided the pain of boredom when studying and needed instant gratification. But “akrasia” is a much nicer cover story.
I’m not saying this is perfectly correct or the whole picture, but I think a model like this fits my own experiences more closely than assuming actual conflicting agents does. Also, those unconscious parts, I suspect, are too simple to actually understand wireheading. They want rewards. If they were smart enough, they might see that wireheading is a good solution.
On a somewhat related note, Susan Blackmore often makes the point when talking about free will that she doesn’t have any and doesn’t even have the illusion of free will anymore, but it doesn’t interfere with her actual behavior. Example quote from Conversations On Consciousness (she talks more about this in several radio shows I can’t find right now):
Susan Greenfield: “[Searle] said that when he goes into a restaurant and orders a hamburger, he doesn’t say, ‘Well, I’m a determinist, I wonder what my genes are going to order.’”
Susan Blackmore: “I do. You’re right that Searle doesn’t do that, but when I go in a restaurant, I think, ‘Ooh, how interesting, here’s a menu, I wonder what she’ll choose’; so it is possible to do that.”
I’m totally like Blackmore here. I have no idea what I’ll choose tomorrow or even in ten minutes, only that it will be according to rewards, aversions and so on. Not considering counterfactuals in my decision making (and no longer making up verbal reasons) hasn’t crippled me in any way, as far as I can tell.
That makes me skeptical that there’s really any very complex machinery behind all this, and it makes the insistence that “but I really value this complex, external thing!” all the more puzzling.
Also, I don’t think that qualia are ever a useful concept. Let’s not drag any dualism into this by accident. Besides, what makes you think that “what you call qualia” is something your unconscious processes don’t have, right now? What makes you think you have exactly one conscious mind in your skull?
Do you, in all honesty, want to be wire-headed? For the moment I’m not asking what you think you should want, what you want to want, or what you think you would want in reflective equilibrium, just what you actually want. Does the prospect of being reduced to orgasmium, if you were offered it right now, seem more desirable than the prospect of a complicated universe filled with diverse beings pursuing interesting goals and having fun?
I don’t have an opinion on that, deliberately. I find wireheading very attractive, and it seems about as nice as the complicated universe, but much easier and more of an elegant solution. The halo effect is way too powerful here, and I don’t wanna screw myself over just because the solution was too pretty for me to spot a fundamental flaw in it.
(Of course, as per the nature of wireheading, even if I thought it were a good idea, I would spend no effort on convincing anyone of it. What for, because I value them? Then what am I wireheading myself for?)