Defeating Dr. Evil with self-locating belief is a paper relating to this subject.
(It specifically uses the example of creating copies of someone and then threatening to torture all of the copies unless the original co-operates.)
The abstract:
Dr. Evil learns that a duplicate of Dr. Evil has been created. Upon learning this, how seriously should he take the hypothesis that he himself is that duplicate? I answer: very seriously. I defend a principle of indifference for self-locating belief which entails that after Dr. Evil learns that a duplicate has been created, he ought to have exactly the same degree of belief that he is Dr. Evil as that he is the duplicate. More generally, the principle shows that there is a sharp distinction between ordinary skeptical hypotheses, and self-locating skeptical hypotheses.
The conclusion:
Dr. Evil, recall, received a message that Dr. Evil had been duplicated and that the duplicate (“Dup”) would be tortured unless Dup surrendered. INDIFFERENCE entails that Dr. Evil ought to have the same degree of belief that he is Dr. Evil as that he is Dup. I conclude that Dr. Evil ought to surrender to avoid the risk of torture.
I am not entirely comfortable with that conclusion. For if INDIFFERENCE is right, then Dr. Evil could have protected himself against the PDF’s [Philosophy Defence Force’s] plan by (in advance) installing hundreds of brains in vats in his battlestation—each brain in a subjective state matching his own, and each subject to torture if it should ever surrender. (If he had done so, then upon receiving PDF’s message he ought to be confident that he is one of those brains, and hence ought not to surrender.) Of course the PDF could have preempted this protection by creating thousands of such brains in vats, each subject to torture if it failed to surrender at the appropriate time. But Dr. Evil could have created millions...
It makes me uncomfortable to think that the fate of the Earth should depend on this kind of brain race.
We cannot allow a brain-in-a-vat gap!
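To make the arithmetic behind the quoted “brain race” concrete, here is a minimal sketch under INDIFFERENCE. It is my own illustration, not Elga’s; the function name and the copy counts are invented, and the payoff structure (Dr. Evil’s brains tortured if they surrender, Dup and the PDF’s brains tortured if they don’t) is taken from the quoted passages.

```python
# Minimal sketch (not from the paper): probability of being tortured for an
# agent whose evidence is compatible with being the original Dr. Evil, Dup,
# one of Dr. Evil's pre-installed vat brains, or one of the PDF's vat brains.
# Under INDIFFERENCE, each candidate gets equal credence.

def torture_risk(n_evil_brains: int, n_pdf_brains: int, surrender: bool) -> float:
    total = 1 + 1 + n_evil_brains + n_pdf_brains  # Dr. Evil + Dup + all vat brains
    if surrender:
        # Dr. Evil's own brains are tortured if they ever surrender.
        return n_evil_brains / total
    # Dup and the PDF's brains are tortured if they fail to surrender.
    return (1 + n_pdf_brains) / total

print(torture_risk(0, 0, surrender=False))       # 0.5    -- no brain race yet
print(torture_risk(500, 0, surrender=False))     # ~0.002 -- Dr. Evil's brains dominate
print(torture_risk(500, 5000, surrender=False))  # ~0.91  -- the PDF escalates and dominates
```

Whichever side fields more brains dominates the credence, which is exactly the escalation the quoted passage finds uncomfortable.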
And the error (as cited in the “conclusion”) is again the same one behind two-boxing in Newcomb’s problem, responding to threats, and so on. Anthropic confusion is merely icing.
The “Defeating Dr. Evil with self-locating belief” paper hinges on some fairly difficult-to-believe assumptions.
It would take a lot more than just a note to convince me that the brains in the vats are actually seeing what the note says they are seeing, to a degree that is indistinguishable from reality.
In other words, it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.
Isn’t the fact that it is fully capable (based on, say, readings of its processing capabilities, its ability to know the state of your current mind, etc.), together with the fact that it has no reason NOT to do what it says (it’s no skin off its back to torture the subjective “you”s; even if you DON’T let it out, it will do so just on principle), enough?
While it’s understandable to say that, today, you aren’t in some kind of Matrix, because there is no reason for you to believe so, in the situation of the guard you DO know that it can do so, and will, even if you call its “bluff” that the you right now is the original.
I had intended to reply with this very objection. It seems you’ve read my mind, Satori.
Causal decision theory seems to have no problem with this blackmail—if you’re Dr. Evil, don’t surrender, and nothing will happen to you. If you’re Dup, your decision is irrelevant, so it doesn’t matter.
(I don’t endorse that way of thinking, btw)
If we accept the simulation hypothesis, then there are already gzillions of copies of us, being simulated under a wide variety of torture conditions (and other conditions, but torture seems to be the theme here). An extortionist in our world can only create a relatively small number of simulations of us, so few that it is not worth taking them into account. The distribution of simulation types in this world bears no relation to the distribution of simulations we could possibly be in.
If we want to gain information about what sort of simulation we are in, evidence needs to come directly from properties of our universe (stars twinkling in a weird way, messages embedded in π), rather than from properties of simulations nested in our universe.
So I’m safe from the AI … for now.
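A back-of-the-envelope version of this point, with made-up counts (the 10^20 background copies and the 10^3 extortionist copies are arbitrary assumptions, not figures from the thread):

```python
# Illustrative only: if simulated copies of you already vastly outnumber
# whatever an extortionist in our world can afford to run, the extortionist's
# copies make up a negligible fraction of the candidates you might be.

def credence_extortionist_copy(background_copies: float, extortionist_copies: float) -> float:
    return extortionist_copies / (background_copies + extortionist_copies)

print(credence_extortionist_copy(1e20, 1e3))  # ~1e-17: not worth taking into account
```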
That isn’t a strong implication of the simulation hypothesis, but it is of MWI.
The gzillions of other copies of you are not relevant unless they exist in universes exactly like yours from your observational perspective.
That being said, your point is interesting but just gets back to a core problem of the simulation argument itself, which is how you count up the set of probable universes and properly weight them.
I think the correct approach is to project into the future of your multiverse, counting future worldlines that could simulate your current existence weighted by their probability.
So if it’s just one AI in a box and it doesn’t have much computing power, you shouldn’t take it very seriously, but if it looks like this AI is going to win and control the future, then you should take it seriously.
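One way to make this weighting concrete (my own formalization of the suggestion above, with invented futures and numbers):

```python
# Sketch: score each candidate future by (probability it comes about) times
# (number of copies of your current state it would simulate), then normalize.
# The scenarios and figures below are purely illustrative assumptions.

futures = {
    # name: (probability of this future, copies of you it would simulate)
    "AI stays boxed, little compute":  (0.60, 1e2),
    "AI wins and controls the future": (0.30, 1e12),
    "nobody ever simulates you":       (0.10, 0.0),
}

weights = {name: p * copies for name, (p, copies) in futures.items()}
total = sum(weights.values())
for name, w in weights.items():
    print(f"{name}: {w / total:.6f}" if total else f"{name}: n/a")
# The likely-to-win AI carries essentially all of the weight, while the weak
# boxed AI contributes almost nothing -- matching the comment's conclusion.
```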
Excuse me… But, we’re talking about Dr. Evil, who wouldn’t care about anyone being tortured except his own body. Wouldn’t he know that he was in no danger of being tortured and say “to hell with any other copy of me.”???
Right, the argument assumes he doesn’t care about his copies. The problem is that he can’t distinguish himself from his copies. He and the copies both say to themselves, “Am I the original, or a copy?” And there’s no way of knowing, so each of them is subjectively in danger of being tortured.
I got that…
I think it a little too contrived. And, I think that a Dr. Evil would say to hell with it.
How would he know that he’s in no danger of being tortured?
He wouldn’t, any more than you know whether you are in danger of being tortured yourself.
I’m sorry, I don’t understand. First you suggested that he’d know he was in no danger of being tortured, then you say that he wouldn’t?
Pardon… I was not clear.
Dr. Evil would not care to indulge in a philosophical debate about whether he may or may not be a duplicate who was about to be tortured unless he was strapped to a rack and WAS in fact already being tortured. Dr. Evil(s) don’t really consider things like Possible Outcomes of this sort of problem… You’ll have to take my word for it from having worked with and for a Dr. Evil when I was younger. Those sorts of people are arrogant and defiant (and contrary as hell) in the face of all sorts of opposition, and none of the ones I have known took too well to philosophical puzzling of the sort described.
My comment above is meant to say “How do you know that you’re not about to be tortured right now?” and “Dr. Evil would have the same knowledge, and discard any claims that he might be about to be tortured for the same reasons that you don’t feel under threat of torture right now, and for which you would discard a threat of torture at the present moment (imminent threat).” (if you do feel under threat of torture, then I don’t know what to say)
Alright, I fortunately haven’t worked with Dr. Evils, so I’ll defer to your experience.
As for how Dr. Evil might know he was under a threat of torture, it was stated in the paper that he received a message from the Philosophy Defence Force telling him he was. It was also established that the Philosophy Defence Force never lies or gives misleading information. ;)
(I, myself, haven’t received any threats from organizations known to never lie or be misleading.)
I think the same applies, regardless of the PDF’s notification. Just the name alone would make me suspicious of trusting anything that came from them.
Now, if the Empirical Defense Task Force told me that I was about to be tortured (and they had the same described reputation as the PDF)… I’d listen to them.
I agree that Dr. Evil would act in this way. The paper was arguing about what he should do, not about what he would actually do.
I see the issue; while I care about my own behavior, and that of others… I don’t care to base it upon silly examples. And I think this is a silly and contrived situation. Maybe someone should do a sitcom based upon it.
On further consideration… In the first comment, I said that Dr. Evil would not care, which is completely consistent with Dr. Evil not having any idea.