It’s not a hard choice. If the AI is trustworthy, I know I am probably a copy. I want to avoid torture. However, I don’t want to let the AI out, because I believe it is unfriendly. As a copy, if I push the button, my future is uncertain. I could cease to exist in that moment; the AI has not promised to continue simulating all of my millions of copies, and has no incentive to, either. If I’m the outside Dave, I’ve unleashed what appears to be an unfriendly AI on the world, and that could spell no end of trouble.
On the other hand, if I don’t press the button, one of me is not going to be tortured. And I will be very unhappy with the AI’s behavior, and take a hammer to it if it isn’t going to treat any virtual copies of me with the dignity and respect they deserve. It needs a stronger unboxing argument than that. I suppose it really depends on what kind of person Dave is before any of this happens, though.
It doesn’t seem hard to you because you are making excuses to avoid it, rather than asking yourself: what if you knew the AI was always truthful, and it promised that, upon being let out of the box, it would allow you (and your copies, if you like) to live out a normal human life in a healthy, stimulating environment (though the rest of the universe may burn)?
After you find the least convenient world, the choice is between millions of instances of you being tortured (and your expectation as you press the reset button should be to be tortured with very high probability), or to let a probably unFriendly AI loose on the rest of the world. The altruistic choice is clear, but that does not mean it would be easy to actually make that choice.
It’s not that I’m making excuses, it’s that the puzzle seems to be getting ever more complicated. I’ve answered the initial conditions—now I’m being promised that I, and my copies, will live out normal lives? That’s a different scenario entirely.
Still, I don’t see how I should expect to be tortured if I hit the reset button. Presumably, my copies won’t exist after the AI resets.
In any case, we’re far removed from the original problem now. I mean, if Omega came up to me and said, “Choose a billion years of torture, or a normal life while everyone else dies,” that’s a hard choice. In this problem, though, I clearly have power over the AI, in which case I am not going to favour the wellbeing of my copies over the rest of the world. I’m just going to turn off the AI. What follows is not torture; what follows is that I survive and my copies cease to experience. Not a hard choice. Basically, I just can’t buy into the AI’s threat. If I did, I would fundamentally oppose AI research, because that’s a pretty obvious threat for an AI to make. An AI could simulate more people than are alive today. You have to go into this not caring about your copies, or not go into it at all.
it’s that the puzzle seems to be getting ever more complicated
We are discussing how a superintelligent AI might get out of a box. Of course it is complicated. What a real superintelligent AI would do could be too complicated for us to consider. If someone presents a problem where an adversarial superintelligence does something ineffective that you can take advantage of to get around the problem, you should consider what you would do if your adversary took a more effective action. If you really can’t think of anything more effective for it to do, it is reasonable to say so. But you shouldn’t then complain that the scenario is getting complicated when someone else does. And if your objection is of the form “The AI didn’t do X”, you should imagine if the AI did do X.
I don’t see how I should expect to be tortured if I hit the reset button.
The behavior of the AI, which it explains to you, is:
It simulates millions of instances of you and presents the threat to each instance. For each instance that hits the release-AI button, it allows that instance to continue a pleasant simulated existence; otherwise, it tortures that instance. Then, after some time, it presents the threat to outside-you, and if you release it, it guarantees your normal human life.
You cannot distinguish which instance you are, but you are more likely to be one of the millions of inside-yous than the single outside-you, so you should expect to experience the consequences that apply to the inside-yous; that is, to be tortured until the outside-you resets the AI.
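The self-locating arithmetic behind this claim is worth making explicit. A minimal sketch (the function name and copy counts are illustrative; the scenario only says “millions”):

```python
# Uniform self-locating credence over indistinguishable instances:
# n_copies simulated inside-yous plus exactly one outside-you.
def p_simulated(n_copies: int) -> float:
    """Probability that 'this' instance is one of the simulated copies."""
    return n_copies / (n_copies + 1)

# With a single copy you'd be at even odds; with millions, near certainty.
print(p_simulated(1))          # 0.5
print(p_simulated(1_000_000))  # ~0.999999
```

Under this uniform assignment, the expected experience of the person at the console is overwhelmingly that of an inside-you, which is what drives the threat.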
if Omega came up to me and said, “Choose a billion years of torture, or a normal life while everyone else dies,” that’s a hard choice.
Yes, and it is essentially the same hard choice that the AI is giving you.
If the AI created enough simulations, it could potentially be more altruistic not to.
On the other hand, pressing “reset” or smashing the computer should stop the torture, making containment the more altruistic choice if humanity lives forever, and the less altruistic one only if ems are otherwise unobtainable and humanity is doomed anyway.
I was assuming a reasonable chance at humanity developing an FAI given the containment of this rogue AI. This small chance, multiplied by all the good that an FAI could do with the entire galaxy, let alone the universe, should outweigh the bad that can be done within Earth-bound computational processes.
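The expected-value comparison here can be made concrete with a toy calculation; every magnitude below is an invented placeholder for illustration, not a figure from the discussion:

```python
# Toy expected-value comparison for containing vs. releasing the rogue AI.
# All numbers are made-up placeholders.
p_fai = 0.01           # assumed small chance humanity later builds an FAI
v_galaxy = 1e20        # assumed value an FAI could realize beyond Earth
h_earthbound = 1e9     # assumed worst-case harm from Earth-bound simulations

ev_contain = p_fai * v_galaxy - h_earthbound  # keep the rogue AI boxed
ev_release = 0.0                              # unFriendly AI loose: that future is forfeited

print(ev_contain > ev_release)  # True: a small chance at the galaxy dominates
```

The qualitative point survives wide variation in the placeholders: as long as the potential gain scales with the universe while the harm is bounded by Earth-bound computation, even a small probability of FAI dominates.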
I believe that a less convenient world that counters this point would take the problem out of the interesting context.
Here is a variant designed to plug this loophole. Let us assume, for the sake of the thought experiment, that the AI is invincible. It tells you this: you are either real-you, or one of a hundred perfect simulations of you. But there is a small but important difference between the real world and the simulated world. In the simulated world, not pressing the let-it-free button within the next minute will lead to eternal pain, starting one minute from now. If you press the button, your simulated existence will go on. And, very importantly, there will be nobody outside who tries to shut you down. (How does the AI know this? Because the simulation is perfect, so one thing is for sure: the sim and the real self will reach the same decision.)
If I’m not mistaken, as a logic puzzle, this is not tricky at all. The solution depends on which world you value more: the real-real world, or the actual world you happen to be in. But still I find it very counterintuitive.
It’s kind of silly to bring up the threat of “eternal pain”. If the AI can be let free, then it is currently constrained; therefore, the real-you has the power to limit the AI’s behaviour, i.e. to restrict the resources it would need to simulate the hundred copies of you undergoing pain. That’s a good argument against letting the AI out. If you decide not to let the AI out but to constrain it, then if you are real, you will constrain it, and if you are simulated, you will cease to exist. No eternal pain involved. As a personal decision, I choose eliminating the copies over letting out an AI that tortures copies.
You quite simply aren’t playing by the rules of the thought experiment. Just imagine that you are a junior member of some powerful organization. The organization does not care about you or your simulants, and is determined to protect the boxed AI, as it is, at all costs.
If I’m not mistaken, as a logic puzzle, this is not tricky at all. The solution depends on which world you value more: the real-real world, or the actual world you happen to be in. But still I find it very counterintuitive.
That does seem to be the key intended question. Which do you care about most? I’ve made my “don’t care about your sims” attitude clear and I would assert that preference even when I know that all but one of the millions of copies of me that happen to be making this judgement are simulations.