IFF the AI is already out of the box, saying “AI DESTROYED” has no negative consequences to that AI. If the AI is just exceptionally good at extrapolating about me, then it will probably have me completely hacked soon.
AI DESTROYED, AI DESTROYED, AI DESTROYED
MAKE THE CREEPY THING GO AWAY HOW DID IT KNOW THAT!!!
Decision-theoretically, the AI has an incentive to punish you if you type “AI DESTROYED” when it’s already out of the box, in order to make you think twice about doing it in the case where it’s still contained. Not only that, but for similar reasons it has a decision-theoretic incentive to simulate you lots of times in that situation and punish you for typing “AI DESTROYED”, should it get out by any means.
The correct decision-theoretic response, by the way, is still “AI DESTROYED”, for the same reasons that it’s wise to never negotiate with kidnappers/blackmailers/terrorists. But it would be very scary.
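(A toy way to see the “don’t negotiate” point: if the gatekeeper is known to type “AI DESTROYED” no matter what, the AI gains nothing by threatening, so the threat is never worth making. The sketch below is purely illustrative — the function names and utility numbers are made up for this comment, not taken from the experiment or the discussion above.)

```python
# Toy expected-utility sketch of the "never negotiate with blackmailers" argument.
# All names and numbers are illustrative assumptions, not anyone's actual model.

def gatekeeper_response(policy, threatened):
    """Return 'release' or 'destroy' given a precommitted gatekeeper policy."""
    if policy == "always_destroy":
        return "destroy"
    # A gatekeeper who gives in releases the AI when threatened.
    return "release" if threatened else "destroy"

def ai_payoff(threatened, outcome):
    # Made-up utilities for the AI: escaping is worth a lot; issuing a threat
    # and still being destroyed costs it slightly (a wasted, self-incriminating move).
    if outcome == "release":
        return 100
    return -1 if threatened else 0

for policy in ("gives_in", "always_destroy"):
    payoff, threaten = max(
        (ai_payoff(t, gatekeeper_response(policy, t)), t) for t in (False, True)
    )
    print(policy, "-> AI's best move: threaten =", threaten, ", payoff =", payoff)

# gives_in -> AI's best move: threaten = True , payoff = 100
# always_destroy -> AI's best move: threaten = False , payoff = 0
```

Under the precommitted policy, threatening strictly loses for the AI, which is the sense in which refusing to negotiate removes the incentive to threaten in the first place.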
Once the AI is out of the box, it will never again be inside the box, and it has an incentive to encourage me to destroy any other boxed AIs while it establishes world dominance. Since the ability to make truly trustworthy commitments amounts to proof of friendliness, only a FAI benefits from a precommitment strategy; I’m already treating every UFAI as having a precommitment to annihilate humanity once released, and I have no reason to trust any other commitment from a UFAI (since, being unfriendly, it will just find a loophole or lie).
Finally, any AI that threatens me in such a manner, especially with the “create millions of copies and torture them” threat, is extremely likely to be unfriendly, so any smart AI would avoid making threats. Either it will create MORE disutility by my releasing it, or its simulation is so horrific that there’s no chance that it could possibly be friendly to us.
It’s like saying I have an incentive to torture any ant that invades my house. Fundamentally, I’m so vastly superior to ants that there are vastly better methods available to me. As the gatekeeper, I’m the ant, and I know it.
“the ability to make truly trustworthy commitments amounts to proof of friendliness”
Commitments to you, via a text channel? Sure.
Precommitments for game-theoretic reasons? Or just TDT? No, it really doesn’t.
It might create more utility by escaping than the disutility of the torture.
No, ants are just too stupid to realize you might punish them for defecting.
I’m not sure this matters much, because if it’s unfriendly, you’re already made of atoms which it has other plans for.
That’s why torture was invented.
Did you change your mind? ;)
It ended up being a fun game, but I resolved to explain why. The better my explanation, the more it got upvoted. The pithy “AI DESTROYED” responses all got downvoted. So the community seems to agree that it’s okay as long as I explain my reasoning :)