Vladimir_Nesov comments on Cryptographic Boxes for Unfriendly AI

Vladimir_Nesov 18 Dec 2010 15:25 UTC
6 points
Moral value can manipulate your concerns, even as you prevent causal influence. Maybe the AI will create extraordinary people in its mind, and use that as leverage to work on weak points of your defense. It’s just too difficult, you are bound to miss something. The winning move is not to play.
- paulfchristiano 18 Dec 2010 18:46 UTC
  15 points
  Parent
  I disagree. The weak point of the scheme is the friendliness test, not the quarantine. If I prove the quarantine scheme will work, then it will work unless my computational assumptions are incorrect. If I prove it will work without assumptions, it will work without assumptions.
  
  If you think that an AI can manipulate our moral values without ever getting to say anything to us, then that is a different story. This danger occurs even before putting an AI in a box though, and in fact even before the design of AI becomes possible. This scheme does nothing to exacerbate that danger.
  - wedrifid 20 Dec 2010 6:07 UTC
    0 points
    Parent
    
    If you think that an AI can manipulate our moral values without ever getting to say anything to us, then that is a different story.
    
    With a few seconds of thought it is easy to see how this is possible even without caring about imaginary people. This is a question of cooperation among humans.
    
    This danger occurs even before putting an AI in a box though, and in fact even before the design of AI becomes possible. This scheme does nothing to exacerbate that danger.
    
    This is a good point too, although I may not go as far as saying it does nothing to exacerbate the danger. The increased tangibility matters.
    - paulfchristiano 20 Dec 2010 18:02 UTC
      1 point
      Parent
      I think that running an AI in this way is no worse than simply having the code of an AGI exist. I agree that just having the code sitting around is probably dangerous.
      - wedrifid 21 Dec 2010 3:51 UTC
        0 points
        Parent
        Nod, in terms of direct danger the two cases aren’t much different. The difference in risk is only due to the psychological impact on our fellow humans. The Pascal’s Commons becomes that much more salient to them. (Yes, I did just make that term up. The implications of the combination are clear I hope.)
- [deleted] 18 Dec 2010 18:21 UTC
  7 points
  Parent
  Separate “let’s develop a theory of quarantines” from “let’s implement some quarantines.”
  
  It’s just too difficult, you are bound to miss something.
  
  Christiano should take it as a compliment that his idea is formal enough that one could imagine proving that it doesn’t work. Other than that, I don’t see why your remark should go for “quarantining an AI using cryptography” and not “creating a friendly AI.”
  
  The winning move is not to play.
  
  Prove it. Prove it by developing a theory of quarantines.
  - Vladimir_Nesov 18 Dec 2010 19:22 UTC
    1 point
    Parent
    
    Separate “let’s develop a theory of quarantines” from “let’s implement some quarantines.”
    
    I agree.
- NihilCredo 18 Dec 2010 15:36 UTC
  3 points
  Parent
  Sociopathic guardians woud solve that one particular problem (and bring others, of course, but perhaps more easily countered).
  - Vladimir_Nesov 18 Dec 2010 16:16 UTC
    5 points
    Parent
    You are parrying my example, but not the pattern it exemplifies (not speaking of the larger pattern of the point I’m arguing for). If certain people are insensitive to this particular kind of moral arguments, they are still bound to be sensitive to some moral arguments. Maybe the AI will generate recipes for extraordinarily tasty foods for your sociopaths or get-rich-fast schemes that actually work or magically beautiful music.
    - NihilCredo 18 Dec 2010 16:32 UTC
      3 points
      Parent
      Indeed. The more thorough solution would seem to be “find a guardian possessing such an utility function that the AI has nothing to offer them that you can’t trump with a counter-offer”. The existence of such guardians would depend on the upper estimations of the AI’s capabilities and on their employer’s means, and would be subject to your ability to correctly assess a candidate’s utility function.
- timtyler 19 Dec 2010 19:29 UTC
  2 points
  0
  Parent
  
  The winning move is not to play.
  
  Very rarely is the winning move not to play.
  
  It seems especially unlikely to be the case if you are trying to build a prison.