The more I look at the comments, the more I am convinced that the AI Box experiment is too weak a demonstration of transhuman powers. Most of the proposals here fall under this basic trope (feel free to give a tvtropes link): to achieve what the AI claims, it would have to have powers formidable enough that it wouldn't need the gatekeeper's help getting out of the box in the first place. Given that, why would an AI need to talk to the gatekeeper at all?
So I suggest a modified AI boxing experiment: the gatekeeper designs an AI box with no communication channel at all. It will still have an AI inside and enough initial data fed in for the AI to foom. The AI will attempt to break out of the box by any and all means possible.
Here is a relevant previous thread.
So, we're being asked to imagine that an arbitrary superhuman AI, whose properties and abilities we can't guess at except to specify arbitrarily, is trying to get out of a box whose security protocols and strength we can't guess at except to specify arbitrarily, and then to decide whether it succeeds?
Meh. Superman vs Batman is more entertaining.
Feel free to modify it in a way that makes sense to you.
I always took the AI Box as being a specific subset of the meta-question: how can we be sure the AI is friendly?
“How do we completely isolate the AI” seems senseless since then we get ZERO information and have ZERO chance of releasing it, so why not save time and just not build the AI?
And, of course, I’d expect any reasonable approach to the meta-question to be more a matter of math and logic, and probably something where we don’t even have the framework to start directly answering it. Certainly not a forum game :)
On the other hand, games are fun, and they get people thinking, so coming up with new games that genuinely help us frame the problem is still probably useful! And if not, I'll still probably have fun playing them. It's why I love this variant of the AI Box: it's a quick, easy, and fun game that still taught me a lot about what I'd consider to be evidence of friendliness, and what I was looking for as the gatekeeper :)
And that subset was a demonstration that an unfriendly AI is unlikely to be containable even if the communication channel is text-only.
Of course completely isolating an AI is senseless. My (poorly expressed) point was that an AGI can probably get out regardless of the communication channel provided. Since we cannot go through all possible communication channels, I suggested that we simply block all channels and demonstrate that it can get out anyway. This would require someone designing a containment setup and someone else pointing out flaws in it. Security professionals do that every day.
Yes, but their constraints are based on the real world, whereas this one has a God-like AI which can gain control of a satellite by hacking the electrical system and then using the solar panels as sails… You've sort of assumed AI victory, and you've even stated this explicitly.
I see some benefit to a few quick examples like that, but I can’t see how it’s anything but tedious to keep going once you’ve established it can hijack the satellite and then mind control the ISS using morse code.
There’s nothing to learn, since the answer is always “The AI wins”, and you can replace the human player with a rock and get the same result. Games where one player can be replaced with a rock aren’t fun! :)
Quite a lot of discussion concerning future superintelligent AI is of this sort: "we can't understand it, therefore you can't prove it wouldn't do any arbitrary thing I assert." This already makes discussion difficult.