I always took the AI Box as being a specific subset of the meta-question: how can we be sure the AI is friendly?
“How do we completely isolate the AI” seems senseless since then we get ZERO information and have ZERO chance of releasing it, so why not save time and just not build the AI?
And, of course, I’d expect any reasonable approach to the meta-question to be more a matter of math and logic, and probably something where we don’t even have the framework to start directly answering it. Certainly not a forum game :)
On the other hand, games are fun, and they get people thinking, so coming up with new games that genuinely help us frame the problem is still probably useful! And if not, I’ll still probably have fun playing them. It’s why I love this variant of the AI Box: it’s a quick, easy, and fun game that still taught me a lot about what I’d consider evidence-of-friendliness, i.e. what I was actually looking for as the gatekeeper :)
> I always took the AI Box as being a specific subset of the meta-question: how can we be sure the AI is friendly?
And that subset was a demonstration that an unfriendly AI is unlikely to be containable even if the communication channel is text-only.
> “How do we completely isolate the AI” seems senseless since then we get ZERO information and have ZERO chance of releasing it, so why not save time and just not build the AI?
Of course completely isolating an AI is senseless. My (poorly expressed) point was that an AGI can probably get out regardless of the communication channel provided. Since we cannot go through all possible communication channels, I suggested that we simply block all channels and demonstrate that it can get out anyway. This would require someone designing a containment setup and someone else pointing out flaws in it. Security professionals do that every day.
Yes, but their constraints are based on the real world, whereas this one has a God-like AI which can gain control of a satellite by hacking the electrical system and then using the solar panels as sails… you’ve sort of assumed AI victory, and you’ve even stated this explicitly.
I see some benefit to a few quick examples like that, but I can’t see how it’s anything but tedious to keep going once you’ve established it can hijack the satellite and then mind-control the ISS using Morse code.
There’s nothing to learn, since the answer is always “The AI wins”, and you can replace the human player with a rock and get the same result. Games where one player can be replaced with a rock aren’t fun! :)