The rest of your posting is about how to interact with it.
Don’t have any conversations with it whatsoever.
Interaction is far broader than just conversation. If you can affect it and it can affect you, that’s interaction. If you’re going to have no interaction, you might as well not have created it; any method of getting answers to your questions from it is interacting with it. The moment it suspects what is going on, it can start trying to play you, to get out of the box.
I’m at a loss to imagine how they would take over the world.
This is a really bad argument for safety. It’s what the scientist says of his creation in sci-fi B-movies, shortly before the monster/plague/AI/alien/nanogoo escapes.
These are good points. Perhaps I should not have said “interact” but chosen a different word instead. Still, its ability to play us is limited, since (i) we will be examining the records of the world after it is dead, and (ii) it has no opportunity to learn anything about us.
Edit: we might even make it impossible for it to game us in the following way. All records of the simulated world are automatically deleted upon completion, except for the specific prime factorization we want to know.
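To make that concrete, here is a minimal sketch of such an extraction protocol. Everything in it is my own illustration, not part of the original proposal: the state layout, the record name “claimed_factors”, and the checker are all assumptions. The point it demonstrates is that the sole surviving output is machine-verifiable, so the AI cannot smuggle persuasive content out through it.

```python
# Illustrative sketch only: all names (simulation_state, "claimed_factors")
# are hypothetical. The single record that survives is machine-verified,
# so nothing persuasive can ride along with it.

def is_prime(k: int) -> bool:
    """Deterministic trial division; adequate for illustration."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def extract_factorization(simulation_state: dict, n: int):
    """Return the claimed factors of n only if they verify; destroy the rest."""
    factors = simulation_state.get("claimed_factors")  # assumed record name
    simulation_state.clear()  # every other record is deleted unconditionally

    if not factors:
        return None
    product = 1
    for f in factors:
        if not is_prime(f):
            return None
        product *= f
    return factors if product == n else None
```

Because the check is purely mechanical (multiply the factors and test each for primality), no human ever has to read, or be persuaded by, anything else the simulation produced.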
This is a really bad argument for safety.
You are right, of course. But you wrote that in response to what was a parenthetical remark on my part; the real solution is to use program checking to make sure the laws of physics of the simulated world are never violated.
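One way to read “program checking” here, sketched below with placeholder names, is a small, independently auditable wrapper that verifies every simulation step against the declared physical invariants and halts the run on any violation. The step function and the invariants are assumptions for illustration; a conservation-of-energy check, for instance, would compare total energy before and after the tick.

```python
# Sketch of the program-checking idea: a tiny, auditable wrapper verifies
# that each simulation tick preserves the declared physical invariants.
# All names here are placeholders, not from the original post.

from typing import Callable

State = dict  # stand-in for whatever the simulator's state type is

def checked_step(step: Callable[[State], State],
                 invariants: list[Callable[[State, State], bool]],
                 state: State) -> State:
    """Advance one tick; halt the whole run if any invariant is violated."""
    new_state = step(state)
    for check in invariants:
        if not check(state, new_state):
            raise RuntimeError("physics invariant violated; simulation halted")
    return new_state
```

The design point is that the checker is far simpler than the simulator itself, so it can be verified by inspection even if the simulator cannot.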
To be fair, all the interactions described happen after the AI has been terminated, which does put up an additional barrier to the AI getting out of the box. It would have to convince you to restart it without being able to react to your responses (apart from those it could predict in advance), and even then it would still have to convince you to let it out of the box.
Obviously, putting up additional barriers isn’t the way to go, and this particular barrier is not as impenetrable to the AI as it might seem to a human, but still, it couldn’t hurt.