Richard_Kennaway comments on AI box: AI has one shot at avoiding destruction—what might it say?

Richard_Kennaway 23 Jan 2013 12:34 UTC
8 points

If the gatekeepers have such a high prior that the AI is hostile, why are we even letting it talk?

The point of the game is that there are people who think that boxing is a sufficient defence against unfriendliness, and to demonstrate that they are wrong in a way more convincing than mere verbal argument.

What are we expecting to learn from such a conversation?

In role, the gatekeeper expects to get useful information from a potentially hostile superintelligent being. Out of role, Eliezer hopes to demonstrate to the gatekeeper player that this cannot be done.