Prompted by Tuxedage learning to win, and various concerns about the current protocol, I have a plan to enable more AI-Box games whilst preserving the logs for public scrutiny.
Conversations with Tuxedage indicate that substantive prior research on a gatekeeper opponent is a key element of an effective escape strategy. Such research seems to me to violate the spirit of the experiment: the AI will know no more about the researcher than they reveal over the terminal.
See this: http://bæta.net/posts/anonymous-ai-box.html
I support this and I hope it becomes a thing.
That’s not quite right. The AI and the researcher may have been interacting on a variety of issues before the AI decided to break out. This is nearly identical to Tuxedage talking to his future opponents on IRC or similar interactive media before they decided to run the experiment.
What I was getting at is that the current setup allows for side-channel methods of getting information on your opponent (digging up their identity, reading their Facebook page, etc.).
While I accept that this interaction could be one of many between the AI and the researcher, it can be simulated in the anonymous case via an ‘I was previously GatekeeperXXX, I’m looking to resume a game with AIYYY’ declaration in the public channel, while still preserving the player’s anonymity.
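If it helps, the ‘resume a game’ declaration could even be made verifiable without deanonymising anyone by a simple commit/reveal: the player publishes a hash commitment alongside the first game’s log and reveals the preimage when asking to resume. A minimal sketch, with a commit/reveal scheme and function names of my own choosing rather than anything from the proposal:

```python
# Sketch of a pseudonymous "resume a game" claim via hash commit/reveal.
# The player keeps `token` private and posts `commitment` with the first game's log;
# revealing `token` later shows they are the same GatekeeperXXX without identifying them.
import hashlib
import secrets

def make_commitment() -> tuple[str, str]:
    """Generate a private token and the public commitment to post with the log."""
    token = secrets.token_hex(16)                             # kept secret by the player
    commitment = hashlib.sha256(token.encode()).hexdigest()   # posted publicly
    return token, commitment

def verify_resume_claim(token: str, commitment: str) -> bool:
    """Check a resume request: does the revealed token match the earlier commitment?"""
    return hashlib.sha256(token.encode()).hexdigest() == commitment

token, commitment = make_commitment()
assert verify_resume_claim(token, commitment)
```

Anyone watching the public channel can check the reveal against the earlier commitment without learning who the player is.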
By the way, wouldn’t Omegle with the common interests specified as AIBOX basically do the trick?
For the basic interaction setup, yes. For a sense of community and for reliable collection of the logs, perhaps not. I’m also not sure how well Omegle keeps users anonymous to each other and to Omegle itself.
You forgot to address Eliezer’s point that “10% of AI box experiments were won even by a human emulating an AI” is more effective against future proponents of deliberately creating boxed AIs than “Careful, the guardian might be persuaded by these 15 arguments we have been able to think of”.
I don’t think the probability of “AIs can find unboxing arguments we didn’t think of” is far enough below 1 for that kind of preparation to matter. If there is any chance of mathematically exhausting the space of such arguments, that research should be conducted by a select circle of people who won’t disclose the critical unboxing arguments until there is a proof of safety.