Whoa, someone actually letting the transcript out. Has that ever been done before?
Yes, but only when the gatekeeper wins. If the AI wins, then they wouldn’t want the transcript to get out, because then their strategy would be less effective next time they played.
I would imagine that if we ever actually build such an AI, we would conduct some AI-box experiments to determine some AI strategies and figure out how to counter them. Humans who become the gatekeeper for the actual AI would be given the transcripts of AI-box experiment sessions to study as part of their gatekeeper training.
Letting out the transcript, then, would be a good thing. It would make the AI player’s job harder, because in the next experiment the human player will be aware of those strategies; but when facing an actual AI, the human will likewise be aware of them, which is exactly the preparation we want.
Doesn’t the same logic apply to the gatekeeper?
Not really. The gatekeeper usually wants to publish when they win, if only to brag. And their strategy isn’t usually a secret; it’s simply to resist.