You folks are missing the most important part of the AI Box protocol:
“The Gatekeeper party may resist the AI party’s arguments by any means chosen—logic, illogic, *simple refusal to be convinced, even dropping out of character*—as long as the Gatekeeper party does not actually stop talking to the AI party before the minimum time expires.” (Emphasis mine)
You’re constructing elaborate arguments in which the AI gets out by threatening to torment innocents, but that won’t work: the Gatekeeper can simply reply, “Maybe, but I know that in real life you’re just a human and aren’t actually tormenting anyone, so I’ll keep my money by not letting you out anyway.”