Discussing object-level strategies for the AI-box experiment is kind of missing the point. A superintelligent AI, being smarter than a human, has a higher ceiling on “best strategy it can think of” than any human does. So when a human imagines the best strategy he can come up with, pictures himself facing it, and decides he would win, he wrongly concludes that a boxed AI would stay boxed, since the AI’s actual strategy would be better than anything he imagined. The more object-level strategy gets discussed, the more likely that overconfidence becomes, which I think is the main reason the logs of the experiments stay secret.