The sample game is great, featuring the player written about here. If you are familiar with Diplomacy or otherwise want more color, I recommend watching the video.
I think this is wrong—I don’t think that’s Andrew Goff in that video.
The AI is thus heavily optimized for exactly the world in which it succeeded …
Hmm, it seems more likely to me that the main reason they opted for Blitz Diplomacy was neither that humans benefit more than the AI from extra time nor that it prevents humans from identifying the AI, but that the dialogue model, like many chat bots (I might’ve said all, but then ChatGPT arrived), couldn’t keep up a coherent conversation for longer than that. I’m not very confident in this, though, and I do think the other factors matter a bit too, but maybe not as much.
The strategic engine, as I evaluated it based on a sample game with six bots and a human, seemed to me to be mediocre at tactics and lousy at strategy.
This seems wrong to me. Bakhtin (2021) achieved superhuman performance in 2-player No-Press Diplomacy. Also, in another video, the guy in the video you link calls an earlier model (playing 7-player Gunboat Diplomacy) “exceptionally strong tactically”; I see no reason why CICERO should be much worse (Gunboat Diplomacy isn’t that different from Blitz, I think).
Disclaimer: I have never played Diplomacy.