I don’t know anything about Diplomacy and I just watched this video, could someone expand a bit on why this game is a particularly alarming capability gain? The chat logs seemed pretty tame, the bot didn’t even seem to attempt psychological manipulation or gaslighting or anything similar. What important real world capability does Diplomacy translate into that other games don’t? (People for instance don’t seem very alarmed nowadays about AI being vastly superhuman at chess or Go.)
So Diplomacy is not a computationally complex game; it’s a game about out-strategizing your opponents, where roughly all of the strategy is convincing your opponents to work with you. There are no new tactics to invent, and an AI can’t really see deeper into the game than other players; it just has to be more persuasive and make the right decisions about the right people at the right time. You often have to plan your moves several turns ahead so that, on a future turn, someone else will choose to ally with you. The AI didn’t do any specific psychological manipulation; it was just good at being persuasive and strategic in the normal human way. It’s also notable for being able to both play the game and talk with people about the game.
This could translate into something like an AI being good at convincing people to let it out of its box, but I think mostly it’s just being better at multiple skills simultaneously than many people expected.
(Disclaimer: I’ve only played Diplomacy in person before, and never at this high a level)
I don’t think the game is an alarming capability gain at all; I agree with LawrenceC’s comment below. It’s more of a “gain-of-function research” scenario to me. Like, maybe we shouldn’t deliberately try to train a model to be good at this? If you’ve ever played Diplomacy, you know the whole point of the game is manipulating and backstabbing your way to world domination. I think it’s great that the research didn’t actually seem to come up with any scary generalizable techniques or dangerous memetics, but ideally I think we shouldn’t even be trying in the first place.