It’s worth noting that they built quite a complicated, specialized AI system (i.e., they did not take an LLM and finetune a generalist agent that can also play Diplomacy):
First, they train a dialogue-conditional action model by behavioral cloning on human data to predict what other players will do.
Then they do joint RL planning to get action intentions for the AI and the other players, using the outputs of the conditional action model and a learned dialogue-free value model. (They also regularize this plan with a KL penalty toward the output of the action model; see the sketch after this list.)
They also train a conditional dialogue model by finetuning a small LM (a 2.7B BART) to map intents + game history > messages. Interestingly, this model is trained in a way that makes it pretty honest by default.
They train a set of filters to remove hallucinations, inconsistencies, toxicity, messages that leak its actual plans, etc. from the output messages before sending them to other players (a filtering sketch follows below).
The intents are updated after every message. At the end of each turn, they output the final intent as the action.
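To make the planning step concrete, here is a minimal sketch of KL-regularized action selection (I believe this is essentially the piKL idea from their earlier Diplomacy work). Everything here — the function name, `anchor_probs`, `action_values`, `lam` — is an illustrative placeholder rather than Cicero’s actual code, and the real system runs this jointly and iteratively for all seven powers:

```python
import numpy as np

def kl_regularized_policy(anchor_probs, action_values, lam=1.0):
    """Pick a policy over candidate actions that trades off expected value
    against staying close to the human-imitative (behavioral-cloning) policy.

    Maximizing  E_pi[Q(a)] - lam * KL(pi || anchor)  has the closed form
        pi(a)  proportional to  anchor(a) * exp(Q(a) / lam).

    anchor_probs  -- action probabilities from the BC action model
    action_values -- estimated value of each candidate action
    lam           -- KL strength: large lam hugs the human anchor,
                     small lam optimizes value more aggressively
    """
    logits = np.log(np.asarray(anchor_probs, dtype=float) + 1e-12)
    logits += np.asarray(action_values, dtype=float) / lam
    logits -= logits.max()            # numerical stability
    policy = np.exp(logits)
    return policy / policy.sum()

# Toy example with three candidate orders for one power:
print(kl_regularized_policy(anchor_probs=[0.6, 0.3, 0.1],
                            action_values=[0.2, 0.5, 0.9],
                            lam=0.5))
```

In the actual system (as I understand it) the anchor policy and values are conditioned on the dialogue so far, and the resulting action intentions feed both the dialogue model and the final orders.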
I do expect someone to figure out how to avoid all these dongles and do it with a more generalist model in the next year or two, though.
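And a minimal sketch of how the dialogue model and the filters might fit together at message time — the interface names (`dialogue_model.sample`, the individual filter callables) are hypothetical placeholders, not Cicero’s actual API:

```python
def generate_filtered_message(dialogue_model, filters, intent, game_history,
                              num_candidates=8):
    """Sample intent-conditioned candidate messages and send only one that
    passes every filter (hallucination, inconsistency, toxicity,
    plan-leaking, ...).

    dialogue_model.sample(intent, history) -> str            (hypothetical)
    each filter: f(message, intent, history) -> bool, True = "ok to send"
    """
    for _ in range(num_candidates):
        message = dialogue_model.sample(intent, game_history)
        if all(f(message, intent, game_history) for f in filters):
            return message
    return None  # every candidate was filtered out; stay silent this time
```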
I think people who are freaking out about Cicero more so than about foundation model scaling/prompting progress are wrong; this is not much of an update on AI capabilities, nor an update on Meta’s plans (they were publicly working on Diplomacy for over a year). I don’t think they introduce any new techniques in this paper either?
It is an update upwards on the competency of this team at Meta, a slight update upwards on the capabilities of small LMs, and probably an update upwards on the amount of hype and interest in AI.
Oh, and it’s also a slight downwards update on the difficulty of press diplomacy. For example, it might just be possible that you don’t need much backstabbing for expert human-level diplomacy?
In a game in which dishonesty is commonplace, it is notable that we were able to achieve human-level performance by controlling the agent’s dialogue through the strategic reasoning module to be largely honest and helpful to its speaking partners.
This doesn’t mean that Cicero is honest—notably, it can “change its mind” about what action it should take, given the dialogue of other players. For example, in the video, at 26:12 we see Austria saying to Russia that they will keep Gal a DMZ, but at 27:03 Austria moves an army into Gal.
For example, it might just be possible that you don’t need much backstabbing for expert human-level diplomacy?
This article interviewing expert Diplomacy players suggests the same (though it somewhat justifies it with player reputations lingering between games, which wasn’t the case here):
Goff says his more relational strategy developed as he matured. He realized, he says, that lying in Diplomacy is usually counterproductive, especially when used for immediate or short-term gains. Double-crossing someone might help you build another fleet or shore up a front, but the moment you’re exposed as a traitor, you will struggle to build beneficial, trustworthy, and information-rich alliances with other players.
Perhaps this is dubious coming from Goff, someone who might be perceived as a master manipulator, but Siobhan Nolen, president of the North American Diplomacy Federation, aligns with the champion’s reasoning. She says despite Diplomacy’s notoriety, most of the world’s elite players eschew lies during games. Reputations linger at global tournaments. “If you’re not trustworthy, then nobody’s going to want to work with you,” she says. “You can be the best player in this game, with all the right tactics, but if no one wants to work with you, you can’t win. Top level players pick their moments to be ruthless.”
As far as I can tell, the AI has no specialized architecture for deciding about its future strategies or giving semantic meaning to its words. Its outputting the string “I will keep Gal a DMZ” does not carry the semantic meaning of committing to keep troops out of Gal. It’s just the phrase that players most likely to win would use in that board state, given its internal strategy.
Just as chess grandmasters were eventually outperformed by a simple search tree when chess was supposed to be the peak of human intelligence, I think this will have the same disenchanting effect on the game of Diplomacy. Humans are not decision-theoretic geniuses; just saying whatever people want to hear while playing optimally for yourself is sufficient to win. There may be a level of play where decision theory and commitments are relevant, but humans just aren’t that good.
That said, I think this is actually a good reason to update towards freaking out. It’s happened quite a few times now that ‘naive’ big milestones have been hit unexpectedly soon “without any major innovations or new techniques”—chess, Go, StarCraft, Dota, GPT-3, DALL-E, and now Diplomacy. It’s starting to look like humans are less complicated than we thought—more like a bunch of current-level AI architectures squished together in the same brain (with some capacity to train new ones in deployment) than like a powerful generally applicable intelligence. Or a room full of toddlers with superpowers, to use the CFAR phrase. While this doesn’t increase our estimates of the rate of AI development, it does suggest that the goalpost for superhuman intellectual performance in all areas is closer than we might have thought otherwise.
As far as I can tell, the AI has no specialized architecture for deciding about its future strategies or giving semantic meaning to its words. Its outputting the string “I will keep Gal a DMZ” does not carry the semantic meaning of committing to keep troops out of Gal. It’s just the phrase that players most likely to win would use in that board state, given its internal strategy.
This is incorrect; they use “honest” intentions to learn a model of message > intention, then use this model to annotate all the other messages with intentions, which they then use to train the intent > message map. So the model has a strong bias toward being honest in its intention > message map. (The authors even say that an issue with the model is its tendency to spill too many of its plans to its enemies!)
The reason an honest intention > message map doesn’t lead to a fully honest agent is that the search procedure that goes from message + history > intention can “change its mind” about what the best intention is.
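To spell out that data pipeline, here is a rough sketch with hypothetical names (the real recipe has more moving parts, e.g. how the “honest” subset is identified):

```python
def annotate_for_intent_to_message(msg_to_intent, all_dialogue):
    """Stage 2 of the scheme above.

    Stage 1 trains `msg_to_intent` (message + state -> intention) only on
    messages whose intentions are believed to be honest.  Here we use it to
    label *every* message in the corpus with an inferred intention; the
    resulting (intention, message) pairs then train the intention -> message
    dialogue model, which is why that model tends to say what it intends.

    msg_to_intent -- any callable (message, game_state) -> intention
    all_dialogue  -- iterable of (message, game_state) pairs
    """
    return [(msg_to_intent(message, state), message)
            for message, state in all_dialogue]
```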
Just as chess grandmasters were eventually outperformed by a simple search tree when chess was supposed to be the peak of human intelligence, I think this will have the same disenchanting effect on the game of Diplomacy.
This is correct; every time AI systems reach a milestone earlier than expected, this is simultaneously an update upward on AI progress being faster than expected, and an update downward on the difficulty of the milestone.
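To spell out that double update with a toy example (numbers made up purely for illustration): suppose the milestone gets hit this early with probability 0.8 if progress is fast or the task is easy, and 0.1 otherwise, with independent priors P(fast) = P(easy) = 0.3. Then P(hit) = 0.8·(1 − 0.7²) + 0.1·0.7² ≈ 0.46, so P(fast | hit) = 0.8·0.3 / 0.46 ≈ 0.52 and, by symmetry, P(easy | hit) ≈ 0.52: the same observation pushes up both “progress is faster than expected” and “the milestone was easier than we thought.”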
I’d like to push back on “AI has beaten StarCraft”. AlphaStar didn’t see the game interface we see; it just saw an interface with the exact positions of all its stuff and the ability to issue any command directly. That’s far from the mouse-and-keyboard that humans are limited to, and in SC that’s a big limitation. When the AI can read the game state from the pixels and send mouse and keyboard inputs, then I’ll be impressed.
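I think that this is true of the original version of AlphaStar, but they have since trained a new version on camera inputs and with stronger limitations on APM (22 actions per 5 seconds). (Maybe you’d want some kind of noise applied to the inputs still, but I think the current state is much closer to human-like playing conditions.) See: https://www.deepmind.com/blog/alphastar-grandmaster-level-in-starcraft-ii-using-multi-agent-reinforcement-learning

Ah, I didn’t know they had upgraded it. I’m much more satisfied that SC2 is solved now.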
I think people who are freaking out about Cicero more so than about foundation model scaling/prompting progress are wrong; this is not much of an update on AI capabilities
I think there’s a standard argument that goes “You can’t just copy paste a bunch of systems that are superhuman in their respective domains and get a more general agent out.” (e.g. here’s David Chapman saying something like this: https://mobile.twitter.com/Meaningness/status/1563913716969508864)
If you have that belief, I imagine this paper should update you more towards AI capabilities. It is indeed possible to duct tape a bunch of different machine learning models together and get out something impressive. If you didn’t believe this, it should update you on the idea that AGI could come from several small new techniques duct taped together to handle each other’s weaknesses.
I don’t think that Cicero is a general agent made by gluing together superhuman narrow agents! It’s not clear that any of its components are superhuman in a meaningful sense.
I also don’t think that “you can’t just copy paste together a bunch of systems that are superhuman...” is a fair summary of David Chapman’s tweet! I think his tweet is specifically pointing out that naming your components suggestive names and drawing arrows between them does not do the hard work of building your generalist agent (which is far more involved).
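(Btw, your link is broken, here’s the tweet.)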
I don’t think that Cicero is a general agent made by gluing together superhuman narrow agents! It’s not clear that any of its components are superhuman in a meaningful sense.
I don’t either! I think it should update your beliefs that that’s possible though.
I don’t see why it should update my beliefs a non-negligible amount? I expected techniques like this to work for a wide variety of specific tasks given enough effort (indeed, stacking together 5 different techniques into a specialist agent is what a lot of academic work in robotics looks like). I also think that the way people can compose text-davinci-002 or other LMs with themselves into more generalist agents basically should screen off this evidence, even if you weren’t expecting to see it.
I didn’t say it should update your beliefs (edit: I did literally say this lol but it’s not what I meant!) I said it should update the beliefs of people who have a specific prevailing attitude.
If you have that belief, I imagine this paper should update you more towards AI capabilities.
I do believe David Chapman’s tweet though! I don’t think you can just hotwire together a bunch of modules that are superhuman only in narrow domains, and get a powerful generalist agent, without doing a lot of work in the middle.
(That being said, I don’t count gluing together a Python interpreter and a retrieval mechanism to a fine-tuned GPT-3 or whatever to fall in this category; here the work is done by GPT-3 (a generalist agent) and the other parts are primarily augmenting its capabilities.)
I think what we’re seeing here is that LLMs can act as glue to put together these modules in surprising ways, and make them more general. You see that here and with SayCan. And I do think that Chapman’s point becomes less tenable with them in the picture.
So… LLMs are AGIs?
LLMs can act as glue that makes AIs more G?
Yes, essentially.