Update: with the very newest version of AlphaStar, DeepMind won a series of showmatches against Serral (the 2018 world champion, who plays Zerg), with 4 wins and 1 loss. The resulting algorithm is impressively polished at early and mid-game economy and battles, enough so to take down top players, but my original assessment of it still looks good to me.
In particular, AlphaStar still had serious problems with building placement; moreover, it still showed signs of not having mastered scouting, reactive defense, and late-game strategy (especially for Zerg), and thus failed to respond adequately to a player capable of going that far down the game tree.
The game it lost was the one it played as Terran. While it played better than it had in the summer matches, it still failed to wall off its bases, and more crucially it only built units that would help crush an early Zerg attack, not units that would beat a later Zerg army. Even when it started losing armies to Serral’s powerful late-game units, it continued building the same units until it lost completely. This looks, again, like the AlphaStar Zerg agents never figured out late-game strategies, so the Terran never had to learn to counter them.
AlphaStar played 3 of the 5 games as Protoss, the race it learned most effectively as seen in the summer matches. (I’m pretty sure it was intentional on DeepMind’s part to play most of the games as AlphaStar’s preferred race.) These games were won with fantastic economic production and impeccable unit control (especially with multiple simultaneous Disruptor attacks, which are incredibly difficult for humans to control perfectly in the heat of battle), which overcame noticeable flaws: leaving holes between the buildings so that Zerg units could come in and kill workers, and failing to build the right units against Serral’s army (and thereby losing one army entirely before barely winning with one final push).
It’s hard to learn much from the one game AlphaStar played as Zerg, since there it went for a very polished early attack that narrowly succeeded; it looked to me as if Serral got cocky after seeing the attack coming, and he could have defended easily from that position had he played it safer.
In summary, my claim that DeepMind was throwing in the towel was wrong; they came back with a more polished version that was able to beat the world champion 4 out of 5 times (though 2 of those victories were very narrow, and the loss was not). But the other analyses I made in the post are claims I still stand behind when applied to this version: a major advance for reinforcement learning, but still clearly lacking any real advance in causal reasoning.
In summary, my claim that DeepMind was throwing in the towel was wrong; they came back with a more polished version that was able to beat the world champion 4 out of 5 times
This statement, while technically correct, seems a bit misleading, because being the world champion in StarCraft 2 really doesn’t correlate well with being proficient at playing against AI. Check out this streamer, who played against AlphaStar at the same event, without warm-up or his own setup (just like Serral), and went 7-3. What’s more, he had pretty much figured AlphaStar out by the end, and I’m fairly confident that if he were paid to play another 100 games, his win rate would be 90%+.