If so, then unlike chess and Go, there may not be deep strategic insights AlphaStar can uncover to give it the edge.
I think that’s where the central issue lies with games like StarCraft or Dota: their strategy space is perhaps not as rich and complex as we initially expected. That might be a good reason to update towards believing that the real world is less exploitable (i.e. technonormality?) as well. I don’t know.
However, I think it would be a mistake for the AI community to write these games off as “solved” the same way chess and Go are, and to move on to other problem domains. AlphaStar and OpenAI Five required hundreds of years of in-game training time to reach the level of top human professionals, and I don’t think that’s merely an “efficiency” problem.
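To put rough numbers on that: OpenAI reported that OpenAI Five played about 180 years’ worth of games against itself every day, and DeepMind reported that each AlphaStar agent experienced up to 200 years of real-time play. Here’s a quick back-of-envelope comparison against a human career; the training duration and human-hours figures below are my own assumptions, purely for scale:

```python
# Back-of-envelope scale of self-play experience vs. a human career.
# The 180 years/day figure is from OpenAI's announcement post; the
# training duration and human career length are assumptions.

YEARS_PER_DAY = 180   # reported OpenAI Five self-play experience per day
TRAINING_DAYS = 30    # assumed wall-clock training duration

experience_years = YEARS_PER_DAY * TRAINING_DAYS

# Generous estimate of a top professional's lifetime practice.
HUMAN_CAREER_HOURS = 20_000
human_years = HUMAN_CAREER_HOURS / (24 * 365)

print(f"Agent: ~{experience_years:,} years of game time")  # ~5,400
print(f"Human: ~{human_years:.1f} years of game time")     # ~2.3
print(f"Gap:   ~{experience_years / human_years:,.0f}x")   # ~2,400x
```

Even with a generous estimate of human practice time, the gap is over three orders of magnitude.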
Additionally, in both cases implicit domain knowledge was integrated into the training process. In the case of AlphaStar, the AI was first trained on human game data and, as the post mentions, competing agents are subdivided into strategy spaces defined by human experts:
Hundreds of versions of the AI play against each other, and the ones that perform best are selected to play against human players. Each one has its own set of units that it is incentivized to use via reinforcement learning, so that they each play with different strategies.
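To make that concrete, here is a minimal sketch of what such a unit-set incentive could look like. The agent names, unit pools, and bonus weight are all invented for illustration; DeepMind’s actual pseudo-rewards are more involved and not public in this form:

```python
# Toy version of per-agent "unit set" reward shaping in a league.
# Unit pools and the bonus weight are hypothetical placeholders.
UNIT_POOLS = {
    "air_agent":    {"Phoenix", "Void Ray", "Carrier"},
    "ground_agent": {"Zealot", "Stalker", "Immortal"},
    "cannon_rush":  {"Photon Cannon", "Probe"},
}

def shaped_reward(agent_id: str, game_reward: float,
                  units_built: list[str], bonus: float = 0.01) -> float:
    """Add a small pseudo-reward for each unit the agent builds from its
    assigned pool, nudging each league member toward a distinct strategy."""
    pool = UNIT_POOLS[agent_id]
    return game_reward + bonus * sum(u in pool for u in units_built)

# An "air_agent" that won (+1.0) while building two of its pool units:
print(shaped_reward("air_agent", 1.0, ["Phoenix", "Zealot", "Carrier"]))  # 1.02
```

The point is that the league’s diversity is seeded by human judgment about which strategy subspaces are worth exploring, rather than discovered from scratch.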
In the case of OpenAI Five, the AI is still constrained to a small pool of heroes, the item choices are hard-coded by human experts, and it would never have discovered relatively straightforward strategies (defeating Roshan to receive a power-up, if you’re familiar with the game) had the programmers not incentivized them during training. It received the same skepticism in the gaming community, too (in fact, I’d say OpenAI Five’s mechanical advantage was even more blatant than AlphaStar’s).
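On the Roshan point: OpenAI has described Five’s reward as a hand-weighted sum over game events, and that is exactly where this kind of nudge lives. A toy version, with weights I made up (they are not OpenAI’s actual numbers):

```python
# Toy event-weighted reward in the style OpenAI described for Five.
# All weights here are invented for illustration.
EVENT_WEIGHTS = {
    "win":         5.0,
    "hero_kill":   0.6,
    "last_hit":    0.16,
    "roshan_kill": 1.0,  # without an explicit term like this, the long,
                         # risky detour to Roshan rarely pays off during
                         # random exploration, so it is never reinforced
}

def reward(events: dict[str, int]) -> float:
    """Sum the hand-set weight for each game event the team triggered."""
    return sum(EVENT_WEIGHTS.get(name, 0.0) * count
               for name, count in events.items())

print(reward({"last_hit": 20, "roshan_kill": 1}))  # ≈ 4.2
```

The strategy only looks “discovered” because a human first decided it was worth rewarding.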
This is not to belittle the researchers’ achievements; it’s just that I believe these games still provide fantastic testing grounds for future AI research, including paradigms outside deep reinforcement learning. In Dota, for example, one could switch the game mode to Single Draft to force the AI out of the narrow strategy space that may have been optimal in the normal mode.
In fact, I believe (~75% confidence) that the combinatorial space of heroes in a Single Draft Dota game (and the corresponding optimal-strategy space) is so large that, without a paradigm shift at least as significant as the deep learning revolution, RL agents will never beat top professional humans within two orders of magnitude of the compute used by current research projects.
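For a rough sense of just the draft dimension (hero counts vary by patch, so ~120 is approximate, and this ignores item builds, skill orders, and everything that happens after the draft; recall that OpenAI Five restricted itself to fewer than 20 heroes):

```python
from math import comb

HEROES = 120  # approximate full roster; varies by patch

lineups = comb(HEROES, 5)                 # distinct 5-hero teams: ~1.9e8
matchups = lineups * comb(HEROES - 5, 5)  # Radiant lineup x Dire lineup: ~2.9e16

print(f"5-hero lineups: {lineups:.2e}")
print(f"full matchups:  {matchups:.2e}")
```

And Single Draft compounds this: each player picks from three random options, so the agent can’t collapse onto a handful of memorized compositions; it has to generalize across essentially the whole matchup space.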
I’m not as familiar with StarCraft II, but I’m sure there are similarly simple constraints one could put on the game to keep its strategy space rich for AIs as well.