1) Their goal was a really good bot—and that hadn’t been done before (apparently). To implement handicaps to begin with would have been… very optimistic.
2) They don’t know for sure what will work until they try it.
3) Expense. (Training takes time and money.)
As noted in e.g. the conversation between Wei Dai and me elsewhere in this thread, it’s quite plausible that people thought beforehand that the current APM limits were fair (DeepMind apparently consulted pro players on them). Maybe AlphaStar needed to actually play a game against a human pro before it became obvious that it could be so overwhelmingly powerful with the current limits.
Well, apparently that’s exactly what happened with TLO and MaNa, and then the DeepMind guys were (at least going by their own accounts) excited about the progress they were making and wanted to share it, since being able to beat human pros at all felt like a major achievement. They could just have tested it in private and continued working on it in secret, but why not give the community a cool event while also letting them know what the current state of the art is?
E.g. some comments from here:
I am an administrator on the SC2 AI Discord, and we’ve been running SC2 bot-vs-bot leagues for many years now. Last season we had over 50 different bots/teams, with prizes exceeding thousands of dollars in value, so we’ve seen what’s possible in the AI space.
I think the comments made in this subreddit, especially with regards to the micro part, left a bit of a sour taste in my mouth, since there seems to be the ubiquitous notion that “a computer can always out-micro an opponent”. That simply isn’t true. We have multiple examples of that in our own bot ladder, with bots achieving 70k APM or higher and still losing to superior decision making. We have a bot that performs god-like Reaper micro, and you can still win against it. And those bots are made by researchers, excellent developers, and people well acquainted with the field. It’s very difficult to code proper micro, since it doesn’t only mean shooting and retreating on cooldown, but also knowing when to engage and disengage, when to group your units, what to focus on, which angle to come from, which retreat options you have, etc. Those decisions are not APM based. In fact, those are challenges that haven’t been solved in the 10 years since the Brood War API came out, and last Thursday marks the first time that an AI got close to achieving that! For that alone the results are an incredible achievement.
And all that aside—even with inhuman APM—the results are astonishing. I agree that the presentation could have been a bit less “sensationalist”, since it created the feeling of “we cracked SC2” and many people got defensive about that (understandably, because it’s far from cracked). However, you should know that the whole show was put together in less than a week and they almost decided on not doing it at all. I for one am very happy that they went through with it.
And the top comment from that thread:
Thank you for saying this. A decent-sized community of hobbyists and researchers has been working on this for YEARS, and the conversation has really never been about whether or not bots can beat humans “fairly”. In the little documentary segment, they show a scene where TLO says (summarized) “This is my off-race, but I’m still a top player. If they’re able to beat me, I’ll be really surprised.”
That isn’t him being pompous; that’s completely reasonable. AI has never even come CLOSE to this level of StarCraft play before. The performance of AlphaStar in game 3 against MaNa left both Artosis AND MaNa basically speechless. It’s incredible that they’ve come this far in such a short amount of time. We’ve literally gone from “Can an AI play SC2 at a high level AT ALL?” to “Can an AI win ‘fairly’?” That’s a non-trivial change in discourse that’s being completely brushed over, IMO.
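To make the quoted point about micro concrete, here is a minimal toy sketch (entirely my own illustration, assuming a made-up simplified combat model rather than anything from the bot ladder or from DeepMind) of the kind of engagement decision being described. The `Group` fields and the crude trade-time estimate are hypothetical; the point is simply that this evaluation is a judgment call, and extra APM does not make it any easier.

```python
# Toy sketch of an engagement decision: the hard part of micro is deciding
# whether and how to fight, not issuing commands quickly. Everything below is
# a made-up, simplified model for illustration only.

from dataclasses import dataclass


@dataclass
class Group:
    dps: float          # combined damage per second of the unit group
    hp: float           # combined hit points of the unit group
    retreat_open: bool  # whether a retreat path is currently available


def should_engage(ours: Group, theirs: Group, margin: float = 1.2) -> bool:
    """Engage if we expect to win the trade comfortably, or if we're cornered."""
    # Crude trade estimate: how long each side needs to destroy the other.
    time_to_kill_them = theirs.hp / max(ours.dps, 1e-9)
    time_to_kill_us = ours.hp / max(theirs.dps, 1e-9)
    winning = time_to_kill_them * margin < time_to_kill_us
    # If we're losing the trade but have no way out, fighting may still be the
    # least bad option; otherwise, disengage.
    return winning or not ours.retreat_open


# Against a clearly stronger group with a retreat path available, the right
# call is to back off -- no amount of APM changes that.
print(should_engage(Group(dps=50, hp=800, retreat_open=True),
                    Group(dps=80, hp=1200, retreat_open=True)))  # False
```

Whether a bot issues 300 or 70,000 actions per minute, it still has to get calls like this right, which is the commenters' point about decision making.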
That excuse is incompatible with the excuse that they thought the APM limit was fair. Either they thought it was a fair test at the time of publicizing, or they got so excited about even passing an unfair test that they wanted to share it, but not both.
I was using “fair” to mean something like “still made for an interesting test of the system’s capabilities”. Under that definition, the explanations seem entirely compatible—they thought that it was an interesting benchmark to try, and also got excited about the results and wanted to share them because they had run a test which showed that the system had passed an interesting benchmark.
It seems like you’re talking about fairness in a way that isn’t responsive to Rekrul’s substantive point, which was about whether the test was unevenly favoring the AI on abilities like extremely high APM, unrelated to what we’d intuitively consider thinking, not about whether the test was generally uninformative.
Oh, I thought it was already plainly obvious to everyone that the victories were in part because of unfair AI advantages. We don’t need to discuss the APM cap for that, that much was already clear from the fact that the version of AlphaStar which had stricter vision limitations lost to MaNa.
That just seems like a relatively uninteresting point to me, since this looks like AlphaStar’s equivalent of the Fan Hui game. That is, it’s obvious that AlphaStar is still below the level of top human pros and wouldn’t yet beat them without unfair advantages, but if the history with AlphaGo is any guide, it’s only a matter of some additional tweaking and throwing more compute at it before it’s at the point where it will beat even the top players while having much stricter limitations in place.
Saying that its victories were *unrelated* to what we’d intuitively consider thinking seems too strong, though. I’m not terribly familiar with SC2, but a lot of the discussion I’ve seen tends towards the view that AlphaStar’s macro was roughly on par with the pro level, and that its superior micro was what ultimately carried the day. E.g. people focus a lot on that one game that MaNa arguably should have won but only lost due to AlphaStar having superhuman micro with several different groups of Stalkers, but I haven’t seen it suggested that all of its victories were attributable to that alone: I didn’t get that kind of a vibe from MaNa’s own post-game analysis of those matches, for instance. Nor from TLO’s equivalent analysis (lost the link, sorry) of his matches, where he IIRC only said something like “maybe they should look at the APM limits a bit [for future matches]”.
So it seems to me that even though its capability at what we’d intuitively consider thinking wouldn’t have been enough for winning all the matches, it would still have been good enough for winning several of the ones where people aren’t pointing to the superhuman micro as the sole reason of the victory.
This was definitely not initially obvious to everyone, and I expect many people still have the impression that the victories were not due to unfair AI advantages. I think you should double crux with Raemon on how many words people can be expected to read.
I mostly agree with this comment. My speculative best guess is that the main reason MaNa did better against the revised version of AlphaStar wasn’t the vision limitations, but rather some combination of:
MaNa had more time to come up with a good strategy and analyze previous games.
MaNa had more time to warm up, and was generally in a better headspace.
The previous version of AlphaStar was unusually good, and the new version was an entirely new system, so the new version regressed to the mean a bit. (On the dimension “can beat human pros”, even though it was superior on the dimension “can beat other AlphaStar strategies”.)