This wouldn’t be the first time DeepMind pulled these shenanigans.
My impression of DeepMind is that they like playing up the impressiveness of their achievements to create the impression of having ‘solved’ some issue: never saying anything technically false, while suspiciously leaving out relevant information and failing to run the obvious tests of their models that would reveal a less impressive result.
For AlphaStar they claimed ‘grandmaster’ level, but didn’t publish any of the easily available stats that would have made the claim verifiable. As someone who was in Grandmaster league at the time it was playing (I might even have run into it on ladder; some of my teammates did), its play felt like low Grandmaster at best.
At the event where they showed off an earlier prototype, they had one player (TLO) play his off-race, at which he was certainly not grandmaster level. The pro player (MaNa), playing his main race, beat it at the event in the match where they had it play with the same limited camera access humans have. I don’t remember all the details anymore, but I remember being continuously annoyed by suspicious omission after suspicious omission.
What annoys me most is that this was still a wildly impressive achievement! Just state in the paper: “we managed to reach grandmaster with one out of three factions.” Nobody had ever managed to create an AI that played remotely this well!
Similarly, DeepMind’s no-search chess engine is surely the furthest anyone has gotten without search. Even if it didn’t quite make grandmaster, just say so!
DeepMind’s no-search chess engine is surely the furthest anyone has gotten without search.
This is quite possibly not true! The cutting-edge Lc0 networks (BT3/BT4, T3) have much stronger policy and value than the AlphaZero networks, and the Lc0 team fairly regularly make claims of “grandmaster” policy strength.
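For anyone unfamiliar with the distinction: “no-search” means the engine commits to whatever move the network’s policy head rates highest, with zero lookahead. A minimal sketch of the idea in Python, using the python-chess library; `policy_net` here is a hypothetical stand-in for a trained policy head (Lc0-style or DeepMind-style), not a real API:

```python
import chess

def policy_only_move(board: chess.Board, policy_net) -> chess.Move:
    """Pick a move with zero lookahead: just take the policy head's top choice.

    `policy_net` is assumed (for this sketch) to map a position to a dict
    of {legal move: probability}; it is not a real library call.
    """
    scores = policy_net(board)
    return max(board.legal_moves, key=lambda move: scores[move])

# AlphaZero-style engines wrap this same kind of network in MCTS, evaluating
# thousands of positions per move; the debate here is about how strong the
# raw policy output is once that search is stripped away entirely.
```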
That sounds interesting. Do they have any writeups on this?
They do now! https://lczero.org/blog/2024/02/how-well-do-lc0-networks-compare-to-the-greatest-transformer-network-from-deepmind/
Apparently not a writeup (yet?), but there appears to be a Twitter post from Lc0 with a comparison plot of accuracy on tactics puzzles: https://x.com/LeelaChessZero/status/1757502430495859103?s=20