Yup. Of course it’s not playing on “equal” hardware—their TPUs pack a big punch—but for sure this latest work is impressive. (They’re beating Stockfish while searching something like 1000x fewer positions per second.)
Let’s see. AlphaZero was using four TPUs, each of which does about 180 teraflops, so roughly 720 teraflops in total. Stockfish was using 64 threads, and let’s suppose each of those gets about 2 GIPS, for about 128 GIPS. Naively equating flops to CPU operations, that says AZ had ~5000x more processing power, or a bit over 12 doublings. My hazy memory is that the last time I looked, which was several years ago, a 2x increase in speed bought you about 50 Elo rating points; I bet there are some diminishing returns, so let’s suppose the figure is 20 points per 2x speed increase at the Stockfish/AZ level. With all these simplistic assumptions, throwing AZ’s processing power at Stockfish by more conventional means might buy you about 240 Elo rating points, corresponding to a win probability (taking a draw as half a win) of about 0.8. That is a little better than AZ is reported as getting against Stockfish.
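For anyone who wants to poke at the numbers, here is a minimal Python sketch of the back-of-envelope estimate above. Every constant is one of the rough guesses from this comment (180 teraflops per TPU, 2 GIPS per thread, 20 Elo per doubling), not a measurement, and the function names are made up for illustration. The last step uses the standard logistic Elo formula for expected score.

```python
import math

# All figures are the rough assumptions from the comment, not measurements.
TPU_COUNT = 4
TPU_FLOPS = 180e12         # ~180 teraflops per TPU
SF_THREADS = 64
SF_OPS_PER_THREAD = 2e9    # assumed ~2 GIPS per Stockfish thread
ELO_PER_DOUBLING = 20      # guessed Elo gain per 2x speed at this level


def hardware_ratio():
    """Naive ratio of AZ flops to Stockfish CPU operations per second."""
    return (TPU_COUNT * TPU_FLOPS) / (SF_THREADS * SF_OPS_PER_THREAD)


def elo_gain(ratio, per_doubling=ELO_PER_DOUBLING):
    """Elo gained from a speed-up of `ratio`, at a fixed gain per doubling."""
    return per_doubling * math.log2(ratio)


def expected_score(elo_diff):
    """Standard Elo expected score, counting a draw as half a win."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400))


ratio = hardware_ratio()
gain = elo_gain(ratio)
print(f"hardware ratio: ~{ratio:.0f}x")
print(f"Elo gain:       ~{gain:.0f}")
print(f"expected score: ~{expected_score(gain):.2f}")
```

Without the intermediate rounding the ratio and the Elo gain come out slightly above the ~5000x and ~240 quoted above, but the expected score still lands at about 0.8. Changing `ELO_PER_DOUBLING` shows how sensitive the conclusion is to the diminishing-returns guess.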
Tentative conclusion: AZ is an impressive piece of work, especially as it learns entirely from self-play, but it’s not clear that it’s doing any better against Stockfish than the same amount of hardware would if dedicated to running Stockfish faster; so the latest evidence is still (just about) consistent with the hypothesis that the optimal size of a neural network for computer chess is zero.