I wouldn’t be so sure it’ll work well on chess without modifications. In go, the board is fairly large relative to many important configurations that occur on it. This is a good fit for convolutional NNs, which I believe AlphaGo uses. In chess, the board is tiny. In go, even a crude evaluation of a position is subtle and probably expensive, and much of the time strategic considerations outweigh tactical ones. In chess, a simple material count is really cheap to maintain and tells a substantial fraction of the truth about who’s ahead, and tactics commonly dominate. This makes expensive “leaf” evaluation with an NN much more attractive in go than in chess. In go, the raw branching factor is rather large (hundreds of legal moves) but it’s “easy” (ha!) to prune it down a lot and be reasonably sure you aren’t missing important moves. In chess, the raw branching factor isn’t so large but almost any legal move could turn out to be best. For all these reasons, fairly naive searching + fairly simple evaluation is much more effective in chess, and fancy deep-NN evaluation (and move selection) seems less likely to make a good tradeoff. I’m sure AlphaGo-like chess programs are possible and would play reasonably well, but I’d be surprised if they could avoid just getting outsearched by something like Stockfish on “equal” hardware.
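For concreteness, here's a toy sketch of the "simple material count" mentioned above. The piece values are the conventional pawn-unit ones and the board encoding (a bag of piece letters, uppercase for White) is made up purely for illustration; a real engine would maintain this incrementally as moves are made rather than recomputing it.

```python
# Toy illustration of why a chess "leaf" evaluation can be so cheap:
# a bare material count already tells a substantial fraction of the truth.
# Conventional pawn-unit values (an assumption, not engine-specific).
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material_count(board):
    """board: iterable of piece letters, uppercase = White, lowercase = Black.
    Returns the material balance from White's point of view, in pawns."""
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score

# White has a rook where Black has a knight ("the exchange"): +2 pawns.
print(material_count(["K", "Q", "R", "P", "k", "q", "n", "p"]))  # -> 2
```

Nothing comparably cheap and comparably informative exists for go, which is the asymmetry being pointed at.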
(I am not a very strong player of either game, nor have I implemented a program that plays either game well. I could well be wrong. But I think the above is the conventional wisdom, and it seems plausible to me.)
I do think Go is more tactical / more sensitive to small changes than all that, and also that it’s not as easy as it looks to narrow down the range of plausible moves, especially when there’s nothing tactical going on; so this working in Go makes me optimistic about Chess. I agree that it’s not a slam dunk but I’d certainly like to see it tried. One frustrating thing is that negative results aren’t reported, so there might be lots of things that interestingly didn’t work and they wouldn’t tell us.
Giraffe is the most successful attempt I know of to build an NN-based chess program. It played pretty well but much weaker (on standard PC hardware) than more standard fast-searching rival programs. Of course it’s entirely possible that the DeepMind team could do better. Giraffe’s NN is a fairly simple and shallow one; I forget what AlphaGo’s looks like but I think it’s much bigger. (That doesn’t mean a bigger one would be better for chess; as I indicated above, I suspect the optimal size of a NN for playing chess may be zero.)
https://arxiv.org/pdf/1712.01815.pdf
Shows AlphaZero training for a day, and then beating Stockfish at Chess. Also answers the question of why they hadn’t applied it to Chess, since they have now done that (and to Shogi).
Yup. Of course it’s not playing on “equal” hardware—their TPUs pack a big punch—but for sure this latest work is impressive. (They’re beating Stockfish while searching something like 1000x fewer positions per second.)
Let’s see. AlphaZero was using four TPUs, each of which does about 180 teraflops. Stockfish was using 64 threads, and let’s suppose each of those gets about 2 GIPS. Naively equating flops to CPU operations, that says AZ had ~5000x more processing power (roughly 720 teraflops versus 128 GIPS). My hazy memory is that the last time I looked, which was several years ago, a 2x increase in speed bought you about 50 Elo rating points; I bet there are some diminishing returns, so let’s suppose the figure is 20 points per 2x speed increase at the Stockfish/AZ level. With all these simplistic assumptions, throwing AZ’s processing power at Stockfish by more conventional means might buy you about 240 points of Elo rating, corresponding to a win probability (taking a draw as half a win) of about 0.8. Which is a little better than AZ is reported as getting against Stockfish.
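The back-of-envelope estimate above can be written out in a few lines of Python. Every input here is one of the comment's own assumptions (the TPU and CPU throughputs, and the 20-Elo-per-doubling guess), not a measured figure; the last line is the standard Elo expected-score formula, with draws counted as half a win.

```python
import math

# Assumed throughputs from the comment above.
az_ops = 4 * 180e12          # AlphaZero: 4 TPUs x ~180 teraflops
sf_ops = 64 * 2e9            # Stockfish: 64 threads x ~2 GIPS

speedup = az_ops / sf_ops    # naive flops-vs-instructions ratio, ~5600x

# Guessed ~20 Elo per doubling of speed at this level (diminishing returns).
doublings = math.log2(speedup)
elo_gain = 20 * doublings

# Standard Elo expected-score formula; a draw counts as half a win.
win_prob = 1 / (1 + 10 ** (-elo_gain / 400))

print(round(speedup), round(elo_gain), round(win_prob, 2))  # -> 5625 249 0.81
```

So the exact ratio comes out nearer 5600x and ~250 Elo, i.e. the "~5000x" and "about 240 points" in the text are slightly conservative roundings, which only strengthens the comparison being made.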
Tentative conclusion: AZ is an impressive piece of work, especially as it learns entirely from self-play, but it’s not clear that it’s doing any better against Stockfish than the same amount of hardware would if dedicated to running Stockfish faster; so the latest evidence is still (just about) consistent with the hypothesis that the optimal size of a neural network for computer chess is zero.