You want the original ‘AlphaGo Zero’ paper, not the later ‘AlphaZero’ papers, which merely simplify it and reuse it in other domains; the AGZ paper is more informative than the AZ papers. See Figure 6b, and p. 25 for the tree-search details:
Figure 6b shows the performance of each program on an Elo scale. The raw neural network, without using any lookahead, achieved an Elo rating of 3,055. AlphaGo Zero achieved a rating of 5,185, compared to 4,858 for AlphaGo Master, 3,739 for AlphaGo Lee and 3,144 for AlphaGo Fan.
So the raw NN (a single forward pass, selecting the argmax move) sits at ~3,055 Elo, about 100 Elo below AlphaGo Fan, which soundly defeated a human professional (Fan Hui, 5–0). I’m not sure whether a 100-Elo deficit is enough to demote it to ‘amateur’ status, but even in the worst case it is clearly not far from professional strength.
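To put that 100-Elo gap in concrete terms, here is a minimal sketch using the standard logistic Elo model (this formula is the usual Elo convention, not something from the AGZ paper itself):

```python
def elo_win_prob(delta):
    """Expected score for the higher-rated player under the standard
    logistic Elo model, given a rating difference `delta`."""
    return 1.0 / (1.0 + 10 ** (-delta / 400))

# A 100-Elo edge (AlphaGo Fan at 3,144 vs the raw network at 3,055)
# translates to the stronger player scoring roughly 64%:
print(round(elo_win_prob(100), 2))   # ~0.64

# By contrast, full AlphaGo Zero (5,185) vs the raw network (3,055)
# is a ~2,130-Elo gap, i.e. near-certain wins for the searcher:
print(round(elo_win_prob(5185 - 3055), 5))
```

So 100 Elo means AlphaGo Fan would beat the raw network only about 64% of the time: a clear but modest edge, consistent with calling the raw network near-professional.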
EDIT: for a much more thorough and rigorous discussion of how you can exchange training compute for runtime tree search, see Jones 2021 (“Scaling Scaling Laws with Board Games”); this lets you calculate how much you’d have to spend to train a (probably larger) AlphaZero to close that 100-Elo gap, or to try to reach 4,858 Elo with solely a forward pass and no search.
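As a toy version of that kind of calculation: if, in the spirit of Jones 2021, playing strength were roughly linear in the log of training compute, the multiplier needed to close an Elo gap follows directly. The slope below (500 Elo per 10× compute) is an assumption purely for illustration, not a value fitted by Jones:

```python
def train_compute_multiplier(elo_gap, elo_per_decade):
    """Factor by which training compute must grow to gain `elo_gap` Elo,
    assuming Elo is linear in log10(training compute) with slope
    `elo_per_decade` (Elo gained per 10x compute). The slope is an
    illustrative assumption, not a fitted value from Jones 2021."""
    return 10 ** (elo_gap / elo_per_decade)

# Closing a 100-Elo gap at an assumed 500 Elo per decade of compute:
print(round(train_compute_multiplier(100, 500), 2))   # ~1.58x

# Closing the ~1,800-Elo gap from the raw network (3,055) up to
# AlphaGo Master (4,858) under the same assumed slope:
print(round(train_compute_multiplier(4858 - 3055, 500), 1))
```

The real analysis in Jones 2021 fits curves of this general shape from experiments on Hex, which is what lets you price out train-time vs test-time compute properly; this sketch only shows the arithmetic once a slope is in hand.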