I also looked very briefly into the number of training points. Googling suggests AlexNet used 90 epochs on ImageNet’s 1.3 million training images (about 117 million image presentations in total), while AlphaZero played 44 million games for chess (I didn’t quickly find a number for Go). That suggests the number of images was roughly similar to the number of games, i.e. within an order of magnitude.
So I think the remaining orders of magnitude probably come from the tree search part of MCTS, which causes there to be more than 200 forward passes per game.
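For anyone who wants to check the arithmetic, here is a minimal back-of-the-envelope sketch. It uses only the figures quoted above and treats the "> 200 forward passes per game" figure as a lower bound, so the last ratio is a floor rather than an estimate. The variable names are mine, and the script only counts passes; it assumes nothing about the relative cost of a forward pass in the two networks.

```python
import math

# Figures quoted in the comment above (from a quick Google, not verified here).
ALEXNET_EPOCHS = 90
IMAGENET_TRAIN_IMAGES = 1_300_000
ALPHAZERO_CHESS_GAMES = 44_000_000
# Lower bound only: the comment says "> 200" forward passes per game.
MIN_FORWARD_PASSES_PER_GAME = 200

# Total images shown to AlexNet over training (epochs x dataset size).
alexnet_image_presentations = ALEXNET_EPOCHS * IMAGENET_TRAIN_IMAGES

# Lower bound on AlphaZero forward passes spent generating chess games.
alphazero_min_forward_passes = ALPHAZERO_CHESS_GAMES * MIN_FORWARD_PASSES_PER_GAME

# How close are "images seen" and "games played"?
presentations_per_game = alexnet_image_presentations / ALPHAZERO_CHESS_GAMES

# Floor on the gap that tree search opens up, in orders of magnitude.
min_search_gap = alphazero_min_forward_passes / alexnet_image_presentations
min_search_gap_oom = math.log10(min_search_gap)

print(f"AlexNet image presentations:   {alexnet_image_presentations:.3g}")
print(f"AlphaZero chess games:         {ALPHAZERO_CHESS_GAMES:.3g}")
print(f"Presentations per game:        {presentations_per_game:.2f}")
print(f"Min AlphaZero forward passes:  {alphazero_min_forward_passes:.3g}")
print(f"Min gap from search (log10):   {min_search_gap_oom:.2f}")
```

With these inputs the image and game counts differ by less than a factor of 3, while the 200-passes-per-game floor alone already puts the forward-pass count nearly two orders of magnitude above the AlexNet figure.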