Also note that the AlphaZero algorithm is an example of IDA:
The amplification step is when the policy / value neural net is used to guide a search a number of steps into the game tree, which yields a better guess at the best move than the net's direct output.
The distillation step is when the policy / value net is trained to match the output of the game tree exploration process.
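
To make the two steps concrete, here is a minimal toy sketch of the same loop. It is not AlphaZero itself. It assumes a small Nim-style game (take 1 to 3 stones, whoever takes the last stone wins), a lookup table standing in for the policy / value net, and a depth-limited negamax lookahead standing in for the tree search. All names (`amplify`, `distill`, `run_ida`, `best_move`) are hypothetical and chosen for illustration only.

```python
# Toy illustration only: a value table replaces the neural net and a
# depth-limited negamax replaces AlphaZero's tree search (assumptions).
MAX_STONES = 12
MOVES = (1, 2, 3)


def legal_moves(n):
    """Moves available with n stones left."""
    return [m for m in MOVES if m <= n]


def amplify(values, n, depth):
    """Amplification: look ahead `depth` plies, using `values` at the leaves.

    Returns an estimate in [-1, 1] from the point of view of the player to move.
    """
    if n == 0:
        return -1.0  # the opponent took the last stone, so the mover has lost
    if depth == 0:
        return values[n]  # fall back on the current (unamplified) estimate
    return max(-amplify(values, n - m, depth - 1) for m in legal_moves(n))


def distill(values, targets, lr):
    """Distillation: move each table entry toward its amplified target."""
    return [v + lr * (t - v) for v, t in zip(values, targets)]


def run_ida(iterations=20, depth=1, lr=1.0):
    """Alternate amplification and distillation, starting from all-zero values."""
    values = [0.0] * (MAX_STONES + 1)
    for _ in range(iterations):
        targets = [amplify(values, n, depth) for n in range(MAX_STONES + 1)]
        values = distill(values, targets, lr)
    return values


def best_move(values, n, depth=1):
    """Pick the move whose resulting position looks worst for the opponent."""
    return max(legal_moves(n), key=lambda m: -amplify(values, n - m, depth - 1))


if __name__ == "__main__":
    values = run_ida()
    print(values)
    print(best_move(values, 5))
```

Even with a one-ply lookahead, each round of amplification is slightly better informed than the table it starts from, and distillation folds that improvement back into the table, so correct values spread outward from the end of the game over successive iterations. In this toy game the table settles on marking pile sizes that are multiples of 4 as losses for the player to move.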