So I was very surprised when I learned that a single general method in deep learning (training an artificial neural network on massive amounts of data using gradient descent)[2] led to performance comparable or superior to humans’ in tasks as disparate as image classification, speech synthesis, and playing Go. I found superhuman Go performance particularly surprising—intuitive judgments of Go boards encode distillations of high-level strategic reasoning, and are highly sensitive to small changes in input.
I think it may be important to recognize that AlphaGo (and AlphaZero) use more than deep learning to solve Go. They also use tree search, which is ideally suited to strategic reasoning. Neural networks, on the other hand, are famously bad at symbolic reasoning tasks, which may ultimately have some basis in the fact that probability does not extend logic.
And what about MuZero, which beats AlphaZero and which uses not symbolic search over a simulator of board states but internal search over hidden states and value estimates?
Neural networks, on the other hand, are famously bad at symbolic reasoning tasks, which may ultimately have some basis in the fact that probability does not extend logic.
Considering all the progress on graph and relational networks and inference and theorem-proving and whatnot, this statement is giving a lot of hostages to fortune.
Yep! I addressed this point in footnote [3].
The raw neural network does use search during training, though; it is not the case that search is relied on only during evaluation.