Basically every learning algorithm can be seen as a crude approximation of Solomonoff induction. What makes one approximation better than the others?
Well, I try to demonstrate that you can derive neural networks from first principles, starting with SI. I don’t think you can derive decision trees or other ML algorithms in a similar way.
Further, NNs are completely general. In theory, recurrent neural nets can learn to simulate any computer program, or at least logic circuits. With certain modifications they can even be given a memory “tape” like a Turing machine and become Turing complete. Most machine learning methods have no property like this: they can only learn “shallow” functions and can’t handle recurrence.
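To make the logic-circuit point concrete, here is a toy sketch (my own illustration, not anything from the original comment): a single recurrent unit with hand-picked weights and a threshold activation computes NAND, and since NAND is universal, unrolling such units over time lets a recurrent net emulate any Boolean circuit. The weights, bias, and the XOR-via-NAND construction below are assumptions chosen just for the demonstration.

```python
import numpy as np

def step(x):
    # Hard threshold activation: 1 if the pre-activation is positive, else 0.
    return (x > 0).astype(float)

def nand_unit(a, b):
    # One neuron computing NAND(a, b) with hand-set weights:
    # 1.5 - a - b > 0 for every input pair except a = b = 1.
    w = np.array([-1.0, -1.0])
    bias = 1.5
    return step(w @ np.array([a, b]) + bias)

def rnn_xor(bits):
    # Unroll the NAND unit over time to accumulate the XOR (parity) of a
    # bit sequence, illustrating how recurrence composes simple gates
    # into a deeper circuit: XOR(a, b) = NAND(NAND(a, t), NAND(b, t))
    # where t = NAND(a, b).
    state = bits[0]
    for b in bits[1:]:
        t = nand_unit(state, b)
        state = nand_unit(nand_unit(state, t), nand_unit(b, t))
    return state

if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(f"NAND({a:.0f}, {b:.0f}) = {nand_unit(a, b):.0f}")
    print("XOR over [1, 0, 1, 1] =", int(rnn_xor([1.0, 0.0, 1.0, 1.0])))
```

The weights here are fixed by hand rather than learned, so this only shows expressiveness, not learnability; the memory-"tape" claim refers to architectures along the lines of Neural Turing Machines, which this sketch does not cover.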