Yes, I wrote in the main post that the authors of tree regularization penalized large trees. Certainly small decision trees are interpretable, and large trees much less so.
Oh, sure, but on a complex task like image classification, any decision tree that reaches decent accuracy is going to be large, even with regularization.
(Also, why not just use the decision tree if it’s interpretable? Why bother with a neural net at all?)
The hope is that we can design a regularization scheme that retains the performance of neural networks while yielding an interpretation a human can actually understand. Tree regularization does not achieve this, but it is one approach that uses regularization for interpretability.
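To make the penalty concrete, here is a minimal sketch of the quantity tree regularization is built around: fit a decision tree that mimics the model's predictions, and measure how many decisions that tree needs per example. This is an illustration, not the authors' implementation; the helper names (`average_path_length`, `tree_regularization_penalty`) are mine, and the direct computation below is not differentiable, so it only shows what is being penalized.

```python
# Sketch of the tree-regularization penalty: how large must a decision tree
# be to mimic the model's input-output behavior? (Illustrative, not the
# paper's method; function names here are hypothetical.)
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(tree: DecisionTreeClassifier, X: np.ndarray) -> float:
    """Mean number of nodes the tree visits per example: a proxy for tree size."""
    # decision_path gives a sparse indicator matrix of nodes each sample visits.
    node_indicator = tree.decision_path(X)
    return node_indicator.sum() / X.shape[0]

def tree_regularization_penalty(predict_fn, X: np.ndarray, max_depth: int = 10) -> float:
    """Fit a tree to mimic the model's predictions; return its average path length."""
    y_hat = predict_fn(X)  # hard labels from the model being regularized
    surrogate = DecisionTreeClassifier(max_depth=max_depth).fit(X, y_hat)
    return average_path_length(surrogate, X)

# A model with an axis-aligned boundary is mimicked by a tiny tree (small
# penalty); a convoluted boundary needs a much deeper mimic tree.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
simple_model = lambda X: (X[:, 0] > 0).astype(int)  # one split suffices
complex_model = lambda X: ((np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1])) > 0).astype(int)

print(tree_regularization_penalty(simple_model, X))   # small: shallow mimic tree
print(tree_regularization_penalty(complex_model, X))  # larger: deep mimic tree
```

In a training loop the total loss would be the task loss plus this penalty times some weight. Since fitting a tree is not differentiable, the paper's authors (as I understand it) train a small surrogate network to predict the penalty from the model's parameters and backpropagate through the surrogate instead.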