If I’m not mistaken, a similar principle helps explain why Random Forests / Extremely Randomized Trees work so well empirically on machine learning tasks (and why they also seem to be fairly robust to numerous irrelevant variables). They aren’t linear models in terms of the original variables, but if each tree is treated as a new variable, then the collection of trees is a linear model of equally weighted predictors.
Maybe. The explanation I’ve seen floated is that the tree methods are exploiting nearest-neighbor effects with adaptive distances; maybe that winds up being about the same thing.
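To make the "each tree is a new variable" reading concrete, here is a toy sketch in plain Python. It is not how any real library builds its forests: the one-split stumps, the helper names (`fit_stump`, `fit_forest`, `tree_features`, `forest_predict`), and the one-dimensional input are all assumptions made for illustration. The only point is that the ensemble prediction is a linear combination of the tree outputs with equal weights 1/T.

```python
import random
from statistics import mean


def fit_stump(xs, ys, rng):
    """Fit a one-split regression tree with a randomly drawn threshold."""
    threshold = rng.uniform(min(xs), max(xs))
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    overall = mean(ys)
    # Fall back to the overall mean if one side of the split is empty.
    left_value = mean(left) if left else overall
    right_value = mean(right) if right else overall
    return lambda x: left_value if x <= threshold else right_value


def fit_forest(xs, ys, n_trees, seed=0):
    """Fit n_trees stumps, each on its own bootstrap resample of the data."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        boot_x = [xs[i] for i in idx]
        boot_y = [ys[i] for i in idx]
        trees.append(fit_stump(boot_x, boot_y, rng))
    return trees


def tree_features(trees, x):
    """Re-express the input x as one derived variable per tree."""
    return [tree(x) for tree in trees]


def forest_predict(trees, x):
    """Linear model over the tree outputs, every weight equal to 1/T."""
    features = tree_features(trees, x)
    weights = [1.0 / len(trees)] * len(trees)
    return sum(w * f for w, f in zip(weights, features))


if __name__ == "__main__":
    xs = [i / 10 for i in range(50)]
    ys = [x * x for x in xs]
    trees = fit_forest(xs, ys, n_trees=25, seed=1)
    print(forest_predict(trees, 2.0))
```

Each tree on its own is a nonlinear, piecewise-constant function of the original variable; the linearity only shows up one level higher, in how the tree outputs are combined.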