I mean, it’s true as a special case of minimum description length epistemology favoring simpler models. Chapter 18 of the Koller and Friedman book has a section about the Bayesian score for model comparison, which in its large-sample (BIC) approximation has a positive term involving the mutual information between variables and their parents (rewarding “fit”), and a negative term proportional to the number of parameters (penalizing complexity).
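To make the fit-versus-complexity tradeoff concrete, here is a rough sketch of the BIC approximation to that score (maximized log-likelihood minus half of log(sample size) per free parameter). The toy dataset, the variable names `A`, `B`, `C`, and the helper `bic_score` are all made up for illustration; this is not the network from the post and not code from the book.

```python
import math
from collections import Counter

def bic_score(data, parents):
    """Return (BIC score, free-parameter count) for a discrete network structure.

    data: list of dicts mapping variable name -> observed value.
    parents: dict mapping each variable -> tuple of its parent variables.
    """
    m = len(data)
    # Number of distinct values each variable takes in the data.
    card = {v: len({row[v] for row in data}) for v in parents}
    log_lik = 0.0
    n_params = 0
    for var, pa in parents.items():
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        pa_counts = Counter(tuple(row[p] for p in pa) for row in data)
        # Maximum-likelihood fit term: sum of N(pa, x) * log(N(pa, x) / N(pa)).
        for (pa_val, _), n in joint.items():
            log_lik += n * math.log(n / pa_counts[pa_val])
        # Free parameters: (|X| - 1) for each joint configuration of the parents.
        n_params += (card[var] - 1) * math.prod(card[p] for p in pa)
    return log_lik - 0.5 * math.log(m) * n_params, n_params

# Hypothetical toy data: 32 samples in which C is exactly independent of A given B.
data = []
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            n = (3 if b == a else 1) * (3 if c == b else 1)
            data += [{"A": a, "B": b, "C": c}] * n

chain = {"A": (), "B": ("A",), "C": ("B",)}      # A -> B -> C
dense = {"A": (), "B": ("A",), "C": ("A", "B")}  # same, plus a spurious A -> C

chain_score, chain_params = bic_score(data, chain)
dense_score, dense_params = bic_score(data, dense)
print(chain_score, chain_params)
print(dense_score, dense_params)
```

On this data the spurious arrow buys no extra fit, so the two scores differ only by the complexity penalty: two extra parameters at (log 32)/2 each. That is a small gap for a small sample, and the penalty grows only logarithmically with the number of samples.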
What’s less clear to me (Wentworth likely knows more) is how closely that kind of formal model comparison corresponds to my intuitive sense of causality. The first network in this post intuitively feels very wrong—two backwards arrows and one spurious arrow—out of proportion to its complexity burden of one measly extra parameter. (Maybe the disparity scales up with non-tiny examples, though?)
Yup, that’s right.