>First crucial point which this post is missing: the first (intuitively wrong) net reconstructed represents the probabilities using 9 parameters (i.e. the nine rows of the various truth tables), whereas the second (intuitively right) represents the probabilities using 8. That means the second model uses fewer bits; the distribution is more compressed by the model. So the “true” network is favored even before we get into interventions. > >Implication of this for causal epistemics: we have two models which make the same predictions on-distribution, and only make different predictions under interventions. Yet, even without actually observing any interventions, we do have reason to epistemically favor one model over the other.
For people interested in learning more about this idea: This is described in Section 2.3 of Pearl’s book Causality. The beginning of Ch. 2 also contains some information about the history of this idea. There’s also a more accessible post by Yudkowsky that has popularized these ideas on LW, though it contains some inaccuracies, such as explicitly equating causal graphs and Bayes nets.
>First crucial point which this post is missing: the first (intuitively wrong) net reconstructed represents the probabilities using 9 parameters (i.e. the nine rows of the various truth tables), whereas the second (intuitively right) represents the probabilities using 8. That means the second model uses fewer bits; the distribution is more compressed by the model. So the “true” network is favored even before we get into interventions.
>
>Implication of this for causal epistemics: we have two models which make the same predictions on-distribution, and only make different predictions under interventions. Yet, even without actually observing any interventions, we do have reason to epistemically favor one model over the other.
For people interested in learning more about this idea: This is described in Section 2.3 of Pearl’s book Causality. The beginning of Ch. 2 also contains some information about the history of this idea. There’s also a more accessible post by Yudkowsky that has popularized these ideas on LW, though it contains some inaccuracies, such as explicitly equating causal graphs and Bayes nets.