This approach goes back to Hans Reichenbach’s book The Direction of Time. I think the problem is that the set of independencies alone is not sufficient to determine a causal and temporal order. For example, the same independencies between three variables could be interpreted as the chains A→B→C and A←B←C. I think Pearl talks about this issue in the last chapter.
The critical insight is that this is not always the case!
Let’s call two graphs I-equivalent if their set of independencies (implied by d-separation) are identical. A theorem of Bayes Nets say that two graphs are I-equivalent if they have the same skeleton and the same set of immoralities.
This last constraint, plus the constraint that the graph must be acyclic, allows some arrow directions to be identified—namely, across all I-equivalent graphs that are the perfect map of a distribution, some of the edges have identical directions assigned to them.
The IC algorithm (Verma & Pearl, 1990) for finding perfect maps (hence temporal direction) is exactly about exploiting these conditions to orient as many of the edges as possible:
More intuitively, (Verma & Pearl, 1992) and (Meek, 1995) together shows that the following four rules are necessary and sufficient operations to maximally orient the graph according to the I-equivalence (+ acyclicity) constraint:
Anyone interested in further detail should consult Pearl’s Causality Ch 2. Note that for some reason Ch 2 is the only chapter in the book where Pearl talks about Causal Discovery (i.e. inferring time from observational distribution) and the rest of the book is all about Causal Inference (i.e. inferring causal effect from (partially) known causal structure).
Ah yes, the fork asymmetry. I think Pearl believes that correlations reduce to causations, so this is probably why he wouldn’t particularly try to, conversely, reduce causal structure to a set of (in)dependencies. I’m not sure whether the latter reduction is ultimately possible in the universe. Are the correlations present in the universe, e.g. defined via the Albert/Loewer Mentaculus probability distribution, sufficient to recover the familiar causal structure of the universe?
This approach goes back to Hans Reichenbach’s book The Direction of Time. I think the problem is that the set of independencies alone is not sufficient to determine a causal and temporal order. For example, the same independencies between three variables could be interpreted as the chains A→B→C and A←B←C. I think Pearl talks about this issue in the last chapter.
The critical insight is that this is not always the case!
Let’s call two graphs I-equivalent if their set of independencies (implied by d-separation) are identical. A theorem of Bayes Nets say that two graphs are I-equivalent if they have the same skeleton and the same set of immoralities.
This last constraint, plus the constraint that the graph must be acyclic, allows some arrow directions to be identified—namely, across all I-equivalent graphs that are the perfect map of a distribution, some of the edges have identical directions assigned to them.
The IC algorithm (Verma & Pearl, 1990) for finding perfect maps (hence temporal direction) is exactly about exploiting these conditions to orient as many of the edges as possible:
More intuitively, (Verma & Pearl, 1992) and (Meek, 1995) together shows that the following four rules are necessary and sufficient operations to maximally orient the graph according to the I-equivalence (+ acyclicity) constraint:
Anyone interested in further detail should consult Pearl’s Causality Ch 2. Note that for some reason Ch 2 is the only chapter in the book where Pearl talks about Causal Discovery (i.e. inferring time from observational distribution) and the rest of the book is all about Causal Inference (i.e. inferring causal effect from (partially) known causal structure).
Ah yes, the fork asymmetry. I think Pearl believes that correlations reduce to causations, so this is probably why he wouldn’t particularly try to, conversely, reduce causal structure to a set of (in)dependencies. I’m not sure whether the latter reduction is ultimately possible in the universe. Are the correlations present in the universe, e.g. defined via the Albert/Loewer Mentaculus probability distribution, sufficient to recover the familiar causal structure of the universe?