FFS has a different notion of causality, which is both weaker and stronger than the Bayes-net one in important ways. FFS defines the history of an event X to be the minimal set of factors whose values uniquely specify which element of X occurred. An event Y is then thought of as “after” X (written X≤Y) if the history of X is a subset of the history of Y.
This marks one huge departure from Bayes nets: no information is allowed to be lost. If we have a system where X directly causes Y, but there is some “noise” in X which does not affect the downstream value of Y, then X is not “before” Y: X’s history contains the noise factor, while Y’s history does not, so the subset relation fails.
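As a concrete illustration, here is a brute-force sketch of that history computation on a two-factor toy model. Everything here is my own invention for the example (the `factors` layout, the `determined_by` and `history` helpers, and the definitions of X and Y), not machinery from the FFS formalism itself:

```python
from itertools import product, combinations

# Toy factored set, assuming two binary factors: index 0 = "signal", index 1 = "noise".
factors = [(0, 1), (0, 1)]
points = list(product(*factors))

def determined_by(var, idxs):
    """True if var's value is a function of the factor coordinates in idxs."""
    seen = {}
    for s in points:
        key = tuple(s[i] for i in idxs)
        if key in seen and seen[key] != var(s):
            return False
        seen[key] = var(s)
    return True

def history(var):
    """Smallest set of factors that determines var (brute force over subsets)."""
    for r in range(len(factors) + 1):
        for idxs in combinations(range(len(factors)), r):
            if determined_by(var, idxs):
                return set(idxs)

X = lambda s: s     # X exposes the signal *and* its own noise
Y = lambda s: s[0]  # Y is causally downstream of X but only copies the signal

print(history(X), history(Y))  # {0, 1} {0}
```

Since history(Y) is a strict subset of history(X) rather than the other way round, X is not “before” Y here; if anything Y≤X, despite X being the cause.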
Is this actually different from Bayes nets? Like let’s say you have a probability distribution X→Y→Z, but then you measure each variable with noise, X→^X, Y→^Y, Z→^Z. In that case the measurements won’t satisfy ^X→^Y→^Z, because ^Z will be dependent on ^X even after conditioning on ^Y.
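That claim can be sanity-checked by exact enumeration, under assumed toy parameters (binary variables, symmetric bit-flip channels, and made-up flip rates q and e; the `p_flip` and `p` helpers are mine):

```python
from itertools import product

q, e = 0.1, 0.2  # made-up channel noise and measurement noise

def p_flip(a, b, p):
    """Probability that bit a is observed as bit b under flip probability p."""
    return 1 - p if a == b else p

# Exact joint over the noisy measurements (mX, mY, mZ) of the chain X -> Y -> Z.
joint = {}
for x, y, z, mx, my, mz in product((0, 1), repeat=6):
    pr = (0.5 * p_flip(x, y, q) * p_flip(y, z, q)
          * p_flip(x, mx, e) * p_flip(y, my, e) * p_flip(z, mz, e))
    joint[(mx, my, mz)] = joint.get((mx, my, mz), 0.0) + pr

def p(mx=None, my=None, mz=None):
    """Marginal/joint probability, summing over unspecified measurements."""
    return sum(pr for (a, b, c), pr in joint.items()
               if (mx is None or a == mx) and (my is None or b == my)
               and (mz is None or c == mz))

# If mX -> mY -> mZ held, this screening-off gap would be zero:
gap = (p(0, 0, 0) / p(my=0)
       - (p(mx=0, my=0) / p(my=0)) * (p(my=0, mz=0) / p(my=0)))
print(abs(gap) > 1e-6)  # True: conditioning on mY does not screen off mX from mZ
```

With these rates the gap works out to about 0.037, so the noisy measurements visibly violate the conditional independence the chain would require.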
Yes, there will be some special cases where Bayes nets support the same inference, e.g. if you’ve got only two variables X→Y, but these special cases rapidly fail once you add even slightly more complexity.
I was thinking about causality in terms of forced directional arrows in Bayes nets, rather than in terms of d-separation. I don’t think your example as written is helpful, because Bayes nets rely on conditional independences between variables to do causal inference: X→Y→Z is Markov-equivalent to X←Y←Z, so the chain’s arrows were never identifiable in the first place.
It’s more important to think about cases like the collider X→Y←Z, where the direction of the arrows can be inferred. If we change this to ^X,^Y,^Z by adding measurement noise, then we still get a distribution satisfying ^X→^Y←^Z (as ^X and ^Z are still independent).
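That survival of the collider signature is easy to check exactly, again with a made-up flip rate and an assumed collider mechanism Y = X XOR Z (the `p_flip` and `p` helpers are my own):

```python
from itertools import product

e = 0.2  # made-up measurement-noise level

def p_flip(a, b, p):
    """Probability that bit a is observed as bit b under flip probability p."""
    return 1 - p if a == b else p

# Collider X -> Y <- Z with Y = X XOR Z; each variable measured with noise e.
joint = {}
for x, z, mx, my, mz in product((0, 1), repeat=5):
    y = x ^ z
    pr = 0.25 * p_flip(x, mx, e) * p_flip(y, my, e) * p_flip(z, mz, e)
    joint[(mx, my, mz)] = joint.get((mx, my, mz), 0.0) + pr

def p(mx=None, my=None, mz=None):
    """Marginal/joint probability, summing over unspecified measurements."""
    return sum(pr for (a, b, c), pr in joint.items()
               if (mx is None or a == mx) and (my is None or b == my)
               and (mz is None or c == mz))

# Marginal independence of mX and mZ survives the noise:
gap = p(mx=0, mz=0) - p(mx=0) * p(mz=0)
print(abs(gap) < 1e-12)  # True
# But conditioning on the noisy collider mY makes them dependent,
# which is exactly the signature that orients the arrows toward Y:
gap_cond = (p(0, 0, 0) / p(my=0)
            - (p(mx=0, my=0) / p(my=0)) * (p(my=0, mz=0) / p(my=0)))
print(abs(gap_cond) > 1e-6)  # True
```

So both halves of the collider pattern (marginal independence, dependence given the collider) are preserved under independent measurement noise.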
Even if we did have other nodes forcing the orientation X→Y→Z (such as a node U which is a parent of Y, and another node V which is a parent of Z), I still don’t think adding noise lets us reverse the order.
On the other hand, there are certainly issues in Bayes nets with more nodes, particularly the “diamond-shaped” net with arrows W→X, W→Y, X→Z, Y→Z. Here adding noise does prevent effective temporal inference: once ^X and ^Y are no longer d-separated by ^W, we cannot prove from correlations alone that no information flows between them through ^Z.
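The W→X, W→Y half of the diamond already exhibits the failure, and it can be checked by the same kind of exact enumeration (flip rates are again made up, and I omit the X→Z←Y half since it is not needed for this particular d-separation check):

```python
from itertools import product

q, e = 0.1, 0.2  # made-up edge-noise and measurement-noise levels

def p_flip(a, b, p):
    """Probability that bit a is observed as bit b under flip probability p."""
    return 1 - p if a == b else p

# Fork W -> X, W -> Y; exact joint over the noisy measurements (mW, mX, mY).
joint = {}
for w, x, y, mw, mx, my in product((0, 1), repeat=6):
    pr = (0.5 * p_flip(w, x, q) * p_flip(w, y, q)
          * p_flip(w, mw, e) * p_flip(x, mx, e) * p_flip(y, my, e))
    joint[(mw, mx, my)] = joint.get((mw, mx, my), 0.0) + pr

def p(mw=None, mx=None, my=None):
    """Marginal/joint probability, summing over unspecified measurements."""
    return sum(pr for (a, b, c), pr in joint.items()
               if (mw is None or a == mw) and (mx is None or b == mx)
               and (my is None or c == my))

# W d-separates X from Y, but the noisy mW does not screen off mX from mY:
gap = (p(0, 0, 0) / p(mw=0)
       - (p(mw=0, mx=0) / p(mw=0)) * (p(mw=0, my=0) / p(mw=0)))
print(abs(gap) > 1e-6)  # True
```

Since conditioning on ^W no longer separates ^X from ^Y, correlations alone cannot rule out a direct ^X–^Y path, or one routed through ^Z once the bottom half of the diamond is included.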