From your recent comments, it sounds like you’re trying to talk about approximating causal inference when you don’t have completely reliable information about how the data points for individual variables are sorted into samples. That could make an interesting problem, though this intention was not apparent in your original post. Obviously, if two variables are correlated, the observed correlation will be stronger the better your information about how the data points are sorted into samples, but this is not a d-separation and should not be treated as one. If you have perfect information about how the variables t and T line up, and perfect information about how the variables t and e line up, then pretending that you have no information about how T and e line up doesn’t seem like an operation that makes any sense. If you really have no information about how T and e line up, that will remain true no matter what you condition on. If you want to talk about what changes when you gain information about how T and e line up, that’s an entirely different thing.
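To make that first point concrete, here’s a minimal simulation sketch (Python; the noise levels and sample sizes are made up for illustration): two genuinely correlated variables are measured, and a growing fraction of the sample pairings is scrambled to mimic unreliable information about which data points belong together. The observed correlation decays toward zero even though the underlying relationship never changes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two genuinely correlated variables (true correlation around 0.8).
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

for frac_scrambled in [0.0, 0.25, 0.5, 0.75, 1.0]:
    # Scramble the pairing for a fraction of the samples, simulating
    # unreliable information about which x goes with which y.
    y_obs = y.copy()
    k = int(frac_scrambled * n)
    idx = rng.choice(n, size=k, replace=False)
    y_obs[idx] = y_obs[rng.permutation(idx)]
    r = np.corrcoef(x, y_obs)[0, 1]
    print(f"fraction of pairings scrambled: {frac_scrambled:.2f}  observed r: {r:.2f}")
```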
Yeah, when I wrote the post I hadn’t thought it through enough to put it in clear terms. I hadn’t even realized that there was an open question in here. Bouncing back and forth with commenters helped a lot. (Thank you!)
At this point it’s pretty clear that we can’t treat the associations as variables in the same way that we usually treat variables in a causal net (as I did in the original post). As you say, we can’t treat it as d-separation in the usual way. But we still need some way to integrate this information when trying to learn causal nets. So I guess the interesting question is how. I may write another discussion post with a better example and a clean formulation of the question.
May I suggest some standard books on learning causal structure from data? (Causation, Prediction, and Search, for instance.)
Structure learning is a huge area; lots of the low-hanging (and not-so-low-hanging) fruit has been picked.
The other thing to keep in mind about learning structure from data is that it (often) relies on faithfulness. That is, typically (A d-separated from B given C) in a graph implies (A independent of B given C) in the distribution Markov relative to the graph. But the converse is not necessarily true. If the converse is true, faithfulness holds. Lots of distributions out there are not faithful. That is, it may be by pure chance that the price of beans in China and the traffic patterns in LA are perfectly correlated. This does not allow us to conclude anything causally interesting.
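To see how faithfulness can fail other than by finite-sample luck, here’s a toy linear-Gaussian sketch (the coefficients are contrived on purpose): in the graph A -> B -> C with an extra edge A -> C, the direct effect is chosen to exactly cancel the indirect path. A and C then come out marginally independent even though they are not d-separated, and conditioning on B brings the dependence back.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Linear-Gaussian model on the graph A -> B -> C plus A -> C, with the
# direct effect tuned to cancel the indirect path (an unfaithful parameterization).
a = rng.normal(size=n)
b = 0.5 * a + rng.normal(size=n)
c = 1.0 * b - 0.5 * a + rng.normal(size=n)   # net linear effect of a on c: 0.5*1.0 - 0.5 = 0

def residualize(target, given):
    # Regress target on given (with intercept) and return the residuals.
    slope, intercept = np.polyfit(given, target, 1)
    return target - (slope * given + intercept)

# A and C are NOT d-separated in the graph, yet they come out (nearly) uncorrelated:
print("corr(A, C):            ", round(np.corrcoef(a, c)[0, 1], 3))
# Conditioning on B reveals the hidden dependence:
print("partial corr(A, C | B):",
      round(np.corrcoef(residualize(a, b), residualize(c, b))[0, 1], 3))
```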
Know any other good books on the subject? I’ve had trouble finding good books in the area. I’d especially appreciate something Bayesian. I’ve never even seen anyone do the math for Bayesian structure learning with multivariate normals.
Why do you care if the method is Bayesian or not?
Greg Cooper’s paper is one classic reference on Bayesian methods:
http://www.inf.ufrgs.br/~alvares/CMP259DCBD/Bayes.pdf
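If that’s the Cooper-Herskovits paper, its central result is a closed-form marginal likelihood P(D | G) for discrete networks with Dirichlet priors, which lets you score and compare candidate structures directly. Here’s a minimal toy implementation of that score for binary variables with uniform priors (my own sketch, not code from the paper):

```python
import itertools
import math
import numpy as np

def log_marginal_likelihood(data, parents):
    """Cooper-Herskovits log P(D | G) for binary data with uniform Dirichlet(1)
    priors.  `data` is an (n_samples, n_vars) 0/1 array; `parents` maps each
    variable index to a tuple of parent indices."""
    n, _ = data.shape
    score = 0.0
    for i, pa in parents.items():
        r = 2  # binary variables
        for config in itertools.product([0, 1], repeat=len(pa)):
            rows = np.all(data[:, list(pa)] == config, axis=1) if pa else np.ones(n, bool)
            counts = [np.sum(data[rows, i] == k) for k in range(r)]
            n_ij = sum(counts)
            # (r-1)! / (n_ij + r - 1)!  *  prod_k n_ijk!
            score += math.lgamma(r) - math.lgamma(n_ij + r)
            score += sum(math.lgamma(c + 1) for c in counts)
    return score

# Toy data generated from X -> Y: the structure with the edge should score higher.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=500)
flip = (rng.random(500) < 0.1).astype(int)
y = x ^ flip                     # Y is a noisy copy of X
d = np.column_stack([x, y])

print("score X -> Y :", log_marginal_likelihood(d, {0: (), 1: (0,)}))
print("score X , Y  :", log_marginal_likelihood(d, {0: (), 1: ()}))
```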