A compilation of studies comparing observational results with randomized experimental results on the same intervention, drawn from medicine/economics/psychology, indicating that a large fraction of the time (although probably not a majority) correlation ≠ causality.
Those are not randomly selected pairs, however. There are 3 major causal patterns: A->B, A<-B, and A<-C->B. Daecaneus is pointing out that for a random pair of correlated variables, we do not assign a uniform prior of 33% to each of these. While it may sound crazy to try to argue for some specific prior like ‘we should assign 1% to the direct causal patterns of A->B and A<-B, and 99% to the confounding pattern of A<-C->B’, this is a lot closer to the truth than thinking that ‘a third of the time, A causes B; a third of the time, B causes A; and the other third of the time, it’s just some confounder’.
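That lopsided prior can be made concrete with a toy simulation. This is my own sketch, not anything from the post: every parameter below (a handful of shared “background” factors, a small chance of a direct causal edge) is an assumption chosen only to illustrate the qualitative point that even modest sharing of common causes makes confounded pairs vastly outnumber causal ones among correlated variables.

```python
# Toy model (hypothetical parameters): a few latent "background" factors
# each influence many observed variables, while direct causation between
# observed variables is rare. Among pairs that end up dependent, count how
# many are causal (A->B or B->A) versus purely confounded (A<-C->B).
import random

random.seed(0)

N_VARS, N_FACTORS = 40, 5   # observed variables, latent common causes
P_LOAD = 0.5                # chance a variable is influenced by a given factor
P_DIRECT = 0.02             # chance of a direct A->B edge (rare by assumption)

# which latent factors influence each observed variable
loads = {v: {f for f in range(N_FACTORS) if random.random() < P_LOAD}
         for v in range(N_VARS)}
# rare direct causal edges between observed variables
edges = {(a, b) for a in range(N_VARS) for b in range(N_VARS)
         if a != b and random.random() < P_DIRECT}

causal = confounded = 0
for a in range(N_VARS):
    for b in range(a + 1, N_VARS):
        if (a, b) in edges or (b, a) in edges:
            causal += 1          # direct causation in one direction
        elif loads[a] & loads[b]:
            confounded += 1      # dependent only via a shared latent factor

print(f"causal pairs: {causal}, confounded pairs: {confounded}")
```

The exact ratio depends entirely on the assumed sparsity, so the printed numbers mean nothing in themselves; the point is only that under any model where common causes are widely shared and direct links are rare, the A<-C->B pattern dominates among correlated pairs.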
For example, only children are nearly twice as likely to be Presbyterian as Baptist in Minnesota; more than half of Episcopalians “usually like school” but only 45% of Lutherans do; 55% of Presbyterians feel that their grades reflect their abilities, as compared to only 47% of Episcopalians; and Episcopalians are more likely to be male whereas Baptists are more likely to be female.
Like, if you randomly assigned Baptist children to be converted to Presbyterianism, it seems unlikely that their school-liking will suddenly jump because they go somewhere else on Sunday, or that siblings will appear & vanish; it also seems unlikely that, if they start liking school (maybe because of a nicer principal), many of those children would spontaneously convert to Presbyterianism. Similarly, it seems rather unlikely that undergoing sexual-reassignment surgery will make Episcopalian men and Baptist women swap places, and it seems even more unlikely that their religious status caused their gender at conception. In all of these 5 cases, we are pretty sure that we can rule out one of the direct patterns, and that it was probably the third, and we could go through the rest of Meehl’s examples. (Indeed, this turns out to be a bad example because we can apply our knowledge that sex must have come many years before any other variable like “has cold hands” or “likes poetry” to rule out one pattern, but even so, we still don’t find any 50%s: it’s usually pretty obviously direct causation from the temporally earlier variable, or confounding, or both.)
So what I am doing in ‘How Often Does Correlation=Causality?’ is testing the claim that “yes, of course it would be absurd to take pairs of arbitrary variables and calculate their causal patterns for prior probabilities, because yeah, it would be low, maybe approaching 0 - but that’s irrelevant because that’s not what you or I are discussing when we discuss things like medicine. We’re discussing the good correlations, for interventions which have been filtered through the scientific process. All of the interventions we are discussing are clearly plausible and do not require time travel machines, usually have mechanisms proposed, have survived sophisticated statistical analysis which often controls for covariates or confounders, are regarded as credible by highly sophisticated credentialed experts like doctors or researchers with centuries of experience, and may even have had quasi-randomized or other kinds of experimental evidence; surely we can repose at least, say, 90% credibility, by the time that some drug or surgery or educational program has gotten that far and we’re reading about it in our favorite newspaper or blog? Being wrong 1 in 10 times would be painful, but it certainly doesn’t justify the sort of corrosive epistemological nihilism you seem to be espousing.”
But unfortunately, it seems that the error rate, after everything we humans can collectively do, is still a lot higher than 1 in 10 before the randomized version gets run. (Which implies that the scientific evidence is not very good in terms of providing enough Bayesian evidence to promote the hypothesis from <1% to >90%, or that it’s <<1% because causality is that rare.)
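The arithmetic in that parenthetical is worth spelling out: moving a hypothesis from a 1% prior to a 90% posterior requires a Bayes factor of roughly 900, i.e. the observed evidence must be nearly 900 times likelier under causation than under confounding, which is a very high bar for observational studies to clear.

```python
# How large a likelihood ratio (Bayes factor) does it take to move
# P(A causes B) from a 1% prior to a 90% posterior?
def required_likelihood_ratio(prior, posterior):
    prior_odds = prior / (1 - prior)
    posterior_odds = posterior / (1 - posterior)
    return posterior_odds / prior_odds

print(round(required_likelihood_ratio(0.01, 0.90)))  # 891
```

Read the other way: if the evidence pipeline supplies much less than a ~891:1 likelihood ratio, or the prior is even lower than 1%, the posterior stays well short of 90%, which is consistent with the observed failure rate of these correlations under randomization.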
Thanks for these references! I’m a big fan, but for some reason your writing sits in the silly under-exploited part of my 2-by-2 box of “how much I enjoy reading this” and “how much of this do I actually read”, so I’d missed all of your posts on this topic! I caught up with some of it, and it’s far further along than my thinking. On a basic level, it matches my intuitive model of a sparse-ish network of causality which generates a much much denser network of correlation on top of it. I too would have guessed that the error rate on “good” studies would be lower!
This seems pretty different from Gwern’s paper selection trying to answer this topic in How Often Does Correlation=Causality?, where he concludes that the error rate remains well above 1 in 10.
Also see his Why Correlation Usually ≠ Causation.
What would be relevant there is “Everything is Correlated”. If you look at, say, Meehl’s examples of correlations from very large datasets, and ask about causality, I think it becomes clearer; the religion/schooling correlations quoted above are among the first of his examples.