Building on this, you might establish something like “correlations at .9 are more likely to be causal than correlations at .7” and then identify a causal mechanism for this. Alternatively, you might find that “correlations from the field of farkology are more often causal than correlations from spleen medicine”, and find a causal explanation for that.
I would be very surprised if this was not the case. Different fields already use different cutoffs for statistical-significance (you might get away with p<0.05 in psychology, but particle physics likes its five sigmas, and in genomics the cutoff will be hundreds or thousands of times smaller and vary heavily based on what exactly you’re analyzing) and likewise have different expectations for effect sizes (psychology expects large effects, medicine expects medium effects, and genomics expects very small effects; eg for genetic influence on IQ, any claim of an allele with an effect larger than d=0.06 should be greeted with surprise and alarm).
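To put those field-specific cutoffs on a common scale, here’s a quick sketch (my own illustration: I’m using the conventional 5e-8 genome-wide cutoff to stand in for genomics, and treating every threshold as a one-sided tail probability for comparability):

```python
# Quick comparison of field-specific significance cutoffs on a common scale.
# The 5e-8 figure is the usual genome-wide GWAS convention, standing in here
# for "hundreds or thousands of times smaller"; real genomics cutoffs vary.
from scipy.stats import norm

thresholds = {
    "psychology (p < 0.05)":         0.05,
    "particle physics (5 sigma)":    norm.sf(5),  # one-sided tail, ~2.9e-7
    "genomics (genome-wide, ~5e-8)": 5e-8,
}

for field, p in thresholds.items():
    # Express each cutoff as an equivalent one-sided z-score ("sigmas")
    z = norm.isf(p)
    print(f"{field}: p = {p:.1e}, ~{z:.1f} sigma")
```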
Part or all of this explanation might involve the size of the causal network. It could well be that both correlation coefficients and field of study are just proxy variables to describe the size of a network, and that’s the only important factor in the ratio of correlations to causal links, but it might be the case that there is more to it.
I think that there is going to be a relationship, but it’ll be hard to describe precisely. Suppose we correlated A and B and found r=0.9. This is a large correlation by most fields’ standards, and it would seem to put constraints on the causal net that A and B are part of: either there aren’t many nodes ‘in between’ A and B (because each intermediate node is a chance for the correlation to be diluted and lost amid the influence of all the neighboring nodes and their own connections), or the links along the way are themselves powerfully correlated, so that the end-to-end correlation can still be as high as 0.9.
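To make that constraint concrete, here’s a toy simulation (my own sketch, assuming a simple linear-Gaussian chain, which is far tidier than any real causal net); in such a chain the end-to-end correlation is roughly the product of the per-link correlations, so it shrinks geometrically with the number of intermediate nodes.

```python
# Toy sketch: how correlation attenuates along a chain A -> X1 -> ... -> Xk -> B
# when each link adds independent noise. In this linear-Gaussian case the
# end-to-end correlation is ~ link_r ** n_links, so even a short chain dilutes
# a strong per-link correlation well below 0.9.
import numpy as np

rng = np.random.default_rng(0)

def chain_correlation(link_r, n_links, n_samples=100_000):
    """Simulate a standardized linear chain; each node = link_r * parent + noise."""
    a = rng.standard_normal(n_samples)
    x = a
    for _ in range(n_links):
        noise = rng.standard_normal(n_samples)
        x = link_r * x + np.sqrt(1 - link_r**2) * noise  # keeps unit variance
    return np.corrcoef(a, x)[0, 1]

for n_links in (1, 2, 5, 10):
    print(f"{n_links:2d} links: r(A, B) ~ {chain_correlation(0.9, n_links):.2f}")
```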
This could lead to quite a bit of trouble in the academic literature, as any measure of how much evidence a correlation provides for causation would depend on a set of variables describing the context you’re working in, and this could potentially be gamed. In fact, that could be the case even with gwern’s original proposition: claiming you’re working with a small causal net could be enough to lend strong evidence to a causal claim based on a correlation, and it’s only by having someone point out that your causal net is lacking that this evidence can have its weighting adjusted.
To a large extent, this is already the case (see above). People justify results with reference to implicit models and supposed analysis procedures (‘we ran the reported t-test, so we are entitled to declare p<0.05 statistically-significant (never mind all the tweaks we tried and the interim tests while collecting data)’). The existing defaults aren’t usually well-justified: for example, why does psychology use 0.05 rather than 0.10 or 0.01? ‘Surely God loves p=0.06 almost as much as he loves p=0.05’, as one line goes.
The existing defaults aren’t usually well-justified: for example, why does psychology use 0.05 rather than 0.10 or 0.01?
This is a good point, and leads to what might be an interesting use of the experimental approach of linking correlations to causation: gauging whether the heuristics currently in use in a field are at a suitable level/reflect the degree to which correlation is evidence for causation.
If you were to find, for example, that physics is churning out huge sigmas where it doesn’t really need to, or psychology really really needs to up its standards of evidence (not that that in itself would be a surprising result), those could be very interesting results.
Of course, to run these experiments you need large samples of well-researched correlations you can easily and objectively test for causality, from all the fields you’re looking at, which is no small requirement.
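If such a sample ever existed, the analysis itself would be straightforward; something along these lines (the column names and rows below are invented placeholders, not real data):

```python
# Hypothetical sketch of the "gauging" experiment: given a dataset of published
# correlations labeled with whether follow-up work confirmed a causal link,
# estimate the causal base rate by field and by correlation strength.
# All rows below are made-up placeholders for illustration only.
import pandas as pd

df = pd.DataFrame({
    "field":  ["psychology", "psychology", "medicine", "medicine", "genomics"],
    "r":      [0.45, 0.72, 0.30, 0.61, 0.05],
    "causal": [False, True, False, True, False],
})

# Base rate of confirmed causality by field
print(df.groupby("field")["causal"].mean())

# Base rate by correlation-strength bin
df["r_bin"] = pd.cut(df["r"].abs(), bins=[0, 0.3, 0.5, 0.7, 1.0])
print(df.groupby("r_bin", observed=True)["causal"].mean())
```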