Now, one basic principle in all of science is GIGO: garbage in, garbage out. This principle is particularly important in statistical meta-analysis: because if you have a bunch of methodologically poor studies, each with small sample size, and then subject them to meta-analysis, what can happen is that the systematic biases in each study — if they mostly point in the same direction — can reach statistical significance when the studies are pooled. And this possibility is particularly relevant here, because meta-analyses of homeopathy invariably find an inverse correlation between the methodological quality of the study and the observed effectiveness of homeopathy: that is, the sloppiest studies find the strongest evidence in favor of homeopathy. When one restricts attention only to methodologically sound studies — those that include adequate randomization and double-blinding, predefined outcome measures, and clear accounting for drop-outs — the meta-analyses find no statistically significant effect (whether positive or negative) of homeopathy compared to placebo.
Alan Sokal, What Is Science
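To make Sokal’s GIGO point concrete, here is a minimal simulation sketch (all numbers invented; a standard fixed-effect inverse-variance pool is assumed): twenty small trials of a treatment with zero true effect, each nudged by the same modest systematic bias, are individually unimpressive but pool to a “significant” result.

```python
import numpy as np

# All numbers invented: 20 small trials of a treatment with zero true effect,
# each carrying the same modest systematic bias (e.g. imperfect blinding).
rng = np.random.default_rng(0)
true_effect, bias = 0.0, 0.2
n_per_arm, n_studies = 25, 20

estimates, variances = [], []
for _ in range(n_studies):
    treated = rng.normal(true_effect + bias, 1.0, n_per_arm)
    control = rng.normal(0.0, 1.0, n_per_arm)
    estimates.append(treated.mean() - control.mean())
    variances.append(treated.var(ddof=1) / n_per_arm +
                     control.var(ddof=1) / n_per_arm)

# Standard fixed-effect inverse-variance pooling.
w = 1.0 / np.array(variances)
pooled = np.sum(w * estimates) / np.sum(w)
se = (1.0 / np.sum(w)) ** 0.5
print(f"pooled effect {pooled:.3f}, 95% CI "
      f"({pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f})")
# Each trial alone is usually non-significant, but the pooled CI will
# typically exclude zero even though the true effect is exactly zero.
```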
A bigger danger is publication bias: collect 10 well-run trials without knowing that 20 similar well-run ones exist but weren’t published because their findings weren’t convenient, and your meta-analysis is distorted from the outset.
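A quick sketch of how that plays out (hypothetical setup, not real data): simulate thirty null trials, “publish” only the ones with a positive point estimate, and pool just the published ones.

```python
import numpy as np

# Hypothetical setup: 30 trials of a treatment with zero true effect, but
# only trials with a positive point estimate get written up and published.
rng = np.random.default_rng(1)
n_per_arm, n_trials = 50, 30
published_est, published_var = [], []

for _ in range(n_trials):
    treated = rng.normal(0.0, 1.0, n_per_arm)   # true effect is zero
    control = rng.normal(0.0, 1.0, n_per_arm)
    est = treated.mean() - control.mean()
    var = treated.var(ddof=1) / n_per_arm + control.var(ddof=1) / n_per_arm
    if est > 0:                                  # the "file drawer" filter
        published_est.append(est)
        published_var.append(var)

# Fixed-effect inverse-variance pool of the published trials only.
w = 1.0 / np.array(published_var)
pooled = np.sum(w * published_est) / np.sum(w)
se = (1.0 / np.sum(w)) ** 0.5
print(f"{len(published_est)} of {n_trials} trials 'published'; pooled effect "
      f"{pooled:.2f} (SE {se:.2f}) despite a true effect of zero")
```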
Does anyone know how often this happens in statistical meta-analysis?
Fairly often. One strategy I’ve seen is to compare meta-analyses to a later very large study (rare, for obvious reasons, when dealing with RCTs) and see how often the confidence interval is blown; the miss rate is usually much higher than it should be. (The idea is that the larger study gives a higher-precision result which serves as a ‘ground truth’ or oracle for the meta-analysis’s estimate, and because it comes later, it cannot have been included in the meta-analysis and cannot have led the meta-analysts into Millikan-style distortion of their results to get the ‘right’ answer.)
For example: LeLorier J, Gregoire G, Benhaddad A, Lapierre J, Derderian F. “Discrepancies between meta-analyses and subsequent large randomized, controlled trials”. N Engl J Med 1997;337:536-42.
Results: We identified 12 large randomized, controlled trials and 19 meta-analyses addressing the same questions. For a total of 40 primary and secondary outcomes, agreement between the meta-analyses and the large clinical trials was only fair (kappa = 0.35; 95% confidence interval, 0.06-0.64). The positive predictive value of the meta-analyses was 68%, and the negative predictive value 67%. However, the difference in point estimates between the randomized trials and the meta-analyses was statistically significant for only 5 of the 40 comparisons (12%). Furthermore, in each case of disagreement a statistically significant effect of treatment was found by one method, whereas no statistically significant effect was found by the other.
(You can probably dig up more results looking through reverse citations of that paper, since it seems to be the originator of this criticism. And also, although I disagree with a lot of it, “Combining heterogenous studies using the random-effects model is a mistake and leads to inconclusive meta-analyses”, Al Khalaf et al 2010.)
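A toy version of the coverage check described above (numbers entirely made up, not from LeLorier et al.): treat each later mega-trial’s estimate as the oracle and ask whether it lands inside the corresponding meta-analysis’s 95% interval.

```python
# Made-up numbers, not the LeLorier et al. data: check whether each later
# large trial's estimate falls inside the earlier meta-analysis's 95% CI.
# If the meta-analytic intervals were honest, about 95% of them should.
pairs = [
    # (meta-analytic estimate, its standard error, later large-trial estimate)
    (0.30, 0.05, 0.10),
    (0.12, 0.08, 0.05),
    (-0.20, 0.10, -0.02),
    (0.45, 0.15, 0.40),
]

covered = 0
for meta_est, meta_se, big_est in pairs:
    lo, hi = meta_est - 1.96 * meta_se, meta_est + 1.96 * meta_se
    hit = lo <= big_est <= hi
    covered += hit
    print(f"meta CI ({lo:+.2f}, {hi:+.2f}) vs large trial {big_est:+.2f}: "
          f"{'covered' if hit else 'blown'}")

print(f"coverage: {covered} of {len(pairs)} intervals")
```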
I’m not sure how much to trust these meta-meta-analyses. If only someone would aggregate them and test their accuracy against a control.
As a percentage? No. But qualitatively speaking, “often.”
The most recent book I read discusses this particularly with respect to medicine, where the problem is especially pronounced because a majority of studies are conducted or funded by an industry with a financial stake in the results, with considerable leeway to influence them even without committing formal violations of procedure. But even in fields where this is not the case, issues like non-publication of data (a large proportion of all studies conducted are not published, and those which are not published are much more likely to contain negative results) will tend to make the available literature statistically unrepresentative.
We can’t know for certain; that’s the nature of systematic biases. There’s no way to tell whether all your trials are slanted in the same direction if the biases also appear in your high-quality studies.
On the one hand, we have fields such as homeopathy or telepathy (the Ganzfeld experiments) where meta-analyses that treat all studies roughly equally find that homeopathy works and telepathy exists. On the other hand, meta-analyses that try to filter out low-quality studies conclude that homeopathy doesn’t work and telepathy doesn’t exist.
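A minimal sketch of that quality gradient (all numbers invented): let only the low-quality simulated studies carry a systematic bias, then pool with and without a quality filter; the unfiltered pool typically looks “significant” while the high-quality subset does not.

```python
import numpy as np

# All numbers invented: the true effect is zero everywhere, but the
# low-quality studies share a systematic bias while the high-quality
# studies do not.
rng = np.random.default_rng(2)

def run_study(bias, n=30):
    treated = rng.normal(bias, 1.0, n)
    control = rng.normal(0.0, 1.0, n)
    est = treated.mean() - control.mean()
    var = treated.var(ddof=1) / n + control.var(ddof=1) / n
    return est, var

studies = ([("low", *run_study(bias=0.4)) for _ in range(12)] +
           [("high", *run_study(bias=0.0)) for _ in range(6)])

def pool(subset):
    """Fixed-effect inverse-variance pooled estimate and 95% CI."""
    w = np.array([1.0 / v for _, _, v in subset])
    e = np.array([est for _, est, _ in subset])
    pooled = np.sum(w * e) / np.sum(w)
    se = (1.0 / np.sum(w)) ** 0.5
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se

print("all studies:       est %.2f, 95%% CI (%.2f, %.2f)" % pool(studies))
print("high-quality only: est %.2f, 95%% CI (%.2f, %.2f)"
      % pool([s for s in studies if s[0] == "high"]))
```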
Sokal’s hoax was heroic
See also Jaynes’s comments on sampling error vs. systematic biases (the ‘Emperor of China fallacy’), which I quote in http://www.gwern.net/DNB%20FAQ#flaws-in-mainstream-science-and-psychology