Why epidemiology will not correct itself
We’re generally familiar here with the appalling state of medical and dietary research, where most correlations turn out to be bogus. (And if we’re not, I have collected a number of links on the topic in my DNB FAQ; see http://www.gwern.net/DNB%20FAQ#flaws-in-mainstream-science-and-psychology. Probably the best first link to read is Ioannidis’s “Why Most Published Research Findings Are False”.)
I recently found a talk by Young arguing that this problem is worse than one might assume, with false positives in the >80% range, and, more interestingly, explaining why the rate is so high and will remain high for the foreseeable future. Young asserts, pointing to papers and textbooks by epidemiologists, that they are perfectly aware of what the Bonferroni correction does (and why one would use it), and that they choose not to use it because they do not want to risk any false negatives. (Young also conducts some surveys showing little interest in public sharing of data and other good practices, but that seems to me much less important than the statistical tradeoffs.)
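To see how the false positive rate can get that high, here is a minimal simulation (my own sketch, not from the talk; the fraction of real effects and every other parameter are assumptions) of uncorrected testing across many hypotheses, most of them null:

```python
# Uncorrected multiple testing: when most tested hypotheses are null,
# a large share of the p < 0.05 "findings" are false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n = 1000, 50
frac_real, effect = 0.05, 0.5          # assumed: 5% real effects of size 0.5

real = rng.random(n_tests) < frac_real
false_pos = sig = 0
for is_real in real:
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect if is_real else 0.0, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        sig += 1
        false_pos += not is_real
print(f"{sig} significant results, {false_pos / sig:.0%} of them false")
# With only 5% real effects, roughly half the findings are noise; weaker
# effects or a smaller real fraction push that share past 80%.
```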
There are three papers online that seem representative:
Reading them is a little horrifying when one considers the costs of the false positives: all the people trying to stay healthy by following what is only random noise, and the general (and justified!) contempt for science among those aware of the false positive rate. (I enlarge on this vein of thought on Reddit. The recent kerfuffle over whether salt really is bad for you, medical advice that has stressed millions and will cost millions more via New York City’s war on salt, is a reminder of what is at stake.)
The take-away, I think, is to resolutely ignore anything to do with diet & exercise that is not a randomized trial. Correlations may be worth paying attention to in other areas but not in health.
From Reddit:
This is too nihilistic, and not really what experts like Ioannidis are proposing. Better to evaluate the studies (or find sources that evaluate the studies) individually for their sample size and statistical methods, such as whether they control for relevant covariates and apply multiple-hypothesis-testing corrections.
You can download a video of Ioannidis’ Mar ’11 lecture on nutrition from http://videocast.nih.gov/PastEvents.asp?c=144 (it’s big though, 250 MB). Some notes:
Randomized trials have problems too.
For example, they’ll often inflate the effects by contrasting the most extreme groups (upper vs lower 20%).
Or just basic biases, like the winner’s curse (large effects tend to come from studies with small sample sizes; you can see this by plotting the log of the treatment effect against the log of the total sample size in the Cochrane database) or publication bias (leading to missing data). A simulation sketch of the winner’s curse follows these notes.
Odds ratios in randomized trials also decrease over time.
Generally, Ioannidis wants massive testing via biobanks (sample sizes in the millions), longitudinal measurements, and large-scale global collaborations. These do not necessarily mean only randomized trials; in fact, randomized trials are pretty much impossible for that kind of data set. Epidemiology can work too, it just needs to be done well.
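As promised in the notes above, here is a rough simulation of the winner’s curse (my own sketch, not from Ioannidis’s lecture; the true effect and sample sizes are arbitrary assumptions): when the true effect is fixed, the small studies that reach significance are precisely the ones that overestimated it.

```python
# Winner's curse sketch: fix a true effect, run many studies at several
# sample sizes, and keep only the "significant" (publishable) ones.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect = 0.2                      # assumed true standardized effect
for n in (20, 80, 320, 1280):          # assumed per-group sample sizes
    estimates = []
    for _ in range(2000):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < 0.05:   # survives "publication"
            estimates.append(b.mean() - a.mean())
    print(f"n={n:5d}: mean significant estimate = {np.mean(estimates):.2f}")
# The significant estimates are badly inflated at n=20 and converge on the
# true 0.2 as n grows: log-effect vs log-sample-size slopes downward.
```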
It would be nice to have what Ioannidis suggests, but what do we do in the decades before those suggestions happen, if they ever do? Throwing out the correlations seems like the best idea to me: 20% of randomized trials having issues is a win in a way that 80% of results with serious issues is not.
Certainly not all correlations are useless. This feels like breaking some analogue of Godwin’s law, but consider the association between cigarette smoke and some types of cancer. Generally, discounting correlations and treating them with more skepticism seem like good ideas, but “throwing out” seems needlessly harsh, unless for some reason you are in a hurry, in which case you should think about deferring to more expert sources anyway.
For example, this useful source http://www.informationisbeautiful.net/play/snake-oil-supplements/ (see the spreadsheet at the link) uses mostly randomized trials but also includes some studies which discuss prospective associations. I don’t think the organizers should be criticized for including the correlations.
It seems like everyone wants to bring up tobacco as the justification for such irresponsibility: it paid off once, so we should keep doing it… See my reply to http://news.ycombinator.com/item?id=2870962 (since they brought up tobacco before you did).
Recently it was announced that some organization (I thought it was the SIAI, but I can’t find it on their blog) would work to form a panel to examine and disambiguate the state of knowledge in a number of different areas, the first being diet, nutrition, and exercise. It seems imperative that they take this into consideration. What was this organization, and do we have any way of knowing whether they will or not?
Are you referring to the Persistent Problems Group?
My own opinion of that proposal (I’m not sure whether I said this elsewhere) is that the Group’s job is already being done, and better, by the likes of the Cochrane Collaboration. There is no comparative advantage there.
That was my thought as well, although if this group were formed I’d be extremely interested in how they worked and what their findings were. I’d imagine Bayesian methods would be the norm, which might give them a leg up.
It would be particularly interesting if they consistently disagreed with mainstream systematic reviews.
Yes, thanks.
Thank you for your writings. This is exactly what this site needs more of: Applied rationality.
Recently I attended a talk by some genetic-epidemiology students who applied Bonferroni corrections based solely on their supervisor’s advice. The whole lot of them had done it, independently. It’s a conservative method, and not always the best approach. I reckon some subfields of epidemiology are more liable to methodological failings than others.
I don’t follow. You mean, why does reducing false positives increase false negatives? Because Bonferroni doesn’t pull any new data from anywhere; it just moves you along a tradeoff.
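Concretely, here is a toy version of that tradeoff (the parameters are my assumptions, not from this exchange): score the same simulated p-values at the nominal 0.05 threshold and at the Bonferroni threshold, and watch false positives fall while false negatives rise.

```python
# Bonferroni moves you along the false-positive/false-negative tradeoff:
# same data, same p-values, only the significance threshold changes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_tests, n = 200, 50
real = rng.random(n_tests) < 0.10      # assume 10% of hypotheses are real
pvals = np.array([
    stats.ttest_ind(rng.normal(0.0, 1.0, n),
                    rng.normal(0.5 if is_real else 0.0, 1.0, n)).pvalue
    for is_real in real
])

for label, alpha in (("uncorrected", 0.05), ("Bonferroni", 0.05 / n_tests)):
    sig = pvals < alpha
    fp = np.sum(sig & ~real)           # false positives (null but significant)
    fn = np.sum(~sig & real)           # false negatives (real but missed)
    print(f"{label:11s}: {fp} false positives, {fn} false negatives")
# Bonferroni drives false positives to ~0 at the cost of missing most of
# the real effects; no new information is created, only the cutoff moved.
```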