Even if you have a nail, not all hammers are the same
(Related to Over-encapsulation and Subtext is not invariant under linear transformation)
Between 2004 and 2007, Goran Bjelakovic et al. published 3 famous meta-analyses of vitamin supplements, concluding that vitamins don’t help people but instead kill people. This is now the accepted dogma; and if you ask your doctor about vitamins, she’s likely to tell you not to take them, based on reading either one of these articles, or one of the many summaries of these articles made in secondary sources like The Mayo Clinic Journal.
The 2007 study claims that beta-carotene and vitamins A and E are positively correlated with death—the more you take, the more likely you are to die. Therefore, vitamins kill. The conclusion on E requires a little explanation, but the data on beta-carotene and A is simple and specific:
Univariate meta-regression analyses revealed significant influences of dose of beta carotene (Relative Risk (RR), 1.004; 95% CI, 1.001-1.007; P = .012), dose of vitamin A (RR, 1.000006; 95% CI, 1.000002-1.000009; P = .003), … on mortality.
This appears to mean that, for each mg of beta carotene that you take, your risk of death increases by a factor (RR) of 1.004; for each IU of vitamin A that you take, by a factor of 1.000006. “95% CI, 1.001-1.007” means that the standard deviation of the sample indicates a 95% probability that the true RR lies somewhere between 1.001 and 1.007. “P = .012” means that there’s only a 1.2% chance that you would be so unlucky as to get a sample giving that result, if in fact the true RR were 1.
A risk factor of 1.000006 doesn’t sound like much; but I’m taking 2,500 IU of vitamin A per day. That gives a 1.5% increase in my chance of death! (Per 3.3 years.) And look at those P-values: .012, .003!
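The 1.5% figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the multiplicative reading (each IU multiplies risk by the reported RR); the variable names are mine, not the paper's.

```python
# Reported relative risk per IU of vitamin A, and the daily dose quoted above.
rr_per_iu = 1.000006
dose_iu = 2500

# Assumption: each IU multiplies risk by rr_per_iu, so the doses compound.
rr_total = rr_per_iu ** dose_iu
percent_increase = (rr_total - 1) * 100

print(f"RR at {dose_iu} IU: {rr_total:.4f} (+{percent_increase:.2f}%)")
```

Under that assumption the combined RR comes out at roughly 1.015, which matches the 1.5% quoted in the text.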
So why do I still take vitamins?
What all of these articles do, in excruciating detail with regard to sample selection (though not so much with regard to the math), is to run a linear regression on a lot of data from studies of patients taking vitamins. A linear regression takes a set of data where each datapoint looks like this:
Y = a_1 X_1 + c
and a multiple linear regression takes a set of data where each datapoint usually looks like this:
Y = a_1 X_1 + a_2 X_2 + … + a_n X_n + c
where Y and all the X_i’s are known. In this case, Y is a 1 for someone who died and a 0 for someone who didn’t, and each X_i is the amount of some vitamin taken. In either case, the regression finds the values for a_1, …, a_n, c that best fit the data (meaning they minimize the sum, over all data points, of the squared error of the value predicted for Y, (Y - (a_1 X_1 + a_2 X_2 + … + a_n X_n + c))^2).
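To make the least-squares criterion concrete, here is a minimal single-variable sketch. The dose and outcome values are invented purely for illustration, and the helper names `fit_line` and `sse` are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sse(xs, ys, slope, intercept):
    """Sum over all data points of the squared prediction error."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up data: dose taken (X) and outcome (Y: 1 = died, 0 = survived).
doses = [0, 0, 10, 10, 20, 20, 30, 30]
died = [0, 0, 0, 1, 0, 1, 1, 1]

a1, c = fit_line(doses, died)
best = sse(doses, died, a1, c)
print(f"a1 = {a1:.3f}, c = {c:.3f}, squared error = {best:.3f}")
```

Nudging either coefficient away from the fitted values can only raise the squared error, which is all that "best fit" means here.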
Scientists love linear regression. It’s simple, fast, and mathematically pure. There are lots of tools available to perform it for you. It’s a powerful hammer in a scientist’s toolbox.
But not everything is a nail. And even for a nail, not every hammer is the right hammer. You shouldn’t use linear regression just because it’s the “default regression analysis”. When a paper says they performed “a regression”, beware.
A linear analysis assumes that if 10 milligrams is good for you, then 100 milligrams is ten times as good for you, and 1000 milligrams is one-hundred times as good for you.
This is not how vitamins work. Vitamin A is toxic in doses over 15,000 IU/day, and vitamin E is toxic in doses over 400 IU/day (Miller et al. 2004, Meta-Analysis: High-Dosage Vitamin E Supplementation May Increase All-Cause Mortality; Berson et al. 1993, Randomized trial of vitamin A and vitamin E supplementation for retinitis pigmentosa.). The RDA for vitamin A is 2500 IU/day for adults. Good dosage levels for vitamin A appear to be under 10,000 IU/day, and for E, less than 300 IU/day. (Sadly, studies rarely discriminate in their conclusions between dosage levels for men and women. Doing so would give more useful results, but make it harder to reach the coveted P < .05 or P < .01.)
Quoting from the 2007 JAMA article:
The dose and regimen of the antioxidant supplements were: beta carotene 1.2 to 50.0 mg (mean, 17.8 mg) , vitamin A 1333 to 200 000 IU (mean, 20 219 IU), vitamin C 60 to 2000 mg (mean, 488 mg), vitamin E 10 to 5000 IU (mean, 569 IU), and selenium 20 to 200 μg (mean 99 μg) daily or on alternate days for 28 days to 12 years (mean 2.7 years).
The mean values used in the study of both A and E are in ranges known to be toxic. The maximum values used were ten times the known toxic levels, and about 20 times the beneficial levels.
17.8 mg of beta-carotene translates to about 30,000 IUs of vitamin A, if it were converted to vitamin A. This is also a toxic value. It is surprising that beta-carotene showed toxicity, though, since common wisdom is that beta-carotene is converted to vitamin A only as needed.
Vitamins, like any medicine, have an inverted-J-shaped response curve. If you graph their health effects, with dosage on the horizontal axis, and some measure of their effects (say, change to average lifespan) on the vertical axis, you would get an upside-down J. (If you graph the death rate on the vertical axis, as in this study, you would get a rightside-up J.) That is, taking a moderate amount has some good effect; taking a huge amount has a large bad effect.
If you then try to draw a straight line through the J that best-matches the J, you get a line showing detrimental effects increasing gradually with dosage. The results are exactly what we expect. Their conclusion, that “Treatment with beta carotene, vitamin A, and vitamin E may increase mortality,” is technically correct. Treatment with anything may increase mortality, if you take ten times the toxic dose.
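The effect is easy to reproduce on a toy curve. The sketch below uses an invented J-shaped death-rate function (not data from the study) and fits a straight line across the whole dose range, including the toxic end.

```python
def j_curve(dose):
    """Hypothetical relative death rate at a dose given in thousands of IU."""
    return 1.0 - 0.02 * dose + 0.001 * dose ** 2

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

doses = list(range(0, 55, 5))          # 0, 5, ..., 50 (thousands of IU)
rates = [j_curve(d) for d in doses]
slope, intercept = fit_line(doses, rates)

print(f"fitted slope per thousand IU: {slope:.3f}")
print(f"death rate at 0: {j_curve(0):.2f}, at 10: {j_curve(10):.2f}")
```

The fitted slope is positive ("more vitamin, more death") even though, on this made-up curve, a moderate dose lowers the death rate relative to taking nothing.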
For a headache, some people take 4 200mg tablets of aspirin. 10 tablets of aspirin might be toxic. If you made a study averaging in people who took from 1 to 100 tablets of aspirin for a headache, you would find that “aspirin increases mortality”.
(JAMA later published 4 letters criticizing the 2007 article. None of them mentioned the use of linear regression as a problem. They didn’t publish my letter—perhaps because I didn’t write it until nearly 2 months after the article was published.)
Anyone reading the study should have been alerted to this by the fact that all of the water-soluble vitamins in the study showed no harmful effects, while all of the fat-soluble vitamins “showed” harmful effects. Fat-soluble vitamins are stored in the fat, so they build up to toxic levels when people take too much for a long time.
A better methodology would have been to use piecewise (or “hockey-stick”) regression, which assumes the data is broken into 2 sections (typically one sloping downwards and one sloping upwards), and tries to find the right breakpoint, and perform a separate linear regression on each side of the break that meets at the break. (I almost called this “The case of the missing hockey-stick”, but thought that would give the answer away.)
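As a rough sketch of what a hockey-stick fit involves, the code below tries each candidate breakpoint k, fits y = c + a*x + b*max(0, x - k) by least squares (so the two segments meet at k), and keeps the breakpoint with the lowest squared error. The data are synthetic and noiseless, and the function names are mine; a real analysis would use a statistics package and weight studies properly.

```python
def least_squares(rows, ys):
    """Solve the normal equations (R^T R) beta = R^T y by Gaussian elimination."""
    n = len(rows[0])
    # Augmented matrix [R^T R | R^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (M[i][n] - tail) / M[i][i]
    return beta

def fit_hockey_stick(xs, ys):
    """Try each interior x as breakpoint k; return (error, k, [c, a, b])."""
    best = None
    for k in sorted(set(xs))[1:-1]:
        rows = [[1.0, x, max(0.0, x - k)] for x in xs]
        beta = least_squares(rows, ys)
        err = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
                  for r, y in zip(rows, ys))
        if best is None or err < best[0]:
            best = (err, k, beta)
    return best

# Synthetic data: slight benefit up to dose 10, then steadily rising harm.
doses = list(range(0, 52, 2))
rates = [1.0 - 0.01 * d + 0.05 * max(0, d - 10) for d in doses]

err, k, (c, a, b) = fit_hockey_stick(doses, rates)
print(f"breakpoint: {k}, slope below: {a:.3f}, slope above: {a + b:.3f}")
```

On this toy data the search recovers the breakpoint and reports a negative slope below it and a positive slope above it, which is exactly the structure a single straight line cannot express.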
Would these articles have been accepted by the most-respected journals in medicine if they evaluated a pharmaceutical in the same way? I doubt it; or else we wouldn’t have any pharmaceuticals. Bias against vitamins? You be the judge.
Meaningful results have meaningful interpretations
The paper states the mortality risk in terms of “relative risk” (RR). But relative risk is used for studies of 0⁄1 conditions, like smoking/no smoking, not for studies that use regression on different dosage levels. How do you interpret the RR value for different dosages? Is it RR × dosage? Or RR^dosage (each unit multiplies risk by RR)? The difference between these interpretations is trivial for standard dosages. But can you say you understand the paper if you can’t interpret the results?
To answer this question, you have to ask exactly what type of regression the authors used. Even if a linear non-piecewise regression were correct, the best regression analysis to use in this case would be a logistic regression, which estimates the probability of a binary outcome conditioned on the regression variables. The authors didn’t consider it necessary to report what type of regression analysis they performed; they reported only the computer program (STATA) and the command (“metareg”). The STATA metareg manual is not easy to understand, but three things are clear:
It doesn’t use the word “logistic” anywhere, and it doesn’t use the logistic function, so it isn’t logistic regression.
It does regression on the log of the risk ratio between two binary cases, a “treatment” case and a “no-treatment” case; and computes regression coefficients for possibly-correlated continuous treatment variables (such as vitamin doses).
It doesn’t directly give relative risk for the correlated variables. It gives regression coefficients telling the change in log relative risk per unit of (in this case) beta carotene or vitamin A. If anything, the reported RR is probably e^r, where r is the computed regression coefficient. This means the interpretation is that risk is proportional to RR^dosage.
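To see how much the choice of interpretation matters, the sketch below compares the two readings at the author's dose and at the mean and maximum vitamin A doses quoted from the paper. It assumes the reported RR is e^r for a per-IU coefficient r, and it treats the "RR × dosage" reading as additive (excess risk scales linearly with dose); both are my assumptions, not something the paper states.

```python
import math

rr_per_iu = 1.000006        # reported RR per IU of vitamin A
r = math.log(rr_per_iu)     # implied log-RR coefficient (assumption: RR = e^r)

def rr_multiplicative(dose):
    """Risk proportional to RR^dosage, i.e. exp(r * dose)."""
    return math.exp(r * dose)

def rr_additive(dose):
    """Assumed linear reading: excess risk (RR - 1) scales with dose."""
    return 1 + (rr_per_iu - 1) * dose

for dose in (2500, 20219, 200000):
    print(dose, round(rr_multiplicative(dose), 4), round(rr_additive(dose), 4))
```

At 2,500 IU the two readings agree to about four decimal places. At the study's maximum of 200,000 IU they diverge badly, roughly 3.3 versus 2.2, so the question of which one the authors meant is not academic.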
Since there is no “treatment/no treatment” case for this study, but only the variables that would be correlated with treatment/no treatment, it would have been impossible to put the data into a form that metareg can use. So what test, exactly, did the authors perform? And what do the results mean? It remains a mystery to me—and, I’m willing to bet, to every other reader of the paper.
References
Bjelakovic et al. 2007, “Mortality in randomized trials of antioxidant supplements for primary and secondary prevention: Systematic review and meta-analysis”, Journal of the American Medical Association, Feb. 28 2007. See a commentary on it here.
Bjelakovic et al. 2006, “Meta-analysis: Antioxidant supplements for primary and secondary prevention of colorectal adenoma”, Alimentary Pharmacology & Therapeutics 24, 281-291.
Bjelakovic et al. 2004, “Antioxidant supplements for prevention of gastrointestinal cancers: A systematic review and meta-analysis,” The Lancet 364, Oct. 2 2004.
I don’t understand. Why is “they used the wrong statistical formula” worth 47 upvotes on the main article? Because people here are interested in supplementation? Because it’s a fun math problem?
In the other comments, people are discussing which algorithm would be more appropriate, and debating the nuances of each particular method. Not willing to take the time to understand the math, it comes across as, “This could be right, or wrong, depending on such-and-such, and boy isn’t that stupid...”
I run into this problem every time I read anything on health or medicine (it seems limited to these topics). Someone says it’s good for you, someone says it’s bad for you, both sides attack the other’s (complex, expert) methods, and the non-expert is left even more confused than when they first started looking into the matter. And it doesn’t help that personal outcomes can be drastically different regardless of the normal result.
To me, this topic is still confusing, with a slight update toward “take more vitamins.” Without taking classes in statistics and/or medicine, how can I become less wrong on problems like this? Who can I trust, and why?
47 votes doesn’t mean “This is a great article”. It means 47 more people liked it than disliked it. Peanut butter gets more karma than caviar.
That is not entirely true. Personally, I think it is a fine article, but I didn’t upvote it because I felt that 60 upvotes should be enough for it. Is there a site-wide consensus about the interpretation of karma scores? If not, is there even a thread where LWers debate about the best semantics?
If you like something you read and would like to see more of it, vote it up.
And vice versa.
Both. It’s instrumental in that vitamin supplementation is a concern many here have. It’s also useful as an example of how studies can have flaws, and how these flaws can be found with surprisingly little analysis. Dissections of bad studies help us avoid similar flaws in our own conclusions. And there are indeed researchers on LessWrong, as well as motivated laymen that can follow the math, and even run their own mathematical regressions. This truly is valuable.
You can’t learn to be less wrong about mathematical questions without learning more math. (By definition.)
Depends. I could become less wrong about mathematical questions by learning to listen to people who are less wrong about math. (More generally: I may be able to improve my chance of answering a question correctly even if I can’t directly answer it myself.)
The “problem like this” I was referring to was “health advice and information is often faulty,” not “linear regression analysis of mortality effects from supplementation is faulty.”
I’d like to get better at correcting for the former while avoiding the (potentially enormous) amount of learning and effort involved in getting better at all necessary forms of the latter.
From what I can tell, you’re saying “there is no way; the two are inextricably linked.” In which case, I guess I’ll just wait until they get better at it.
The general advice here is
Not all regression is the same; beware anyone who reports doing “a regression”
Linear regression assumes a linear relationship
Don’t trust a report that bases its authority on numbers if you can’t say what those numbers mean
A conclusion can be both true and misleading
A little unreflective folk-psychology (“vitamins” as being “more is better” instead of having a dose-response curve) can do a lot of damage
All of those points are true, but there’s one I’d like to flag as true but potentially misleading:
Linear regression does assume this in that it tries to find the optimal linear combination of predictors to represent a dependent variable. However, there’s nothing stopping a researcher from feeding in e.g. x and x squared as predictors, and thereby finding the best quadratic relationship between x and some dependent variable.
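A small sketch of that trick: feed x and x squared to an ordinary least-squares solver and read the best dose off the fitted parabola. The data come from an invented J-shaped curve, and `least_squares` is a hypothetical helper written out so the example has no dependencies.

```python
def least_squares(rows, ys):
    """Solve the normal equations (R^T R) beta = R^T y by Gaussian elimination."""
    n = len(rows[0])
    # Augmented matrix [R^T R | R^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (M[i][n] - tail) / M[i][i]
    return beta

# Invented J-shaped death-rate data; dose in thousands of IU.
doses = list(range(0, 55, 5))
rates = [1.0 - 0.02 * d + 0.001 * d ** 2 for d in doses]

# The model is still linear in its coefficients: y = c + a1*x + a2*x^2.
rows = [[1.0, d, d ** 2] for d in doses]
c, a1, a2 = least_squares(rows, rates)
best_dose = -a1 / (2 * a2)   # vertex of the fitted parabola

print(f"a1 = {a1:.4f}, a2 = {a2:.5f}, lowest death rate near dose {best_dose:.1f}")
```

The regression is "linear" only in the coefficients, so a negative a1 with a positive a2 can capture a dip followed by a rise.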
The way traditional rationalists without special training relate to scientific findings is usually by uncritically accepting them as authoritative. One can become less wrong by learning that scientists are not close to perfect. They make mistakes and sometimes deceive themselves and others. Probably the single most common way this happens is through statistical malpractice. This post, in excellent detail and language non-experts can comprehend, explains one such case and identifies a general type of statistical screw-up: using the wrong tool.
Trust no one. Learn a little math.
You don’t need to be able to solve the problems on your own, just enough to understand the arguments. I’m not a math guy either but only some of the statistical stuff is totally out of my grasp. Did you understand the math in this post?
“Trust experts except when you don’t”?
“Don’t trust experts; become one yourself”? Wouldn’t that put me in the category of people-not-to-be-trusted? Isn’t that what Phil is pointing out, that most people don’t understand statistics? Why would I expect myself to be better at judging these kinds of problems than experts who spend their lives on it? Should I not expect myself to be just as bad at it, and potentially much worse (know enough to be dangerous)?
Yes. But it seems fundamental enough that experts should have caught it, therefore I am skeptical.
Some questions (this is an obviously incomplete* list, of course) to ask when you are in this situation:
Is the source pointing out the error reliable?
Does the criticized work acknowledge or otherwise address the claim?
Does the criticized work contain other flaws? (Subcategory: is the criticized work sloppy or lazy in execution?)
In this particular case, the answer to the third question appears to be “yes”. This is probably good reason to raise your probability that this particular criticism is correct.
* Bear in mind, of course, Eliezer Yudkowsky’s warning: If you want to shoot your foot off, it is never the least bit difficult to do so.
Thank you. These steps for analysis are very useful to me, and I feel they answer my original questions.
Scientists can be wrong. Certain kinds of science are more likely to involve screw-ups. Learn to identify these kinds of findings and learn to identify sources of screw-ups so you don’t fall for them.
If two experts disagree about something and you want to evaluate the disagreement one way is to understand their arguments. Sometimes you can look into both sides and discover that one of them isn’t really the expert you thought they were. You can evaluate the arguments or evaluate the expertise. I can’t think of anything else.
I assume you’re not planning on trying to publish statistical analyses so I doubt you’re dangerous.
You can probably learn more about statistics than at least some of the shoddy scientists out there. If you find yourself disagreeing about stats with a prominent statistician then, yeah, you’re probably wrong.
You aren’t learning how to run different kinds of statistical analyses. You’re learning about statistical errors scientists make. It’s a different set of knowledge which means you can know less about statistics in certain ways but still be able to point out where scientists go wrong.
This is an interesting point in itself. Why health and medicine?
Maybe causal inference is straight up more difficult in health and medicine: effects are smaller and more ambiguous than in hard sciences, and have many hard-to-manipulate causes that blur the signal.
There are borderline results in fields like physics, obviously, but they’re usually more esoteric and tend to have relatively clear cut theory behind them (which is why I’d guess you’re not too worried about, say, last year’s ambiguous results from the Cryogenic Dark Matter Search), so they don’t provoke so much back-and-forthing.
This leads me to a prediction: you’d have as much difficulty reading up on results in psychology and sociology as you do in health and medicine. As for what to do about it? Uh...not sure. I’m still chewing over this thread.
My self-serving explanation is that health/ medicine/ biology select for people who enjoy (or better tolerate) rote memorization (Levels 0-1 in my hierarchy) rather than “how it works”-type understanding (Levels 2-3). This gets the group of intelligent people with worse ability to know the broader meaning of what they’re doing, a skill that tends to curtail questionable statistical practices.
Yes, I know it sounds insulting, but what really turned me off from taking more biology in high school and college, and from med school, is that it’s so much more memorization-oriented rather than generative-model oriented. This suspicion is confirmed when I hear about e.g. ecologists just now getting around to using the method of adjacency matrix eigenvectors (i.e., Google’s PageRank) to identify key organisms in ecosystems.
And an alternate alternate explanation: Poor priorities. Doctors want to hear all the clinical details, and are mentally worn out by the time they finish with those. There’s just no time or energy to do the math too.
When I used to work for NASA in theoretical air traffic management, I’d try to explain some abstract point about turbulent or chaotic traffic flow to operational FAA guys, and they would get bogged-down in details about what kind of planes we were talking about, what altitudes they were flying at, which airlines they belonged to, and on and on.
What is theoretical air traffic management?
I mean, in more detail than I can glean just from knowing what those words mean.
You won’t hear that phrase, but I mean the theoretical study of air traffic. For instance, I studied “free flight”, which is when airplanes can fly directly from their takeoff airport to their landing airport and manage their own collision-avoidance, and showed that in certain free-flight situations you can increase the throughput of the airspace by reducing the information that you give to pilots.
It’s an interesting general phenomenon: They try to optimize their route using all of their information. So the more information they have, the more unpredictable their behavior is. More information can actually cause more trouble than it solves in some cases, at least when it’s information about what other agents are doing.
I had a contract from NASA to look for chaotic behavior in en-route air traffic, but my conclusion was that it is unlikely and nothing need be done to avoid it at present.
Reminds me of Braess’s paradox; a route people don’t know about is similar to one that doesn’t exist.
Here’s an alternate insulting explanation, based on many (but not all) of the doctors I’ve met:
Doctors are bad at being unsure of themselves.
You’re right. I mentally boxed them off when I made my original statement. Thinking further, I might add economics to the list.
You don’t get these problems with economics. In economics journals it’s standard practice to include your specification, as well as the whole regression output, including a full list of included terms and their significance tests.
When I was completing my Master’s degree I was a sessional assistant for an introductory quantitative methods course for economics and finance majors. This type of simple linear regression would be considered overly simplistic at that level (at least in the absence of some simple specification testing), and if the J-curve is already accepted in medicine, to model linearly is unforgivable. It’s not like non-linear transformations are hard to do either; you can do them in Excel without too much trouble.
FWIW, I’m of the impression that economists get a better grounding in quantitative methods than other social scientists (and I would say that the profession is a bit too keen on mathematical approaches in some cases), so maybe you would have similar problems with psychology or sociology. But I don’t think economics has this problem.
Also, maybe more importantly, less in the way of financial and ideological commitment.
(Incidentally, my impression is that theoretical debates are more intense in physics than medicine, though I don’t know much about such theoretical debates as might exist in medicine.)
It shouldn’t. It should come across as “this is wrong, it can make people die, it is evil, don’t listen to it.”
I am approximately cynical enough to suggest “the LessWrong community likes debunking” as a reason.
Is it worth 49 points? Almost certainly not. This indicates a flaw with the karma system, not a flaw in the post.
Absolutely. It’s worth the 59 points that it is at now. How science, and health related science in particular is used poorly is valuable information.
Another possible explanation: LW clientele likes topics related to survival & life extension. Without turning this into a health discussion group I find these broad strokes of how to think about medical information very valuable.
In my case, yes. I’d posit that a future-interested community like this is likely to have lots of supplement-interested individuals trying to squeeze in an extra few years.
Simply put, I think this article has bridged what was a gap between the interests of LessWrong and the interests of LessWrong readers.
Wow, I just read Robin’s writeup on this and it caused me to significantly lower the amount of credence I place on his other positions (but very slightly lower my opinion of supplements). It just struck me as overwhelmingly sloppy and rhetorical. Particularly his justification attempt in response to this thread. (But I suppose Robin’s responses to criticism have never impressed me anyway.)
Here is what is bothering me:
The only causal hypothesis I’ve heard for the supplements leading to a higher rate of death is Phil’s claim that the doses are too high and are resulting in vitamin toxicity. If we accept Robin’s claim that “vitamins kill” (that is, that the causal claim is true and the results aren’t just the result of uncontrolled-for correlations) one of three things has to be true: vitamins kill no matter how the body obtains them; there is something involved in taking ‘supplements’ that kills; or Phil is right and vitamin toxicity is the cause. (Is there an option I’m missing?) It seems extremely clear to me that absent any hypothesis for the second of these, vitamin toxicity is by far the most likely explanation. And the best part about this hypothesis is that it is also the easiest to test. All you have to do is look for hockeysticks!
Why refuse to test the one explanation for the results we have?
Another possibility is that cancers have higher nutritional needs than normal cells, and some vitamins might be feeding cancer more than they’re feeding the person.
One more explanation: some supplements might come from untrustworthy manufacturers and be contaminated with an unidentified toxin. That’s a tough one to test, since it’s unlikely that any of the studies saved samples of the pills they used or even documented where they came from.
One more hypothesis: supplements are risky because you may end up with vitamins and minerals which are out of proportion with each other.
Do we have reason to think this out of proportion issue would arise more often with supplements than it would just from diet?
Plants and animals are our fellow biological creatures. While the chemical combinations in them might not be ideal for us especially if a person isn’t eating a varied diet, supplements can give proportions of vitamins, minerals, and whatever which have never been seen in nature.
One more angle: I’ve heard claims that agricultural soil has far fewer minerals than it did in the ancestral environment, and more so in recent decades. It’s been a long time since the glaciers came through, grinding the rocks, and as far as I know, modern agriculture typically doesn’t include replacing the minerals taken out with each harvest. On the organic side, manure isn’t going to help that much if it’s from animals that were fed low-mineral food.
I file this under plausible theory. It doesn’t address which minerals are low, or how much should be added. One of my friends uses it as a reason to supplement.
I’ve worried about this too, but isn’t this something that can be easily tested? I mean, if nutritionists have really identified the vitamins, etc. we need, divided the thingspace appropriately (are all things called “protein” functionally the same?), and developed reliable ways of measuring nutritional information that goes on the food label, this should show up pretty quickly. (I’m becoming less confident of assumptions like these, in part because of errors like what Phil Goetz found here.)
When standard produce is packaged, how do they get the data for the label? Do they have to regularly test that source’s produce, or is there just a standard lookup table that e.g. all baby carrots can give as their data? (If the latter, that screams “information cascade!”)
I don’t know whether nutritionists have identified all the nutrients we need.
And I’m pretty sure that the functioning of living organisms isn’t terribly well understood, even for “ideal” cases.
If you look at Table 2 in the paper, it shows doses of each vitamin for every study that is considered low risk for bias. I count 9 studies that have vitamin A <10,000 IU and vitamin E <300 IU, which is what PhilGoetz said are good dosage levels.
The point estimates from those 9 studies (see figure 2) are: 2.88, 0.18, 3.3, 2.11, 1.05, 1.02, 0.78, 0.87, 1.99. (>1 favors control)
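For a quick look at the nine point estimates quoted above, one can count how many fall on each side of 1 and take the median. This is only an unweighted tally; it ignores study size and confidence intervals, so it is not a substitute for a proper meta-analysis.

```python
import statistics

# Point estimates (relative risks) as quoted in the comment above.
point_estimates = [2.88, 0.18, 3.3, 2.11, 1.05, 1.02, 0.78, 0.87, 1.99]

above_one = sum(rr > 1 for rr in point_estimates)   # estimates favoring control
below_one = sum(rr < 1 for rr in point_estimates)   # estimates favoring supplements
median_rr = statistics.median(point_estimates)

print(f"{above_one} above 1, {below_one} below 1, median RR = {median_rr}")
```

Six of the nine estimates sit above 1 and the median is 1.05, though several of these estimates likely come from small trials with wide intervals.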
Based on this quick look at the studies, I don’t see any reason to believe that a “hockey stick” model will show a benefit of supplements at lower dose levels.
The titular contention used the word ‘kill’. That’s what hockey sticks tend to do.
But Goetz implies that the vitamins may have benefits in the right regime:
It would be a strange usage of ‘good’ if all Goetz meant by it was ‘increases fatalities by too small an amount to easily detect’ rather than ‘increases some desirable outcome’.
Robin generates a lot of clever ideas. That’s awesome. I’ve heard him retract before, but it is tempting to make the minimum feasible patch to your stated ideas when someone exposes some shallowness or flaw in your analysis.
I’m interested in a large single prospective on vitamin supplementation. I can’t believe any good will come at looking retrospectively at correlations (I’m too lazy right now to see if that’s actually a problem with any of the studies used in the meta-analysis) - people with more health problems (and especially older people) tend to take more vitamins.
Confounding things further but in the opposite direction—people who are health conscious (and wealthier) take more vitamins too.
My approach regarding this topic is that I plan on going and doing/funding research on this kind of thing myself and am actively re-educating myself and acquiring resources in order to do so. In the meantime I’m just not going to take excessive doses of fat-soluble vitamins, because that’d just be a stupid idea in the first place!
I’d be extremely grateful if you do investigate this.
See http://phs.bwh.harvard.edu/
Cool. Huge prospective study. They demonstrated that there’s no reduction in (several types of) cancer or heart attack mortality with either vit. E or vit. C supplementation (400 IU of vitamin E every other day and 500 mg of vitamin C daily). It’s strange to me that they don’t look at all-cause mortality.
I think they did look at overall mortality. Quoting from the abstract of the 2008 paper
You’re right. I missed it. I know they say it’s not significant, but in fact the two P=.15 are weakly convincing to me (95% CI of .97-1.18). The old men are dying off 7% more often if they have the (thought to be reasonable at the beginning of the study) dose of vit C or E (compared to placebo). Redo the study and you’ll probably get something like 4-10% instead of 7%. I think this is pretty good evidence for Robin’s claim.
This sort of binary treatment-variable study can always be criticized for overly high dosage, as Phil Goetz pointed out. The 400 IU vit E every 2 days is well under the dose already commonly accepted to cause long-term problems (400 IU daily). The 500mg vit. C daily is well above the highest dietary recommendation of 100mg/day, but it’s well below the amount some people take.
That was my reaction, too. But given that his complaint was that no one redid the analysis on the actual data, I think it’s appropriate to wait a bit and see if he does it himself, and what the result is. If he doesn’t follow through by looking at the actual data, then I’ll be unsubscribing from OB, because that would be a strong signal of bias; but for now I prefer to wait and see.
Also, relying on that sort of broad statistical analysis makes sense if you have no other source of information. If you have ways of telling whether particular supplements make your life or health better, then experimenting with supplements for yourself could be a good idea.
Is the data easily available anywhere? If it’s not, that might explain why no one redid the analysis. If it is, I might be persuaded to try a few different models out.
EDIT: After some digging, I’m pretty sure I know what statistical model they used in the study and should be able to reproduce their results and try a few different things IF I can get the data in a sufficiently nice form.
EDIT2: see my recent post in the discussion section
Why’d you delete it? :-( I was wondering earlier where it went.
I didn’t think I understood enough of what they did to comment usefully anymore, and I don’t have the time currently (i.e., this week) to put into understanding it. Did you read it after I had put the edit at the end? If so, did you still find it useful?
They do the same kind of thing with ionizing radiation: a lot of organizations assume that the health effects of radiation are completely linear, even far below the range where we’ve been able to measure, despite the lack of evidence for this (and some evidence suggesting a J-shaped curve). Other organizations refuse to extrapolate to extremely low doses, citing the lack of evidence.
The issue is just way too politicized.
There’s a general principle that very small doses of toxins or stresses of any kind—vaccines, radiation, oxidants, poisons, alcohol, heat, cold, exercise—are beneficial, because they provoke the body to a protective overreaction. One of the talks at the 2007 DC conference on cognitive aging even suggested that this is responsible for why people who think more have fewer memory problems as they age.
(This suggests that our bodies are lazy—they could maintain themselves better than they do on every dimension. Or it might be that, if we measured all the responses simultaneously, we’d find that mounting a protective response to radiation made us more vulnerable to infection, alcohol, and all the rest.)
Or maybe it would just require the expenditure of energy.
And yet anabolism and expenditures of energy pretty reliably shorten lifespan. Many of these responses rely on the use of regulatory RNA; and the dicer-mediated siRNA mechanism has been shown to have a limited capacity that degrades when multiple regulatory responses occur simultaneously.
That principle would be hormesis, no?
Be wary of placing too much trust in that logic, that way lies homeopathy.
For the radiation thing there’s at least some evidence that humans can adapt to high background radiation but I’ve never seen any evidence that the reaction ever outweighs the exposure.
http://www.ncbi.nlm.nih.gov/pubmed/11769138
Yep, my impression from what I can remember of https://en.wikipedia.org/wiki/Radiation_hormesis is that people who believe models other than LNT are privileging the hypothesis.
The fact that this assumption is made so explicit makes it much less problematic than the problem this article is talking about.
I’ve just read a book by Gwyneth Cravens that talks about this and explains it well:
http://www.amazon.com/Power-Save-World-Nuclear-Energy/dp/B002KAOSLK/
(It’s also about Uranium mining, how nuclear power plants work, how risk is mitigated, nuclear waste storage, etc. A good read.)
Thanks Phil. I am suitably outraged that both the authors and the journal published this.
I’m not sure whether ‘benefit of the doubt’ in this instance suggests ‘political motivation’ or ‘incompetence’. I’ll give them whichever benefit of the doubt they prefer. The most basic knowledge of the field suggests that the prior probability that a fat-soluble vitamin has a linear response to dosage is negligible.
I think the simplest hypothesis is that this was a case of pushbutton statistics—get a statistics package, read the documentation, and feed it numbers until it gives you numbers back.
The papers overwhelm the reader with so many details about how to categorize and treat the different samples in the meta-study, that it’s easy to feel like they’ve “done enough” and just wave the math through.
It might be that, in order to pay more attention to statistical correctness, you’ve got to pay less attention to other details. A person has only so much mental energy! So it may reflect not poor statistics skills so much as poor priorities. Doctors want to hear all the clinical details; but there’s little time and mental energy left for anything else.
You seem to have a mild case of pushbutton statistics
Negligence hurts people. In this case it hurts people at the margin, where nutritional advice from misinformed doctors tips the scales. Yet while negligence in surgery is a PR nightmare, there is still a net benefit of prestige to having papers published, read and referenced even when it can be shown that the research is flawed. If only negligent publications came with a commensurate penalty to the credibility of the author and journal, even if only until they published a suitable retraction.
So why do you still take vitamins? If you look at their Figure 2, there aren’t many studies that ‘favored antioxidants’, and some of those studies had low doses.
“A linear analysis assumes that if 10 milligrams is good for you, then 100 milligrams is ten times as good for you, and 1000 milligrams is one-hundred times as good for you.” That’s only true if the range of data included both 10 milligrams and 1000 milligrams. Linearity is only assumed within the range of data of the data sets.
The hockey stick approach seems too restrictive as well. Just use a p-spline.
There doesn’t appear to be a statistician on the paper. This study really needed one. Using meta-regression to estimate a dose effect is challenging, especially when you don’t have access to the original data (just using aggregate, study-level covariates). In fact, the dose effect and the concept of study heterogeneity are conflated here.
I agree with you that it’s unclear what they actually did.
An even better methodology would be to allow for higher order terms in the regression model. Adding squared terms, the model would look like this:
Y = a_1 X + b_1 X^2 + c
or
Y = a_1 X_1 + b_1 X_1^2 + a_2 X_2 + b_2 X_2^2 + \dots + a_n X_n + b_n X_n^2 + c
This would allow for those nice-looking curves you were talking about. And it can be combined with logistic regression. Really, regression is very flexible; there’s no excuse for what they did.
Also, the scientists could have done a little model checking. If what Phil says about the U/J shaped response curve is true, the first order model would have been rejected by some sensible model selection criterion (AIC, BIC, stepwise selection, lack-of-fit F test, etc)
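To make the model-checking point concrete, here is a minimal sketch that fits a first-order and a second-order model to the same data and compares them by AIC. Everything in it is an assumption for illustration: the data are synthetic (a made-up J-shaped curve plus noise, not the vitamin data), the helper names `solve` and `fit_polynomial` are mine, and the AIC is the usual Gaussian form up to an additive constant.

```python
import math
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Ordinary least squares for y = c + a*x + b*x^2 + ...; returns (coeffs, AIC)."""
    rows = [[x ** k for k in range(degree + 1)] for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coeffs = solve(xtx, xty)
    rss = sum((y - sum(c * v for c, v in zip(coeffs, r))) ** 2
              for r, y in zip(rows, ys))
    n = len(xs)
    aic = n * math.log(rss / n) + 2 * k   # Gaussian AIC, up to an additive constant
    return coeffs, aic


# Synthetic J-shaped dose-response (illustrative numbers only).
random.seed(0)
doses = [i / 10 for i in range(51)]       # 0.0 .. 5.0 in arbitrary units
risk = [1.0 - 0.4 * d + 0.1 * d * d + random.gauss(0, 0.05) for d in doses]

_, aic_linear = fit_polynomial(doses, risk, 1)
quad_coeffs, aic_quadratic = fit_polynomial(doses, risk, 2)
print("AIC linear:", round(aic_linear, 1), "AIC quadratic:", round(aic_quadratic, 1))
```

Lower AIC is better, so on data with a real bend the second-order model should win clearly. The quadratic model is still "linear regression" in the sense that it is linear in the coefficients, which is why the same least-squares solver handles both.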
related side note: In my grad stat classes, “Linear Regression” usually includes things like my example above, i.e. linear functions of the (potentially transformed) explanatory variables including higher order terms. Is this different from how the term is widely used?
unrelated side note: is there a way to type pretty math in the comments?
followup question: are scientists outside of the field of statistics really this dumb when it comes to statistics? It seems like they see their standard methods (i.e., regression) as black boxes that take data as an input and then output answers. Maybe my impression is skewed by the examples popping up here on LW.
Yes: Comment formatting
thanks!
I don’t think it is. I seem to remember reading in Wonnacott & Wonnacott’s textbook that you can still call it ‘linear regression’ whether or not one of those regressors is a nonlinear function of another.
That makes sense intuitively, because a linear regression algorithm doesn’t care where your regressors come from, so conceptually it’s irrelevant whether they all turn out to be different functions of the same variable (for example). (Barring obvious exceptions like your regressors all being linear functions of the same variable, which would of course mess up your regression.)
I don’t know of one, but I haven’t been here long!
My understanding is, a lot of them aren’t...but a lot of them are.
Yes. Quadratic regression is better, often. The problem is that the number of coefficients to adjust in the model gets squared, which goes against Ockham’s razor. This is precisely the problem I am working on these days, though in the context of the oil industry.
It’s not too difficult to check to see if adding the extra terms improves the regression. In my original comment, I listed AIC and BIC among others. On the other hand, different diagnostics will give different answers, so there’s the question of which diagnostic to trust if they disagree. I haven’t learned much about regression diagnostics yet, but at the moment they all seem ad hoc (maybe because I haven’t seen the theory behind them yet).
If you say W = XxX, then make a model that’s linear in W, it’s a linear model. If you use both X and XxX, I don’t think there was a definitive answer… until Wikipedia, of course. Which says no.
Er, what? It says yes.
For anyone interested, here is a decent algorithm for getting the “correct” number of lines in your linear regression.
http://www.cs.princeton.edu/~wayne/kleinberg-tardos/06dynamic-programming-2x2.pdf
Pages 5 and 6.
Welcome to LessWrong! Feel free to introduce yourself in the welcome thread.
That is a very good summary and review for those who want to brush up on dynamic programming: it gives several example problems and cost functions to be minimized, and shows how the optimal substructure fits in.
I do have to say that the bit for the tradeoff between overfitting and accuracy is not terribly useful for those trying to understand such things. It is a cookbook method, with no justification for why these particular error weightings are terribly useful.
EDIT: Of course, almost any regularization will help compared to nothing, and it does show a nice way to do this with dynamic programming, which can greatly speed things up over naive implementations.
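For readers who don't want to open the slides, here is a minimal sketch of what I take the linked algorithm to be, namely segmented least squares: choose breakpoints so that the total squared error plus a fixed penalty per line segment is minimized. That reading of the slides is an assumption on my part, and the function names and the toy data are made up for illustration.

```python
def segment_error(xs, ys, i, j):
    """Sum of squared errors of the best-fit line through points i..j inclusive."""
    px, py = xs[i:j + 1], ys[i:j + 1]
    n = len(px)
    sx, sy = sum(px), sum(py)
    sxx = sum(x * x for x in px)
    sxy = sum(x * y for x, y in zip(px, py))
    denom = n * sxx - sx * sx
    # A single point (or identical x values) has no defined slope: fit a constant.
    slope = 0.0 if denom == 0 else (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(px, py))


def segmented_least_squares(xs, ys, penalty):
    """Return (total cost, [(start, end), ...]); xs must be sorted ascending."""
    n = len(xs)
    best = [0.0] * (n + 1)    # best[j] = optimal cost for the first j points
    start = [0] * (n + 1)     # start[j] = 1-based first point of the last segment
    for j in range(1, n + 1):
        best[j] = float("inf")
        for i in range(1, j + 1):
            cost = segment_error(xs, ys, i - 1, j - 1) + penalty + best[i - 1]
            if cost < best[j]:
                best[j], start[j] = cost, i
    segments, j = [], n
    while j > 0:              # walk back through the recorded choices
        segments.append((start[j] - 1, j - 1))
        j = start[j] - 1
    return best[n], segments[::-1]


# Made-up V-shaped data: falls until x = 5, then rises.
xs = list(range(11))
ys = [abs(x - 5) for x in xs]
cost, segments = segmented_least_squares(xs, ys, penalty=1.0)
print(cost, segments)
```

The `penalty` argument is the cookbook part complained about above: a larger value buys fewer segments, and nothing in the algorithm itself says what value is justified. This naive version recomputes each segment error from scratch, so it is slower than the precomputed-error variant usually presented.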
Ouch. Comic Sans.
Good cookbook, though.
Another beauty. (The logistic regression thing isn’t that big a deal, though—the logistic function only makes a difference at the extremes, and the fact that the RR is very close to one means it’s right in the middle.)
Good point. And logistic regression coefficients are hard to interpret, so maybe logistic regression would be a poor choice in this case.
Credit should go to Andrew Gelman, who also points out (in his book with Jennifer Hill on hierarchical modeling) that the logistic regression coefficients do have a straightforward interpretation, at least when the probabilities are not too close to the extremes. (I’d have to look it up.)
I don’t have Gelman’s book, but: logistic regression says p = 1 / (1 + exp(-z)) where z is a linear combination of 1 and the independent variables. But then z is just the “log odds”, log(p/(1-p)); you can think of the coefficient of 1 as being the log prior odds ratio and the other coefficients as being the amount of evidence you get for X over not-X per unit change in each independent variable.
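A small sketch of that relationship, using only the standard library. The intercept and coefficient are hypothetical numbers chosen for illustration; the point is that the log odds recover z exactly, and that a one-unit change in x multiplies the odds by exp(coefficient) regardless of where you start.

```python
import math


def sigmoid(z):
    """Logistic function: maps log odds z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds of probability p; the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


# Hypothetical fitted model: z = intercept + beta * x
intercept = -2.0   # log prior odds when x = 0
beta = 0.5         # change in log odds per unit of x

for x in (0, 1, 2):
    p = sigmoid(intercept + beta * x)
    print(f"x={x}  p={p:.3f}  log odds={logit(p):.3f}")

# Odds ratio for a one-unit increase in x.
p0, p1 = sigmoid(intercept), sigmoid(intercept + beta)
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
print("odds ratio per unit x:", odds_ratio, "exp(beta):", math.exp(beta))
```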
One can complain about empirical studies in dozens of ways. Yes, for any linear regression one can complain that they should have included higher order moments for all of the variables. But if readers can feel justified in ignoring any analysis for which one can make such a complaint, then readers can feel justified in ignoring pretty much any such data analysis. That is way too low a standard for ignoring data.
If you suspect that this lack has seriously skewed the results of some particular study, then you should get the data and do your own analysis the way you think it should be done, and then publish that. Then readers can at least compare the prestige of the two publications in deciding who is right.
I think the complaint here is less that higher order moments would’ve produced higher quality results, and more that when testing for adverse effects on health, they used mean dosages already known to be toxic, which is a pretty thorough screening out of any evidence collected.
It would be confidence-inspiring to see the raw data, and some better analyses of it, of course.
But Phil isn’t saying we can ignore the study just because it uses a linear regression. He’s giving good, and what should be obvious-to-experts, reasons why a linear regression will be deceptive on this question. Once you know dosage matters and that
then linear regression looks like a really bad choice.
But no citation is given for this “known to be toxic” claim. Known by whom, how, and how confidently?
The references were a couple of sentences before. Miller et al. 2004, “Meta-Analysis: High-Dosage Vitamin E Supplementation May Increase All-Cause Mortality”, “Randomized Trial of Vitamin A and Vitamin E Supplementation for Retinitis Pigmentosa.” Though I see the formatting made it look like 1 reference instead of 2, and the link was broken.
Also, this is something you can easily google if you question it.
People familiar with vitamin studies would know right away that 200,000IU of vitamin A and 5,000IU of vitamin E per day are both extremely high, and not in the same category as “vitamin supplementation”.
Sure. “Known to be toxic” was probably too glib a way to put it, Phil should provide a cite and I shouldn’t have repeated it uncritically. But even if this was just someone’s hypothesis without much experimental evidence behind it: the concept of vitamin poisoning isn’t a new one. There are publicized daily intake tolerable upper levels for lots of vitamins. Hypervitaminosis A, E. I don’t know what kind of evidence backs these claims up but presumably people in the field are aware of this kind of thing. I’m not saying “These are the toxicity levels. This is why the meta analysis is wrong.” I’m saying “People are hypothesizing vitamin toxicity at certain levels. Why the hell would you run an analysis that couldn’t even in principle take that into account?”
One thing is the fat soluble vitamins (A, E, D & K2-mk4) are cofactors. Vitamin A (retinol) toxicity directly depends on Vitamin D3 status.
What vitamins does everyone take? I take a no-iron multi-vitamin, extra vitamin D, and fish oil, all from cheap sources. I would be especially curious if anyone takes/has evidence for more expensive vitamins that are better absorbed.
None.
I do take a calcium supplement on medical advice, but there’s a medical history behind that. I’m not sure it’s really necessary though.
My expectation is that if I eat as much as I want of everything I want then I’ll get enough of whatever trace substances I need, and that taking more will have no effect until one gets up to toxic levels. There is a large gap between the necessary amount and a toxic amount. The inverted J that Phil Goetz described has a very wide and flat bowl.
However, this is not based on any specific scientific finding that I know of, just general principles of biological organisation and the conditions in which we and all other organisms evolved. Consider that we need many different vitamins, minerals, and amino acids, without which we die, and the same is true of other animals. How can we possibly get what we need, without the benefit of modern science to tell us? When we lack one of them, we can tell we’re unwell, but we can’t tell what we need or how to get it, excepting only air and water. Look how long it took to solve scurvy, beri-beri, rickets, and so on. Hunger has more than one dimension, but fewer than the number of things we need. Trying to mix and match foodstuffs to obtain exactly the right amount of everything is impossible under these conditions.
Since, in general, we do get enough without knowing how, and getting exactly enough is impossible, we do not get exactly enough. We inevitably get more than enough of some things, and discard or store what our bodies leave unused. We can tolerate getting a lot more than enough because we have to, or we’d die of an excess of this or that micronutrient as often as from a deficiency.
I’m not sure I understand your point, as it seems your argument would lead you to take vitamins.
I’ll explain.
Do you feel healthy all the time, and if not do you know the exact cause every time you don’t? I don’t think I’ve ever met anyone who always feels healthy or anyone who always knows the exact reason they don’t.
How do you know that in our environment today we get enough?
It seems strange that it’s inevitable that we get more than enough of some things, but not that we get less than enough of other things.
The conclusion I draw from your argument is that because the inverted J has a wide, flat bowl, I should take vitamin supplements. Where am I misunderstanding your point?
My expectation is that in times of such plenty as the present, anyone not in poverty or following some very restricted diet will find it difficult not to get enough of everything.
In my case, I’m well-off and eat whatever I want, which is much the same range of things in much the same quantities, year after year. Therefore if I have a deficiency, it must be a chronic one. But I have no chronic health problems beyond slowly decaying teeth and the aftereffects of an acute illness many years ago of unknown aetiology. Therefore, I conclude, I experience no chronic dietary deficiency. (I have not seen dietary supplements touted as a cure for tooth decay, and I have seen everything touted as preventatives for diseases with no known cause.) I am over 50, so I think that’s long enough a trial.
(Digression: fun fact for bright young things! Nearly everyone over 50 has at least something that has gone wrong with their bodies and will never be fixed. I suspect that not many young people in good health realise this. Barring radical advances in medical science, this is what you have to look forward to.)
There are those who say that health is more than just the absence of illness, but I’ve never been able to make out what they mean. Perhaps by “health” they mean being possessed of great physical energy and joie de vivre, rather than merely being free of identifiable problems, but I’ve never seen anyone attribute that to supplements except the people selling them. I haven’t particularly looked, though.
Those of you who do take dietary supplements: in what ways do you feel different, depending on whether you take them or not?
I read Seth Roberts’ blog, and this sounds like something he might have addressed, but Googling [“Seth Roberts” vitamins] didn’t turn up anything.
Chronic vitamin D deficiency is common.
Xylitol. Garlic. Now you have.
Tea as well, through its large doses of fluoride and its anti-bacterial properties.
I have indeed! But Googling them, xylitol’s selling point seems to be “not as destructive as sugar” rather than positively preventing decay. The first recommendation I found of garlic as a preventative also said that chewing a clove every day prevents bad breath. Um....
Research ‘xylitol, mothers, teeth’. That should hopefully bring you to the study that was done on mothers who were given xylitol during pregnancy and/or the early period while nursing newborns. It was found to have an actual protective effect and, if I recall, delayed the spreading of ‘nasty’ bacteria to the child from the mother.
Xylitol also kills bacteria in vivo. In layman’s terms ‘the bacteria notice that it is sugar but don’t realise it is weird alcohol sugar so they eat it then starve’. So it is not merely a way to not have sugar in your mouth while also getting your ‘sweet’ on. Note that it also kills ‘good’ bacteria in the same way so too much isn’t recommended, for the sake of your digestive system!
As for garlic… sure, it kills bacteria, but really, a clove a day… that isn’t one I’ve chosen to make a habit of. I’ll use listerine thanks!
RE: Xylitol again. I do recommend chewing gum flavoured with the stuff, not necessarily actually eating it!
I’m no expert by any means, but my general feeling is that most people today don’t get everything they need. Especially if they eat whatever they want. We’re not optimized to want what’s best for us in today’s world. For example, we want fatty foods and in today’s environment such foods are over-available. Maybe someone with more knowledge can point to information on the subject.
However, I have always assumed (with no real knowledge one way or the other except for the fact that our bodies are complicated kludges) that being deficient in some nutrients can not only cause identifiable health problems, but can also cause other things that you may not want.
As an example, say that being deficient in Vitamin X ended up knocking 10 points off what your IQ could have been potentially. How would you know?
Being a non-expert in the field, and taking all the things you’ve talked about, leads me to take dietary supplements. (A multivitamin, whose contents I can’t recall at the moment, and a Vitamin D capsule daily.) They are inexpensive (obscenely inexpensive if you’re well-off), and as far as I can tell do no harm if I don’t need them.
I’m taking 4000 IU of D3/day.
I was taking Schiff Move Free for a knee injury, and I noticed some improvement in mood at the introductory 4 pills/day dose. When after two weeks, I pulled back to half of that as recommended, there seemed to be some decline in my mood, so I’m still taking two of the Schiff pills and I’ve added 2000 IU of D3.
I think it’s doing my mood some good, and my knee healed nicely.
I take a generic multivitamin (incl. iron) daily.
I take half a multi-vitamin, 4000 IU of vitamin D (in gelcaps, not dry tablets, since vit D is fat soluble and better absorbed that way), and 3g of fish oil (adding up to 1800mg of EPA/DHA).
Update: I just bought some iron-free multivitamins based on information from this thread.
I’m similar. I take a no-iron multivitamin, not one of those mega-dose ones but just a regular one to prevent nutrient deficiencies in my diet from sneaking up on me. When I can’t get enough sun to stimulate vitamin D production, I take cheap 1000 IU pills. And because I don’t drink milk very often, I take one or two calcium-magnesium-zinc pills daily. This stuff is cheap, and it strikes me as a pretty well-rounded, moderate combination.
Any suggestions for improvements?
Regarding vitamin D:
Taking 1000 IU of vitamin D only when you don’t get enough sun is very likely to be insufficient. Because people wear clothes and spend most of their time indoors, even Brazilians are highly deficient in vitamin D.
A study I remember from PubMed but currently can’t find among the flood of vitamin D related studies concluded that 3000 IU per day is a likely average recommendation for adult males. Also, because the response to supplementation varies greatly from person to person the most advisable approach is regular (~ 3 per year) blood testing coupled with adjustments of dosage. For example some people can get to 50 ng/ml blood 25(OH)D levels by taking 2000IU, while others need 7000IU to reach the same levels.
The minimum recommended blood level is usually 50 ng/ml, which is as of now well established by evidence, but some say that 60-80 ng/ml is still completely safe and might provide some additional benefits, so it might be a better target. I personally take 5000 IU per day regardless of which season it is and I hover around 65 ng/ml.
The Vitamin D Council is a decent source of additional information.
Also, see this great chart for a summary on several risks associated with low vit D levels. It’s a bit dated now but it’s still striking.
One supplement that is now being widely prescribed is Vitamin D. Testing to see if one is Vitamin D deficient is common here in the Boston area. It is being suggested that insufficient Vitamin D is linked to multiple health ailments- autoimmune diseases and increased risk of heart attacks in particular. My doctor prescribed Vitamin D supplements for me.
Would the silliness of this linear model have been visible in a scatter plot? Is there any point in using linear regression, when lines are a subset of more complex curves? (I haven’t read the papers, no access.)
It’s possible that better mathematical tools would improve conclusions in studies of this kind. But I increasingly believe that the problem lies not in the mathematics but in the very nature of the inquiry: questions of the type “does vitamin A improve health” simply cannot be answered on the basis of the information obtainable through these kinds of small sample size studies. The information content of the empirical data is far smaller than the complexity of the system under examination, and so no meaningful conclusions can be obtained.
Of all the scientific conclusions justified on the basis of statistical analysis, the most universally agreed-upon is probably that smoking increases mortality. But even that seemingly rock-solid result is surrounded by a fog of uncertainty.
The sample size was over 200,000 patients. You seem to be saying that medicine can’t be science.
Thank you for a very nice article.
Thank you for making such a polite and kind comment!
Must be new here. :)
Then vitamins are not evil, as the paper claims.
Roughly speaking, can we assume that the right thing they should have written as a conclusion in the paper would have been the weaker claim:
“Vitamins X and Y are evil under these daily doses; further studies are needed to confirm if they are beneficial in some other dosage, and if so, which is the optimal one.”
?
It would have been had that been the only problem with the study. See the comments by myself, Dr Steve Hickey, Len Noriega etc here http://www.cochranefeedback.com/cf/cda/feedback.do?DOI=10.1002/14651858.CD007176&reviewGroup=HM-LIVER
Meta-analyses in general are not to be trusted—at all...
I, too, would like to hear more about the problems of meta-analysis in general. So far it’s naively seemed to me that they’d be more reliable than isolated studies, because they pool a larger amount of results and thus reduce the effect of chance / possible flaws or artifacts in the individual studies.
I think the problem is that each study has to make many arbitrary decisions about aspects of the experimental protocol. Each such decision will be made the same way for every subject in a single study, but will vary across studies. There are so many such decisions that, if the meta-analysis were to include them as independent variables, each study would introduce enough new variables to cancel out the statistical power gain of introducing that study.
I would love to hear a more detailed discussion of the problems with meta-analysis.
Very, very briefly (I’m preparing a very long blog post on this, but I want to post it when Dr Hickey, my uncle, releases his book on this, which won’t be for a while yet) - meta-analysis is essentially a method for magnifying the biases of the analyst. When collating the papers, nobody is blinded to anything so it’s very, very easy to remove papers that the people doing the analysis disagree with (approx 1% or fewer of papers that turn up in initial searches end up getting used in most meta-analyses, and these are hand-picked). On top of this, many of them include additional unpublished (and therefore unreviewed) data from trials included in the analysis. You can easily see how this could cause problems, I’m sure. There are many, many problems of this nature. I’d strongly recommend everyone do what I did (for a paper analysing these problems) - go to the Cochrane or JAMA sites, and just read every meta-analysis published in a typical year, without any previous prejudice as to the worth or otherwise of the technique. If you can find a single one that appears to be good science, I’d be astonished...
A good systematic review (meta-analysis is the quantitative component thereof, although the terms are often incorrectly used interchangeably) will define inclusion criteria before beginning the review. Papers are then screened independently by multiple parties to see if they fit these criteria, in an attempt to limit the bias introduced in choosing which to include. It shouldn’t be quite as arbitrary as you imply.
This is meant to counter publication bias, although it’s fraught with difficulties. Your comment seems to imply that this practice deliberately introduces bias, which is not necessarily the case.
Are you aware of the PRISMA statement? If so, can you suggest improvements to the recommended reporting of systematic reviews?
So you’re doing a meta-analysis to show that meta-analysis doesn’t work?
If your thesis is correct, you should also be able to show that meta-analysis does work, by judicious choice of meta-analyses. Which means that there should be some good meta-analyses out there!
Do you have an online copy of this paper? Sounds like my kind of thing.
Afraid not, just the abstract is online at the moment (google “Implications and insights for human adaptive mechatronics from developments in algebraic probability theory”—would point you to a link directly, but Google seems to think that my work network is sending automated requests, and has blocked me temporarily).
That title will turn away medical people.
Wasn’t my title ;)
Thanks!
Actually only synthetic beta carotene and things like retinol are implicated.
Beta carotene comes in multiple forms—and it is widely recognised that the form in carrots and greens can’t be overdosed on—even if you make it into tablets. It does make your skin turn orange—but seems otherwise harmless.
There’s a broadly similar story for retinol—and scientists have known about this for quite a long time, now.
This would have been a much better title. Otherwise, great post.