Safety can be dangerous
In 2005, Hurricane Rita was associated with 111 deaths. Only 3 were caused by the hurricane itself; 90 were caused by the mass evacuation.
The FDA is supposed to approve new drugs and procedures if the expected benefits outweigh the expected costs. If they actually did this, their errors on both sides (approvals of bad drugs vs. rejections of good drugs) would be roughly equal. The most-publicized drug withdrawal in the past 10 years was that of Vioxx, which the FDA estimated killed a total of 5,165 people over 5 years. This suggests that the best drug the FDA rejected during that decade could have saved about 1,000 people/year. During that decade, many drugs were (or could have been) approved that might save more than that many lives every year. Gleevec (invented 1993, approved 2001) is believed to save about 10,000 lives a year. Herceptin (invented in the 1980s, began human trials in 1991, approved for some patients in 1998, more in 2006, and more in 2010) was estimated to save 1,000 lives a year in the United Kingdom, which would translate to roughly 5,000 lives a year in the US. Patients on apixaban (discovered in 2006, not yet approved) have 11% fewer deaths from stroke than patients on warfarin, and stroke causes about 140,000 deaths/year in the US. To stay below the expected drug-rejection error level of 1,000 people/year, given just these three drugs (and assuming that apixaban pans out and can save 5,000 lives/year), the FDA would need to have a faulty-rejection rate F such that F(10,000) + F(5,000) + F(5,000) < 1,000, i.e., F < 5%. This seems unlikely.
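For concreteness, here is the arithmetic behind that 5% bound as a small Python sketch (the per-drug figures are the estimates quoted above, not new data):

```python
# Bound on the tolerable faulty-rejection rate F, using the figures above:
# the Vioxx comparison implies an error budget of ~1,000 lives/year, and the
# three drugs could save roughly 10,000 + 5,000 + 5,000 lives/year.
lives_saved_per_year = [10_000, 5_000, 5_000]   # Gleevec, Herceptin, apixaban (estimates)
error_budget = 1_000                            # lives/year

max_faulty_rejection_rate = error_budget / sum(lives_saved_per_year)
print(f"F < {max_faulty_rejection_rate:.0%}")   # F < 5%
```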
ADDED: One area where this affects me every day is in branching software repositories. Every software developer agrees that branching the repository head for test versions and for production versions is good practice. Yet branching causes, I would estimate, at least half of our problems with test and production releases. It is common for me to be delayed one to three days while someone figures out that the software isn’t running because they issued a patch on one branch and forgot to update the trunk, or forgot to update other development or test versions that are on separate branches. I don’t believe in branching anymore—I think we would have fewer bugs if we just did all development on the trunk, and checked out the code when it worked. Branching is good for humongous projects where you have public releases that you can’t patch on the head, like Firefox or Linux. But it’s out of place for in-house projects where you can just patch the head and re-checkout. The evidence for this in my personal experience as a software developer is overwhelming; yet whenever I suggest not branching, I’m met with incredulity.
Exercise for the reader: Find other cases where cautionary measures are taken past the point of marginal utility.
ADDED: I think that this is the problem: You have observed a distribution of outcome utilities for some category of event, after each of which you took some action A. You observe a new instance of this event. You want to predict the outcome utility of taking action A for this event.
Some categories have a power-law outcome distribution with a negative exponent b, indicating there are fewer events of large importance: the number of events of size U = e^(c - bU). Assume that you don't observe all possible values of U: events of importance < U0 are too small to observe, and events with large U are very uncommon. It is then difficult to tell whether the category has a power-law distribution without a lot of previous observations.
If a lot of event categories have a distribution like this, in which big impacts are bad and events are usually insignificant but sometimes catastrophic, then it's likely rational to treat these events as if they will be catastrophic. And if you don't have enough observations to know whether the distribution is a power law or something else, it's rational to treat it as a power-law distribution, to be safe.
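Here is a minimal simulation sketch of that identification problem (the threshold U0, sample size, and exponent are illustrative assumptions, not figures from the post): with only a handful of observations above the detection threshold, an exponential tail and a power-law (Pareto) tail often fit the data about equally well.

```python
# Draw events from an exponential tail (number of events of size U ~ e^(c - b*U)),
# censor everything below the detection threshold U0, keep a small sample, and
# compare maximum-likelihood fits of an exponential tail vs. a power-law tail.
import numpy as np

rng = np.random.default_rng(0)

U0 = 1.0          # events smaller than this are never observed (assumption)
n_events = 30     # small sample, as in the "not many previous observations" case

b = 1.0
events = rng.exponential(scale=1.0 / b, size=n_events * 10)
observed = events[events >= U0][:n_events]

def exp_loglik(x, u0):
    lam = 1.0 / np.mean(x - u0)                  # MLE rate for a shifted exponential
    return np.sum(np.log(lam) - lam * (x - u0))

def pareto_loglik(x, u0):
    alpha = len(x) / np.sum(np.log(x / u0))      # MLE exponent for a Pareto tail
    return np.sum(np.log(alpha) + alpha * np.log(u0) - (alpha + 1) * np.log(x))

print("exponential fit:", exp_loglik(observed, U0))
print("power-law fit:  ", pareto_loglik(observed, U0))
# With ~30 censored observations the two log-likelihoods are usually close,
# so the data alone rarely settle which kind of tail you are facing.
```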
Could this account for the human risk-aversion “bias”?
If you are the FDA, you are faced with situations where the utility distribution is probably such a power-law distribution mirrored around zero, so there are a few events with very high utility (save lots of lives), and a similar number of events with the negative of that utility (lose that many lives). I would guess that situations like that are rare in our ancestral environment, though I don’t know.
The ratio of 90 deaths from the evacuation to 3 deaths from the hurricane looks bad, but is in fact irrelevant. The proper comparison would be 90 deaths from evacuating, to X deaths that would have resulted had those people stayed put, or performed some other action in preparation. While it’s possible a proper estimate of the risk of staying near the hurricane’s projected path would show that it was the less dangerous course, the abstract you linked doesn’t suggest that is a topic covered by the paper.
I guess the argument is the same as for the drugs: the evacuation effort should be scaled back until it causes only as many deaths as the hurricane.
Actually, until the marginal lives lost and saved are equal, which says nothing about the total.
Also, this is technically not correct:
Actually, if the FDA really did this, the marginal (in this case, most dangerous) drug approved should kill about as many people as it saves. But since every drug approved before that one would save more people than it killed, on net there should be more people saved than killed.
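A toy illustration of the marginal-vs-total point (the numbers are made up): approve drugs in descending order of net lives saved, stopping where the marginal drug nets roughly zero; the marginal approval is a wash, but the total is positive.

```python
# Hypothetical net lives saved per year for a batch of candidate drugs,
# sorted from best to worst.
net_lives_per_drug = [900, 500, 200, 50, 0, -100, -400]

# Approve every drug whose expected net benefit is non-negative.
approved = [d for d in net_lives_per_drug if d >= 0]

print(approved[-1])   # 0    -> the marginal approved drug kills about as many as it saves
print(sum(approved))  # 1650 -> but the approvals as a whole save lives on net
```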
Right—I just logged in to try to fix this, after realizing that if what I originally wrote were true, drugs would have net zero benefit.
(An additional complication is that approval of a good drug gives continued benefits indefinitely, while approval of a bad drug does not give continued costs—its badness is found out and it is taken off the market.)
A lot of this good drug stuff seems to be drugs that are taking a long time to pass, but eventually will. These can be compared simply with the bad drugs.
Yay for marginal cost does not equal average cost!
[libertarian alert]
I’m not sure the drug example is a safety problem per se; it looks more like an incentive problem to me.
If an FDA official approves a bad drug that kills 1000 people/year, he probably gets canned. If he rejects a good drug that would have saved 1000 lives/year...well, no one including him will actually know how many lives it would have saved, and he will take his paycheck home and sleep soundly at night.
Can you come up with an example that doesn’t involve government?
Additionally, solving the problem of getting people to take the right drugs may be more complicated than just putting out drugs that have a tiny positive expected value. The market is humans, after all, and humans themselves are loss-averse. There are also credibility problems: it’s easy to blame the regulators for failures, and so humans do. And so people demand that the regulators be very selective, and the regulators oblige. If regulators didn’t respond like this, people might behave differently, maybe avoiding new drugs or distrusting doctors.
Gerd Gigerenzer estimates that there were 1,600 excess road fatalities due to increased driving post 9/11.
Leads me to wonder whether there are any terrorist attacks that killed more people via indirect effects like that than from the attack’s direct effect. (9/11 fits if I count the war in Afghanistan and/or the Iraq War, although that feels a bit like cheating.)
It depends on what you mean by terrorism and indirect effects. The assassination of Franz Ferdinand had a few repercussions.
Every other attack on aircraft ever, I should think. Even if they only had 1/20th the effect on encouraging driving over flight, very few of them will have killed 80 people.
Is that true? I thought most aircraft attacks were an all-or-nothing thing. The aircraft goes down or it doesn’t.
You’re right—I was thinking of the sort of attack where they capture rather than down the aircraft, but that hardly ever happens these days.
He means each air attack encouraged driving instead of flying, and the extra driving killed more people than the “all” from the attack.
Good point: we tend to act as if the worst-case scenario were a given, without regard for the relatively foreseeable negative consequences of safety measures.
This is more of a tradeoff of time and money than lives saved, but see our continued insistence that nobody can get on an airplane without taking their shoes off. It’s a policy that has almost certainly done no good and never stood a serious chance of doing any good. Here a political scientist explains why it’s very hard to walk back stupid policies like that even when virtually everyone agrees they should be walked back: in most cases, someone’s got to take the initiative to walk it back, and in the event of a very-unlikely disaster, voters will disproportionately blame the walker-back, but will not blame him/her for the inconvenience they experience while the policy remains in place. There’s no incentive to do the obviously intelligent thing.
All this said, in order to fully argue for the Rita example, you’d have to show that the people who ordered the evacuation were wrong to do so. Perhaps many more than 90 lives would have been lost had the 2.5 million evacuees remained in place. Perhaps they would not have, but the best available damage forecasting indicated that they probably would be. As the linked report says, the hurricane narrowly missed major population centers. If the best available forecasting indicated that there was a 10% chance that it would strike those population centers, and if it did there was a 90% chance that 2000 people would die, whereas if the evacuation were ordered there was a 90% chance that 100 people would die in the process of evacuating, then the evacuation was the right call. (Numbers pulled out of my ass, obviously, but just illustrating that even in the case of a seemingly “anti-safety” scenario like Hurricane Rita, it’s hard to figure out, even in retrospect, what course of action is best.)
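Plugging in those (admittedly made-up) numbers, the expected-death comparison looks like this:

```python
# All probabilities and death tolls are the illustrative figures from the
# comment above, not real forecasts.
p_strike = 0.10                    # chance the hurricane hits the population centers
p_mass_deaths_given_strike = 0.90
deaths_given_strike = 2000

p_evac_deaths = 0.90
deaths_from_evacuation = 100

expected_deaths_stay = p_strike * p_mass_deaths_given_strike * deaths_given_strike   # 180
expected_deaths_evacuate = p_evac_deaths * deaths_from_evacuation                    # 90

print(expected_deaths_stay, expected_deaths_evacuate)   # evacuating halves expected deaths
```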
There are certainly a lot of cost-benefit considerations biased towards excessive safety, but what I find even more fascinating is the family of effects that are known under the monikers of Tullock effects, Peltzman effects, risk compensation, and risk homeostasis. The basic idea underlying all of these is that regulations and interventions that forcibly lower certain risks motivate people to compensate by adopting other risky behaviors, so that the overall level of risk remains the same (though possibly redistributed), or even gets worse due to biases or externalities. A classic example is when safer cars lead to more reckless driving.
Where I live, there’s a busy road with two car lanes and a bike lane in each direction. A while ago, the city government decided that the bike lane was too narrow, and for the safety of the cyclists, they moved the lane markers so as to widen the bike lane and make the car lanes narrower. A few days after this change, I saw a happy couple riding on the widened bike lane in parallel.
This still seems like a net win, however. The couple may not be any safer, but they get to ride in parallel. Presumably reckless driving involves something like texting or speeding, both of which are beneficial if you don’t crash or get caught.
(If people are acting less safely in ways that don’t have any benefits, then this argument fails, of course. I’d be surprised if that was the case.)
A somewhat different example (though one that appears to be based on a fairly limited amount of data):
Speed bumps slow down emergency vehicles, and that apparently costs more lives than are saved by reduced traffic collisions.
What if you have two drugs, one that saves 1000 lives per year, and another that costs 1000 lives per year? I think you’re saying that it’s a wash whether you approve both or ban both. But if you approve both and eventually ban the deadly drug, it’s a net win. The expected value of approving a drug is higher than the expected value of using it indefinitely. (I see that you address this in the comments. But I don’t think this is “an additional complication” but a huge bias in this methodology.)
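A toy version of that asymmetry (the ten-year horizon and the three-year withdrawal lag are my assumptions, not figures from the thread):

```python
horizon_years = 10
lives_saved_per_year = 1000      # the good drug
lives_lost_per_year = 1000       # the bad drug, until it is withdrawn
years_until_withdrawal = 3       # hypothetical detection-and-withdrawal lag

# Approve both: the good drug keeps saving lives for the whole horizon,
# while the bad drug only does damage until it is pulled from the market.
net_if_both_approved = (horizon_years * lives_saved_per_year
                        - years_until_withdrawal * lives_lost_per_year)   # +7000
net_if_both_banned = 0

print(net_if_both_approved, net_if_both_banned)
```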
The continual push for “higher standards” leads to fewer drug approvals. It is hard to assess the quality of the marginally rejected drugs, but the other effect is to delay approval, and it is easy to retrospectively measure the effect of delay. Gieringer showed that in the 50s, 60s, and 70s, when the US and Europe had different standards, the lives lost to delays on one side of the Atlantic swamped the lives lost to drugs that were eventually banned. That’s all drugs that were eventually banned, not just drugs that were not approved because of “higher standards.” In fact, the most famous differential rejection, thalidomide, was rejected by the US on something of a whim, even though the US generally had laxer standards at the time.
References:
The Risks of Caution, by Tim Tyler
The Risk of Excessive Caution, by John C. Downen
The Perils of Precaution, by Max More
/waves
This is only relevant if you can give us data on the drugs that would kill thousands that the FDA prevents from being marketed.
No; that would be relevant if I were claiming that the FDA should be abolished. Here, I’m only claiming that they do not, in fact, approve a drug if its expected benefits outweigh its expected costs. Approval appears to require something closer to a 10:1 benefit/cost ratio.
Sorry, I was thrown off by your last sentence: “Exercise for the reader: Find other cases where cautionary measures are more dangerous than nothing.” Seems like a fairly explicit point about how the FDA is worse than nothing.
Oh. I see what you mean.
Possibly I shouldn’t have mentioned the FDA. The Rita data is stronger and more interesting; but all the comments are on the FDA paragraph.
Looking up a few other hurricanes, I haven’t seen a continuation of this trend. Katrina had more deaths among people who didn’t evacuate, obviously, and Andrew seemed to have a 50/50 split (counting deaths from clean-up as “evacuation” deaths, actually).
Good point. Assume an exponential distribution of deaths per hurricane, and a symmetrical forecast error (e.g., a predicted category 3 storm is equally likely to come out category 2 or category 4), and the optimal response would be to treat the predicted category 3 as if it would be category 4. This would lead us to usually have more deaths from evacuation than from the hurricane for less-destructive storms; this would be more than made up for on the occasions when storms turned out to be more destructive. So the Rita evacuation may not have been an error.
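A toy version of that calculation (the death tolls and error probabilities are illustrative assumptions):

```python
# Deaths if unprepared grow roughly exponentially with category, and a
# predicted category-3 storm is assumed to land one category low, on target,
# or one category high with probability 0.25 / 0.50 / 0.25.
deaths_if_unprepared = {2: 10, 3: 100, 4: 1000}
forecast_outcome_probs = {2: 0.25, 3: 0.50, 4: 0.25}

expected_deaths = sum(p * deaths_if_unprepared[cat]
                      for cat, p in forecast_outcome_probs.items())
print(expected_deaths)   # 302.5 -- dominated by the category-4 tail, which is
                         # why preparing for a category 4 can be the right call
```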
Probably not a good heuristic, since category 2 storms are four times more common than category 4 storms. Not sure how much this affects the calculations, but FWIW.
Mmm, I thought the FDA case was stronger, as this is a pretty strong counterargument to the Rita assertion.
Are clinical trials submitted to the FDA/TGA/ETGA registered before the trials commence? I know that is best practice, but I am unclear whether it is required by regulation. If it is not, could there be a business in selling ‘pre-registration accreditation’, letting pharmaceutical and medical device manufacturers market their products as meeting a higher standard?
On the note of software branching:
Here at Google, we constantly build everything from head, to the point that every build invocation is essentially a dramatically optimized (and distributed, and limited to just the transitive dependencies of your current target, but still...) “make world”. Pretty much everything is in a single repository.
As a result, we occasionally have to deal with other people breaking our code. That happens maybe once every few months. And we don’t need to manage divergent versions, or forget which branches we’d applied some patch to.
I’m pretty sure the non-branching workflow can only be called a stunning success.
Are there separate neural systems for risk-avoidance and reward-seeking?