Long-chain correlation: lead paint and crime
A friend has been asking for my views on whether there’s anything to the correlation between changing levels of lead in paint (and automotive exhaust) and levels of crime. He quoted from a Reason blog post:
So Nevin dove in further, digging up detailed data on lead emissions and crime rates to see if the similarity of the curves was as good as it seemed. It turned out to be even better: In a 2000 paper (PDF) he concluded that if you add a lag time of 23 years, lead emissions from automobiles explain 90 percent of the variation in violent crime in America. Toddlers who ingested high levels of lead in the ’40s and ‘50s really were more likely to become violent criminals in the ’60s, ‘70s, and ’80s.
I responded with the following:
Sounds like a stretch to me. I’d want to hear that they didn’t test more than five other hypotheses before coming to that conclusion, or that the p-value was far better than .05. I kind of doubt that either is the case.
He’s apparently continued to pursue the question, and just forwarded these remarks from Steven Pinker that I thought were very illuminating, and probably deserve a place in this community’s toolkit for skeptics. Pinker’s main point is that the association between lead and crime rests on a long, tenuous chain of suppositions, and several of the intermediate links should be far easier to measure. Finding correlations at this distance is not very informative.
Does the phrase “long-chain correlation” stick in your head and make it easier to dismiss this kind of argument?
This is not a hypothesis-testing problem! It doesn’t matter what the p-value is; you can’t convincingly conclude causation from correlation without showing a mechanism, randomizing, or finding a natural experiment. What Pinker is saying (and I completely agree with him) is that when the unobserved causal graph between two variables is large enough, there are almost certainly big confounding variables in there that you haven’t accounted for.
Again—confounding is not a statistical issue; you can’t just get around it by being clever with p-values/Bayes’ theorem/whatever.
edit: By mechanism I mean a direct mechanism spanning 23 years, not “lead causes brain damage” (because such a high-level observation does not rule out confounding sources).
I didn’t bother reading Drum’s article the first time I saw it circulating. I’ve known about Nevin’s papers for a couple of years, already decided they were interesting but weak evidence, and Drum didn’t seem to be bringing much new to the table. But I’ve now looked at his article, and it does reference stuff I wasn’t aware of, like the city-level correlations between vehicular lead emissions and assaults.
We now have studies based on MRI scans, ecological correlations at multiple levels of aggregation, and longitudinal studies of individuals. But they’re still all observational, and could still be affected by individual-level confounding factors. Qualitatively the causal mechanism is obviously real,* but it’s only backed up by actual experiments in vivo for blood levels of 10-50 μg/dL (as far as I know). Drum’s aiming an order of magnitude lower, where I’d still expect some effect, but I don’t trust observational studies or simple extrapolation to estimate it precisely.
The very very obvious thing to do now is run a nice, big RCT. There are RCTs for lesser interventions (e.g.) but I see none for window replacement or soil cleanup.
So: you select 100 city neighbourhoods. Send an army of researchers to each of them to bang on every door and recruit as many families with infants & toddlers as they can. Send a second wave of researchers to measure the soil in the recruited kids’ houses, take blood samples, administer the relevant IQ/development tests (test the parents too if you’re feeling hardcore), and write everyone’s demographic details (sex, age, race, etc.) down on their clipboards. Now you can choose 50 neighbourhoods at random as the treatment group; don’t forget to check it’s statistically comparable to the control group!
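For concreteness, here’s a minimal sketch of that cluster-level randomization and balance check in Python. The covariates, distributions, and numbers are invented for illustration; a real study would use whatever the baseline wave actually measured.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical baseline data: one row per neighbourhood, with aggregate
# covariates from the first wave of door-knocking (all values made up).
neighbourhoods = pd.DataFrame({
    "id": range(100),
    "mean_soil_lead_ppm": rng.lognormal(6, 0.5, 100),
    "mean_blood_lead_ugdl": rng.lognormal(1.5, 0.4, 100),
    "median_income": rng.normal(40_000, 8_000, 100),
})

# Randomize at the neighbourhood (cluster) level: 50 treatment, 50 control.
treated_ids = rng.choice(neighbourhoods["id"], size=50, replace=False)
neighbourhoods["treated"] = neighbourhoods["id"].isin(treated_ids)

# Balance check: the two arms should look statistically comparable at baseline.
for col in ["mean_soil_lead_ppm", "mean_blood_lead_ugdl", "median_income"]:
    t = neighbourhoods.loc[neighbourhoods["treated"], col]
    c = neighbourhoods.loc[~neighbourhoods["treated"], col]
    _, p = stats.ttest_ind(t, c)
    print(f"{col}: treatment {t.mean():.1f} vs control {c.mean():.1f} (p = {p:.2f})")
```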
Having done that, get another load of people to bang on every door in the 50 neighbourhoods in the treatment arm, and offer them money to replace their windows and their soil. (I don’t know whether you can get away with not replacing windows & soil for the residents who don’t have kids in the study. If you could skip them you’d save a lot of money.) Then do the cleanup jobs.
That’s not even the hard part. You now have to keep track of all the families you’ve recruited for the next 3 decades. A lot of them will move elsewhere. (Boosting the value of their houses by eliminating the lead might even encourage them to sell up.) That’s fine — you just pay some academics to keep track of them. A year down the line, visit all of the families and run the tests again to measure the short-term changes. Some families will refuse the follow-up visit. That’s OK too — you just keep their details on file, ’cause another 5 years later, when the kids are in school, you’re gonna do a third wave of testing. Then a few more years later, when the kids are hitting puberty, you’re gonna do a fourth wave of testing, and you’re gonna get their criminal records from the cops. And again, a decade later, once they’re adults, you do the testing and criminal record check again. And you might do another one down the line just to check there aren’t any late-breaking effects to bite you. At each step, you compare the control & treatment subjects and publish the results.
How much might this all cost? Say there are 100 families’ houses to clean in each of the 50 treatment neighbourhoods. That’s 5000 houses. Drum says that Nevin says that replacing 16 million houses’ windows would cost about $200 billion in total, or $12,500 per house. Soil cleanup costs about as much again, making the cleanup total $25,000 per house, or $125 million for the 5000 treatment houses. Add the cost of the blood testing, paying a few academics to run the study for decades, and other odds & sods, and the total cost might be something like $150 million. Oh yeah, and your study still wouldn’t give you an unbiased estimate of the benefits of lead abatement; it’d underestimate them because of the time the children spent being exposed to lead in their houses before you showed up to run the study. But it’d give you a robust answer to a $400 billion question, and you could get part of that answer by the end of the decade.
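Spelling that arithmetic out (same figures as above; the overhead line is just a rough placeholder implied by the ~$150 million total):

```python
# Back-of-the-envelope costing for the proposed RCT.
treatment_neighbourhoods = 50
families_per_neighbourhood = 100
houses = treatment_neighbourhoods * families_per_neighbourhood  # 5,000

window_total_usd = 200e9          # Nevin (via Drum): windows for 16M houses
houses_nationwide = 16e6
window_per_house = window_total_usd / houses_nationwide         # $12,500
soil_per_house = window_per_house                               # about as much again
cleanup_per_house = window_per_house + soil_per_house           # $25,000

cleanup_total = houses * cleanup_per_house                      # $125 million
overhead = 25e6  # blood tests, decades of academic salaries, odds & sods (guess)
print(f"{houses} houses at ${cleanup_per_house:,.0f} each = ${cleanup_total / 1e6:,.0f}M")
print(f"Study total: roughly ${(cleanup_total + overhead) / 1e6:,.0f}M")
```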
* Blood lead poisoning has been observed since antiquity; sufficiently high exposure causes death within weeks or months; lead-treated lab animals show behavioural deficits; non-fatal but high blood lead levels cause obvious symptoms in children that can be partially reversed with chelation.
A study sort of like this was done in Rochester, and they found that nothing they did changed blood lead levels very much and so they didn’t learn anything from it. I guess they could go further with actually replacing people’s houses.
The hidden agenda of Extreme House Makeover.
One form of more convincing evidence based on observational longitudinal data is using g-computation to adjust for the so-called “time-varying confounders” of lead exposure.
A classic paper on this from way back in 1986 is this: http://www.biostat.harvard.edu/robins/new-approach.pdf
The paper is 120 pages, but the short version is: in graphical terms, all you do is pretend that you are interested in lead exposure interventions (via do(.)) at every time slice, and identify this causal effect from the observational data you have. The trick is that you can’t adjust for confounders as usual, because of this issue:
C → A1 → L → A2 → Y
Say A1, A2 are exposures to lead at two time slices, C is baseline confounders, L is an intermediate response, and Y is a final response. The issue is the usual adjustment here:
$$p(y \mid \text{do}(a_1, a_2)) = \int_l \int_c p(y \mid a_1, a_2, c, l)\, p(c, l)\, dc\, dl$$
is wrong. That’s because C, L, and Y are all confounded by things we aren’t observing, and moreover, if you condition on L, you open a path A1 → L <-> Y via these unobserved confounders which you do not want open. Here L is the “time-varying confounder”: for the purposes of A2 we want to adjust for it, but for the purposes of A1 we do not. This implies the above formula is actually wrong and will bias your estimate of the effect of the early lead exposure A1 on Y.
What we want to do instead is this:
$$p(y \mid \text{do}(a_1, a_2)) = \int_l \int_c p(y \mid a_1, a_2, c, l)\, p(l \mid a_1, c)\, p(c)\, dc\, dl$$
The issue here is you still might not have all the confounders at every time slice. But this kind of evidence is still far better than nothing at all (e.g. reporting correlations across 23 years).
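To make the difference concrete, here’s a toy simulation sketch in Python (binary variables, made-up coefficients). An unobserved U plays the role of the L <-> Y confounding arc; the naive adjustment from the first formula comes out biased, while the g-formula recovers the true effect:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500_000

# Toy data for C -> A1 -> L -> A2 -> Y, with an unobserved U driving
# both L and Y (the L <-> Y confounding arc described above).
C = rng.binomial(1, 0.5, n)
U = rng.binomial(1, 0.5, n)                     # never observed
A1 = rng.binomial(1, 0.2 + 0.4 * C)             # early exposure
L = rng.binomial(1, 0.2 + 0.3 * A1 + 0.3 * U)   # intermediate response
A2 = rng.binomial(1, 0.2 + 0.4 * L)             # later exposure
Y = rng.binomial(1, 0.1 + 0.2 * A1 + 0.2 * A2 + 0.3 * U)
df = pd.DataFrame({"C": C, "A1": A1, "L": L, "A2": A2, "Y": Y})

def mean_y(a1, a2, c, l):
    s = df[(df.A1 == a1) & (df.A2 == a2) & (df.C == c) & (df.L == l)]
    return s.Y.mean()

def naive(a1, a2):
    # First formula: adjusts for (C, L) via their joint marginal p(c, l).
    return sum(mean_y(a1, a2, c, l) * ((df.C == c) & (df.L == l)).mean()
               for c in (0, 1) for l in (0, 1))

def g_formula(a1, a2):
    # Second formula: weights L by p(l | a1, c), then averages over p(c).
    return sum(mean_y(a1, a2, c, l)
               * (df[(df.A1 == a1) & (df.C == c)].L == l).mean()
               * (df.C == c).mean()
               for c in (0, 1) for l in (0, 1))

# True effect of do(1, 1) vs do(0, 0) in this simulation is 0.2 + 0.2 = 0.4.
print("naive:    ", round(naive(1, 1) - naive(0, 0), 3))          # biased low
print("g-formula:", round(g_formula(1, 1) - g_formula(0, 0), 3))  # close to 0.4
```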
Prediction: if you did this analysis, you would find no statistically significant effect on any scale.
For a proposed $400b intervention, a few hundred million seems like a pretty reasonable expenditure.
Agreed. (And come to think of it, I’m underplaying things by calling this a “$210 billion question” — the $400 billion total cost of the interventions is as relevant as the estimated annual return.)
“90 percent of the variation” is misleading when comparing the levels of one time-series against another. It’s very easy to find two time-series that regress almost perfectly on one another because both steadily increase. Looking at first-differences is more informative about possible causal relations. The image Cyan posted is effectively two data points from a first difference perspective: both went up and then both went down.
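The point is easy to demonstrate (Python sketch; the two series are independent random walks with drift, so any fit in levels is spurious by construction):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50  # e.g. 50 annual observations

# Two unrelated series that both drift steadily upward.
x = np.cumsum(rng.normal(0.5, 1, n))
y = np.cumsum(rng.normal(0.5, 1, n))

def r2(a, b):
    return np.corrcoef(a, b)[0, 1] ** 2

print(f"R^2 in levels:            {r2(x, y):.2f}")                    # typically large
print(f"R^2 in first differences: {r2(np.diff(x), np.diff(y)):.2f}")  # near zero
```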
Another graph from Nevin’s website is slightly more persuasive:
There you can see 4-6 corresponding changes in trend. Still not that impressive, but maybe enough to start looking more closely.
Interesting to see the murder rate stay almost flat through the 2000s even as lagged lead use plummets by 80% or so.
The lead/crime theory seems to have entered blogospheric conversation-space thanks to an article by Mother Jones writer/blogger Kevin Drum. Here’s a link to most of what he’s written on the subject in the past month. Here’s an excerpt from the short version/intro to the magazine article:
Your first MotherJones link seems a bit inaccurate… Try this one
I was about to post on this subject myself to see if there were any LessWrong opinions. It’s certainly more than just one US-wide correlation; similar correlations appear at the level of individual states and cities, and internationally too. Plus there are longitudinal studies (following cohorts of children with different measured lead exposures through time, and monitoring levels of school delinquency then criminality).
Oh goddammit… thanks for catching the bad link. Fixed.
Does having a good grip on the causal mechanism help increase our confidence in the result?
It’s pretty clear that lead causes cognitive damage. Cognitive damage in children (especially in high-functioning children with emotion-control issues) seems like a plausible cause of crime when those children grow up.
That doesn’t tell us the magnitude of the change, but does tell us what direction to expect the effect to be.
Pinker isn’t arguing that lead and crime have no association, but rather that the crime decline isn’t substantially caused by environmental lead contamination.
Not quite—he’s arguing that this is not good evidence that the crime decline is substantially caused by environmental lead contamination. He says a few times that it’s an interesting and plausible hypothesis; what he’s stressing is that this doesn’t constitute good evidence for it. The text is an essay on reasoning, not an essay on lead and crime.
And it’s worth noting the cohort studies Pinker suggests need to be done HAVE in fact been done, and while not a slam-dunk case, they are largely supportive of the hypothesis (at least if Drum’s article is to be believed; I haven’t yet dipped into the research papers).
Picking the time lag that maximises the fit between the two data sets is the kind of thing they teach you not to do in machine learning classes; it leads to overfitting.
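A small demonstration of the worry (Python; the two series are independent random walks, so there is no true relationship at any lag):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60

# Two independent random walks standing in for "lead" and "crime".
lead = np.cumsum(rng.normal(size=n))
crime = np.cumsum(rng.normal(size=n))

# Scan over candidate lags and keep whichever one fits best.
best_r2, best_lag = max(
    (np.corrcoef(lead[:n - k], crime[k:])[0, 1] ** 2, k)
    for k in range(1, 31)
)
print(f"best lag: {best_lag} years, R^2 = {best_r2:.2f}")  # often looks impressive
```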
The theory makes a prediction about the time lag at which the cross-correlation between the two series will be maximized: it’s the time interval needed for a generation to mature.
I’m pretty suspicious that it’s actually a postdiction.
I think the actual sequence of events is more like this: crime rates fell drastically all over the US starting in the very early nineties. It’s not often in social science that a phenomenon cries out for a causal explanation with a single overriding cause, but this was one such case.
The time-lag of the correlation provided enough evidence to bring the lead hypothesis out of the “epsilon probability” regime. That’s straightforward Bayesian reasoning—verifying a consequence (i.e., a prediction) of a hypothesis increases the plausibility of the hypothesis. Further predictions of the hypothesis were then verified—things like prospective longitudinal studies showing the association of blood lead levels and violence at the individual level, and natural experiments generated by the slightly different timings of various countries’ and various US states’ leaded-gasoline phase-outs.
I think “lead paint causes crime” is evaluated based on direct mechanisms of brain damage, behavioral changes, and somewhat controlled population studies.
This known mechanism is then offered as a dominant causal mechanism for historical crime rates based on their correlation to time lagged lead exposure rates.
It’s a perfectly reasonable way to try to explain historical crime rates. What it would not be is a reasonable way to establish that lead exposure induces crime, which is what I think you’re mistaking the procedure for.
The threat of confounders is inevitable in population studies, but that doesn’t mean you don’t do historical population modeling.