gwern comments on How confident are you in the Atomic Theory of Matter?

gwern 29 Jun 2013 16:05 UTC
2 points

> Most meta-analyses are done by people in the field, although I’m not sure whether they’re typically experts in the specific phenomenon they’re meta-analyzing.

My own impression has been this as well: if you already understand your basic null-hypothesis testing, a regular meta-analysis isn’t that hard to learn how to do.

> But an epidemiologist meta-analyzing observational studies generally can’t quantify confounding biases and analogous sources of systematic error.

Do you have any materials on epidemiological meta-analyses? I’ve been thinking of trying to meta-analyze the correlations between lithium in drinking water and outcomes like suicide and crime rates, but even after a few days of looking through papers and textbooks I still haven’t found any good resources on how to handle the problems in epidemiology or population-level correlations.
- satt 29 Jun 2013 17:07 UTC
  2 points
  
  > Do you have any materials on epidemiological meta-analyses? [...] I still haven’t found any good resources on how to handle the problems in epidemiology or population-level correlations.
  
  Not to hand. But (as you’ve found) I doubt they’d tell you what you want to know, anyway. The problems aren’t special epidemiological phenomena but generic problems of causal inference. They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors and (2) controlled experiments are harder to do in epidemiology.
  
  If I were in your situation, I’d probably try running a sensitivity analysis. Specifically, I’d think of plausible ways confounding would’ve occurred, guesstimate a probability distribution for each possible form of confounding, then do Monte Carlo simulations using those probability distributions to estimate the probability distribution of the systematic error from confounding. This isn’t usually that satisfactory, since it’s a lot of work and the result often depends on arsepulls.
  
  But it’s hard to do better. There are philosophers of causality out there (like this guy) who work on rigorous methods for inferring causes from observational data, but as far as I know those methods require pretty strong & fiddly assumptions. (IlyaShpitser can probably go into more detail about these methods.) They also can’t do things like magically turn a population-level correlation into an individual-level correlation, so I’d guess you’re SOL there.
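
  The sensitivity analysis described above can be sketched as follows. This is a minimal illustration, not a recommended procedure: the observed effect, the two confounders, and their bias distributions are all made-up placeholders, and it assumes each confounder's bias is additive on the scale of the effect estimate and independent of the others.

```python
import random
import statistics


def simulate_adjusted_effect(observed_effect, bias_samplers, n_draws=10_000, seed=0):
    """Monte Carlo draws of the effect after subtracting simulated confounding bias."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # One draw of the total systematic error: sum of each confounder's bias.
        total_bias = sum(sampler(rng) for sampler in bias_samplers)
        draws.append(observed_effect - total_bias)
    return draws


# Hypothetical observed association (e.g. a log rate ratio); not a real estimate.
observed_effect = -0.20

# Guesstimated bias distributions, one per imagined form of confounding.
# Both the choice of distributions and their parameters are assumptions.
bias_samplers = [
    lambda rng: rng.gauss(-0.05, 0.05),     # hypothetical confounder 1
    lambda rng: rng.uniform(-0.10, 0.02),   # hypothetical confounder 2
]

draws = sorted(simulate_adjusted_effect(observed_effect, bias_samplers))
median = statistics.median(draws)
lower = draws[int(0.025 * len(draws))]        # approximate 2.5th percentile
upper = draws[int(0.975 * len(draws)) - 1]    # approximate 97.5th percentile
prob_negative = sum(d < 0 for d in draws) / len(draws)

print(f"bias-adjusted effect: median {median:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
print(f"share of draws where the effect stays negative: {prob_negative:.2f}")
```

  The output is only as good as the guessed distributions, which is the weakness noted above: change the samplers and the interval moves with them.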
  - gwern 29 Jun 2013 17:35 UTC
    2 points
    
    > But (as you’ve found) I doubt they’d tell you what you want to know, anyway. The problems aren’t special epidemiological phenomena but generic problems of causal inference. They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors
    
    I’ve found that there’s always a lot of field-specific tricks; it’s one of those things I really was hoping to find.
    
    > This isn’t usually that satisfactory, since it’s a lot of work and the result often depends on arsepulls.
    
    Yeah, that’s not worth bothering with.
    
    > (2) controlled experiments are harder to do in epidemiology.
    
    The really frustrating thing about the lithium-in-drinking-water correlation is that it would be very easy to do a controlled experiment. Dump some lithium into some randomly chosen county’s water treatment plants to bring it up to the high end of ‘safe’ natural variation, come back a year later and ask the government for suicide & crime rates, see if they fell; repeat n times; and you’re done.
    
    > They also can’t do things like magically turn a population-level correlation into an individual-level correlation, so I’d guess you’re SOL there.
    
    I’m interested for generic utilitarian reasons, so I’d be fine with a population-level correlation.
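
    The experiment proposed above (randomly pick counties, raise lithium levels, compare outcomes a year later) could be analyzed along these lines. Everything here is simulated: the number of counties, the size of the effect, and the noise level are hypothetical values chosen only to show the randomization and a permutation test on the difference in mean change.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical setup: 40 counties, half randomly assigned to added lithium.
counties = list(range(40))
treated = set(rng.sample(counties, 20))

# Simulated one-year change in suicide rate per 100k for each county.
# The treatment effect of -1.0 is an assumption for illustration only.
hypothetical_effect = -1.0
change = {
    c: rng.gauss(0.0, 2.0) + (hypothetical_effect if c in treated else 0.0)
    for c in counties
}


def mean_difference(treated_ids):
    """Mean change in treated counties minus mean change in the rest."""
    t = [change[c] for c in counties if c in treated_ids]
    u = [change[c] for c in counties if c not in treated_ids]
    return statistics.mean(t) - statistics.mean(u)


observed = mean_difference(treated)

# Permutation test: re-draw the assignment many times and count how often
# a difference at least as large as the observed one arises by chance.
n_perm = 5000
extreme = 0
for _ in range(n_perm):
    shuffled = set(rng.sample(counties, len(treated)))
    if abs(mean_difference(shuffled)) >= abs(observed):
        extreme += 1
p_value = (extreme + 1) / (n_perm + 1)

print(f"difference in mean change: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

    Because assignment is random, no confounder adjustment is needed; that is what makes the experiment so much cleaner than the observational correlations.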
    - satt 29 Jun 2013 20:04 UTC
      0 points
      
      > > They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors
      
      > I’ve found that there’s always a lot of field-specific tricks; it’s one of those things I really was hoping to find.
      
      Hmm. Based on the epidemiology papers I’ve skimmed through over the years, there don’t seem to be any killer tricks. The usual procedure for non-experimental papers seems to be to pick a few variables out of thin air that sound like they might be confounders, measure them, and then toss them into a regression alongside the variables one actually cares about. (Sometimes matching is used instead of regression but the idea is similar.)
      
      Still, it’s quite possible I’m only drawing a blank because I’m not an epidemiologist and I haven’t picked up enough tacit knowledge of useful analysis tricks. Flicking through papers doesn’t actually make me an expert.
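
      The regression-adjustment procedure described above can be illustrated with simulated data. This is a sketch under strong assumptions: there is a single confounder, it is measured without error, and every relationship is linear. The coefficients (0.5, 0.8, 1.0) are arbitrary. The adjusted slope is computed by residualizing both variables on the confounder, which gives the same coefficient as including the confounder in a multiple regression.

```python
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Residuals of y after regressing it on x."""
    slope, intercept = ols_slope_intercept(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(1)
n = 5000
true_effect = 0.5  # assumed causal effect of exposure on outcome

# The confounder drives both the exposure and the outcome.
confounder = [rng.gauss(0, 1) for _ in range(n)]
exposure = [0.8 * z + rng.gauss(0, 1) for z in confounder]
outcome = [
    true_effect * x + 1.0 * z + rng.gauss(0, 1)
    for x, z in zip(exposure, confounder)
]

# Naive regression ignores the confounder and overstates the effect.
naive_slope, _ = ols_slope_intercept(exposure, outcome)

# Adjusted regression: strip the confounder out of both variables first.
exposure_resid = residuals(confounder, exposure)
outcome_resid = residuals(confounder, outcome)
adjusted_slope, _ = ols_slope_intercept(exposure_resid, outcome_resid)

print(f"naive slope: {naive_slope:.2f}, adjusted slope: {adjusted_slope:.2f}")
```

      The adjustment only recovers the true effect here because the simulation hands over the right confounder. With variables picked out of thin air, there is no such guarantee.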
      
      > The really frustrating thing about the lithium-in-drinking-water correlation is that it would be very easy to do a controlled experiment.
      
      True. Even though doing experiments is harder in general in epidemiology, that’s a poor excuse for not doing the easy experiments.
      
      > I’m interested for generic utilitarian reasons, so I’d be fine with a population-level correlation.
      
      Ah, I see. I misunderstood your earlier comment as being a complaint about population-level correlations.
      
      I’m not sure which variables you’re looking for (population-level) correlations among, but my usual procedure for finding correlations is mashing keywords into Google Scholar until I find papers with estimates of the correlations I want. (For this comment, I searched for “smoking IQ conscientiousness correlation” without the quotes, to give an example.) Then I just reuse those numbers for whatever analysis I’d like to do.
      
      This is risky because two variables can correlate differently in different populations. To reduce that risk I try to use the estimate from the population most similar to the population I have in mind, or I try estimating the correlation myself in a public use dataset that happens to include both variables and the population I want.
      - gwern 29 Jun 2013 21:01 UTC
        0 points
        
        > (For this comment, I searched for “smoking IQ conscientiousness correlation” without the quotes, to give an example.) Then I just reuse those numbers for whatever analysis I’d like to do. This is risky because two variables can correlate differently in different populations. To reduce that risk I try to use the estimate from the population most similar to the population I have in mind, or I try estimating the correlation myself in a public use dataset that happens to include both variables and the population I want.
        
        You never try to meta-analyze them with perhaps a state or country moderator?
        - satt 4 Jul 2013 20:43 UTC
          0 points

          > You never try to meta-analyze them with perhaps a state or country moderator?

          I misunderstood you again; for some reason I got it into my head that you were asking about getting a point estimate of a secondary correlation that enters (as a nuisance parameter) into a meta-analysis of some primary quantity.

          Yeah, if I were interested in a population-level correlation in its own right I might of course try meta-analyzing it with moderators like state or country.
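
          Meta-analyzing a correlation with a country moderator might look roughly like this. It is a simplified fixed-effect sketch using Fisher's z transform with subgroup pooling; the study list (country labels, correlations, sample sizes) is invented, and a fuller analysis would more likely use a random-effects meta-regression.

```python
import math
from collections import defaultdict

# Hypothetical studies: (country, correlation r, sample size n).
studies = [
    ("A", 0.30, 120),
    ("A", 0.25, 200),
    ("B", 0.10, 150),
    ("B", 0.05, 300),
]


def pool(study_list):
    """Inverse-variance pooled Fisher z and its total weight.

    Each study's z = atanh(r) has variance 1 / (n - 3), so its weight is n - 3.
    """
    weights = [n - 3 for _, _, n in study_list]
    zs = [math.atanh(r) for _, r, _ in study_list]
    pooled_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return pooled_z, sum(weights)


overall_z, overall_weight = pool(studies)

# Pool separately within each level of the moderator.
by_country = defaultdict(list)
for study in studies:
    by_country[study[0]].append(study)
group_results = {country: pool(group) for country, group in by_country.items()}

# Between-group heterogeneity statistic; under the fixed-effect model it is
# compared against a chi-square distribution with (number of groups - 1) df.
q_between = sum(w * (z - overall_z) ** 2 for z, w in group_results.values())

# Back-transform pooled z values to the correlation scale for reporting.
pooled_r = {country: math.tanh(z) for country, (z, _) in group_results.items()}
overall_r = math.tanh(overall_z)

print(f"overall pooled r: {overall_r:.3f}")
for country, r in sorted(pooled_r.items()):
    print(f"country {country}: pooled r = {r:.3f}")
print(f"Q_between = {q_between:.2f} on {len(group_results) - 1} df")
```

          A large Q_between relative to its degrees of freedom would suggest the correlation differs by country, which is the risk of reusing one population's estimate for another.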