Do you have any materials on epidemiological meta-analyses? [...] I still haven’t found any good resources on how to handle the problems in epidemiology or population-level correlations.
Not to hand. But (as you’ve found) I doubt they’d tell you what you want to know, anyway. The problems aren’t special epidemiological phenomena but generic problems of causal inference. They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors and (2) controlled experiments are harder to do in epidemiology.
If I were in your situation, I’d probably try running a sensitivity analysis. Specifically, I’d think of plausible ways confounding could have occurred, guesstimate a probability distribution for each possible form of confounding, then do Monte Carlo simulations using those probability distributions to estimate the probability distribution of the systematic error from confounding. This isn’t usually that satisfactory, since it’s a lot of work and the result often depends on arsepulls.
But it’s hard to do better. There are philosophers of causality out there (like this guy) who work on rigorous methods for inferring causes from observational data, but as far as I know those methods require pretty strong & fiddly assumptions. (IlyaShpitser can probably go into more detail about these methods.) They also can’t do things like magically turn a population-level correlation into an individual-level correlation, so I’d guess you’re SOL there.
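To make the Monte Carlo sensitivity analysis described above concrete, here is a minimal sketch for one form of confounding: a single unmeasured binary confounder acting on an observed risk ratio. It uses the standard bias-factor formula for that case. Every number in it (the observed risk ratio of 1.5, the lognormal and beta priors) is a made-up placeholder standing in for the guesstimates, not an estimate from any study, and the function names are hypothetical.

```python
import math
import random
import statistics

def bias_factor(rr_ud, p1, p0):
    """Bias in an observed risk ratio from a binary unmeasured confounder U.

    rr_ud: risk ratio linking U to the outcome
    p1, p0: prevalence of U among the exposed and the unexposed
    """
    return (rr_ud * p1 + 1 - p1) / (rr_ud * p0 + 1 - p0)

def simulate_adjusted_rr(rr_obs, n_draws=10_000, seed=0):
    """Draw confounding scenarios from guesstimated priors and correct rr_obs."""
    rng = random.Random(seed)
    adjusted = []
    for _ in range(n_draws):
        # All three priors below are assumptions chosen for illustration.
        rr_ud = math.exp(rng.gauss(math.log(2.0), 0.3))  # lognormal, median 2
        p1 = rng.betavariate(6, 4)  # prevalence among exposed, mean 0.6
        p0 = rng.betavariate(4, 6)  # prevalence among unexposed, mean 0.4
        adjusted.append(rr_obs / bias_factor(rr_ud, p1, p0))
    return adjusted

draws = simulate_adjusted_rr(rr_obs=1.5)  # 1.5 is a hypothetical observed RR
cuts = statistics.quantiles(draws, n=40)  # cut points every 2.5%
median = statistics.median(draws)
lo, hi = cuts[0], cuts[-1]                # 2.5th and 97.5th percentiles
print(f"bias-adjusted RR: median {median:.2f}, 95% interval {lo:.2f} to {hi:.2f}")
```

The output is only as good as the priors, which is exactly the weakness mentioned above: change the guesstimated distributions and the interval moves with them.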
> But (as you’ve found) I doubt they’d tell you what you want to know, anyway. The problems aren’t special epidemiological phenomena but generic problems of causal inference. They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors
I’ve found that there’s always a lot of field-specific tricks; it’s one of those things I really was hoping to find.
> This isn’t usually that satisfactory, since it’s a lot of work and the result often depends on arsepulls.
Yeah, that’s not worth bothering with.
> (2) controlled experiments are harder to do in epidemiology.
The really frustrating thing about the lithium-in-drinking-water correlation is that it would be very easy to do a controlled experiment. Dump some lithium into some randomly chosen county’s water treatment plants to bring it up to the high end of ‘safe’ natural variation, come back a year later and ask the government for suicide & crime rates, see if they fell; repeat n times; and you’re done.
> They also can’t do things like magically turn a population-level correlation into an individual-level correlation, so I’d guess you’re SOL there.
I’m interested for generic utilitarian reasons, so I’d be fine with a population-level correlation.
> They just bite harder in epidemiology because (1) background theory isn’t as good at pinpointing relevant causal factors
> I’ve found that there’s always a lot of field-specific tricks; it’s one of those things I really was hoping to find.
Hmm. Based on the epidemiology papers I’ve skimmed through over the years, there don’t seem to be any killer tricks. The usual procedure for non-experimental papers seems to be to pick a few variables out of thin air that sound like they might be confounders, measure them, and then toss them into a regression alongside the variables one actually cares about. (Sometimes matching is used instead of regression but the idea is similar.)
Still, it’s quite possible I’m only drawing a blank because I’m not an epidemiologist and I haven’t picked up enough tacit knowledge of useful analysis tricks. Flicking through papers doesn’t actually make me an expert.
> The really frustrating thing about the lithium-in-drinking-water correlation is that it would be very easy to do a controlled experiment.
True. Even though doing experiments is harder in general in epidemiology, that’s a poor excuse for not doing the easy experiments.
> I’m interested for generic utilitarian reasons, so I’d be fine with a population-level correlation.
Ah, I see. I misunderstood your earlier comment as being a complaint about population-level correlations.
I’m not sure which variables you’re looking for (population-level) correlations among, but my usual procedure for finding correlations is mashing keywords into Google Scholar until I find papers with estimates of the correlations I want. (For this comment, I searched for “smoking IQ conscientiousness correlation” without the quotes, to give an example.) Then I just reuse those numbers for whatever analysis I’d like to do.
This is risky because two variables can correlate differently in different populations. To reduce that risk I try to use the estimate from the population most similar to the population I have in mind, or I try estimating the correlation myself in a public use dataset that happens to include both variables and the population I want.
> (For this comment, I searched for “smoking IQ conscientiousness correlation” without the quotes, to give an example.) Then I just reuse those numbers for whatever analysis I’d like to do. This is risky because two variables can correlate differently in different populations. To reduce that risk I try to use the estimate from the population most similar to the population I have in mind, or I try estimating the correlation myself in a public use dataset that happens to include both variables and the population I want.
You never try to meta-analyze them with perhaps a state or country moderator?
> You never try to meta-analyze them with perhaps a state or country moderator?
I misunderstood you again; for some reason I got it into my head that you were asking about getting a point estimate of a secondary correlation that enters (as a nuisance parameter) into a meta-analysis of some primary quantity.
Yeah, if I were interested in a population-level correlation in its own right I might of course try meta-analyzing it with moderators like state or country.
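As a rough illustration of what meta-analyzing a correlation with a country moderator could look like in its simplest form, here is a fixed-effect subgroup sketch: Fisher z-transform each correlation, weight by n − 3 (the approximate inverse variance of z), and pool separately within each level of the moderator. The four studies are invented placeholders, the function name is hypothetical, and a real analysis would more likely use a random-effects model from a dedicated package.

```python
import math
from collections import defaultdict

# Made-up study results for illustration: (correlation r, sample size n, country)
studies = [
    (0.30, 120, "A"),
    (0.25, 200, "A"),
    (0.10, 150, "B"),
    (0.05, 300, "B"),
]

def pooled_by_moderator(studies):
    """Fixed-effect pooled correlation and 95% interval per moderator level."""
    sums = defaultdict(lambda: [0.0, 0.0])  # level -> [sum of w*z, sum of w]
    for r, n, level in studies:
        z = math.atanh(r)  # Fisher z-transform of the correlation
        w = n - 3          # inverse variance weight, since var(z) is about 1/(n-3)
        sums[level][0] += w * z
        sums[level][1] += w
    pooled = {}
    for level, (wz, w) in sums.items():
        z_bar = wz / w
        se = math.sqrt(1 / w)
        # Back-transform the estimate and interval limits to the r scale.
        pooled[level] = (
            math.tanh(z_bar),
            math.tanh(z_bar - 1.96 * se),
            math.tanh(z_bar + 1.96 * se),
        )
    return pooled

for level, (r_hat, lo, hi) in sorted(pooled_by_moderator(studies).items()):
    print(f"country {level}: pooled r = {r_hat:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

If the pooled estimates differ a lot between levels, that is a sign the correlation doesn’t transfer cleanly from one population to another, which is the risk mentioned earlier in the thread.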