When the sample size is very large, getting whatever result you want requires using a lot of variables and/or contrived variables. The variables used don’t seem particularly numerous or contrived.
The results of Dale and Krueger are consistent with attending a more selective college having a strong positive effect on earnings, and also consistent with attending a more selective college having a strong negative effect on earnings. But if it were true that attending a more selective college had a strong positive effect on earnings, would you expect there to be zero correlation after controlling for the variables that they do?
It would be better to take a randomly selected population of sufficiently large size to get statistical power and examine the causal pathways that led them to their current income level as opposed to a higher level or a lower level. I’m very interested in getting this sort of data, but it would be hard to get it at a sufficiently fine level of granularity (e.g. it’s not possible to get records of how all of the hiring decisions were made) and I don’t know of any such data sets that have even coarse information on the subject.
I am not sure if I am following you, but the issue isn’t sample size; the issue is whether your answer is biased or not. If your estimator is severely biased, it doesn’t matter what the sample size is. That is, if the effect is 0 but your bias is −3, then you estimate −3 poorly with 100 samples and better with 100,000 samples, but you are still estimating a number that is not 0.
You seem to think that if the effect is 0, then getting it to look like it is not 0 requires very exotic scenarios or sets of variables, but that’s not true at all. It is very easy to get bias. So easy that people even have stock examples of it happening: “When I regressed physical fitness on possession of an Olympic gold medal, I got a strong effect, even after adjusting for socioeconomic background, age, gender, and height. Off to the Olympic medal replica store!”
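To make that concrete, here is a minimal sketch (my own toy numbers, nothing from the study) of a regression where the true effect is zero but an omitted confounder produces a large estimate that does not go away as the sample grows:

```python
# Toy illustration only: the true effect of x on y is 0, but an unobserved
# confounder u drives both, so the naive regression converges to about -3
# (the bias) no matter how large the sample gets.
import numpy as np

rng = np.random.default_rng(0)

def naive_slope(n):
    u = rng.normal(size=n)                    # unobserved confounder
    x = u + rng.normal(size=n)                # "treatment", partly driven by u
    y = 0 * x - 6 * u + rng.normal(size=n)    # true effect of x is 0
    return np.polyfit(x, y, 1)[0]             # slope of y on x, omitting u

for n in (100, 10_000, 1_000_000):
    print(n, round(naive_slope(n), 2))
# More data just pins down the wrong number (about -3); it never recovers 0.
```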
“Causal pathways” are a hard problem. You have to be very careful, especially if your data tracks people over time. The keywords here are “mediation analysis.”
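As a hedged illustration of why that care is needed (again a toy example of my own, not anything from Dale and Krueger), adjusting for a mediator can itself introduce bias when something unmeasured affects both the mediator and the outcome:

```python
# Toy example: all of x's effect on y flows through the mediator m, and an
# unmeasured u affects both m and y. Adjusting for m then makes the "direct
# effect" of x look negative even though the true direct effect is 0.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

x = rng.normal(size=n)
u = rng.normal(size=n)                    # unmeasured mediator-outcome confounder
m = 0.8 * x + u + rng.normal(size=n)      # mediator
y = 0.5 * m + u + rng.normal(size=n)      # x affects y only through m

def ols(target, *covariates):
    # least-squares coefficients for target on the covariates plus a constant
    design = np.column_stack(covariates + (np.ones(n),))
    return np.linalg.lstsq(design, target, rcond=None)[0]

print("total effect of x:       ", round(ols(y, x)[0], 2))     # ~ 0.4, the truth
print("'direct effect' given m: ", round(ols(y, x, m)[0], 2))  # ~ -0.4, should be 0
```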
Pearl said that he does not deal with statistical issues, only identification. So perhaps he would not be the best person to judge a study (because statistical analysis itself is a difficult art, even if we get the causal part right). Perhaps folks at the Harvard (or Hopkins, or Berkeley, or North Carolina? or Penn?) causal groups would be better.
Thanks. I don’t have technical knowledge of statistics, and may be off base. I’ll have to think about this more.
Do you disagree with my bottom line that the Dale-Krueger study points (to a nontrivial degree) in the direction of having a prior that there’s no effect?
I think the LessWrong census data is pretty useful for understanding what regression does.
It turns out that US LessWrong readers are on average smarter than non-US LessWrong readers.
If you try to find out whether a belief correlates with IQ and simply run a regression, you sometimes get very different results depending on whether or not you control for whether the person comes from the US.
Apart from all the theoretical arguments, regression studies frequently simply don’t replicate. Even if you don’t see a problem with a study, if it doesn’t replicate it’s worthless.
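A toy sketch of that “control for the US” point (hypothetical numbers, not the actual census data): when both IQ and the belief differ by country, the pooled regression shows an association that disappears once country is controlled for.

```python
# Hypothetical numbers, not the LessWrong census: belief depends only on
# country, and IQ also differs by country, so the pooled regression finds an
# IQ-belief association that vanishes once country is controlled for.
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

us = rng.integers(0, 2, size=n)                 # 1 = US reader, 0 = elsewhere
iq = 100 + 5 * us + 10 * rng.normal(size=n)     # US readers average higher IQ here
belief = 0.3 * us + rng.normal(size=n)          # belief depends only on country

def slope_on_first(y, *covariates):
    # OLS coefficient on the first covariate, with a constant term included
    design = np.column_stack(covariates + (np.ones(n),))
    return np.linalg.lstsq(design, y, rcond=None)[0][0]

print("belief ~ iq:      ", round(slope_on_first(belief, iq), 4))      # clearly nonzero
print("belief ~ iq + us: ", round(slope_on_first(belief, iq, us), 4))  # roughly 0
```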
If you try to find out whether a belief correlates with IQ and simply run a regression, you sometimes get very different results depending on whether or not you control for whether the person comes from the US.
Correlations of very different sizes? Or just differing signs? I would not be surprised by the latter. The former would surprise me, if it applied to a randomly selected belief.
Apart from all the theoretical arguments, regression studies frequently simply don’t replicate.
The Dale-Krueger study does replicate in a sense: they got the same results for the 1989 cohort as they did for the 1976 cohort.
The US LW readers are on average smarter, older, and have higher incomes. If I remember right, they also vote more often. But it’s been a while since I played with the data, so I don’t want to say something wrong by being too detailed in my claims.
The Dale-Krueger study does replicate in a sense: they got the same results for the 1989 cohort as they did for the 1976 cohort.
For what value of “same”? Did they first analyse the 1976 cohort, publish, and then years later analyse the 1989 cohort and come to the same conclusions, or did they just throw all the data together?
Yes