a median SAT score of 1490 (from the LessWrong 2014 survey) corresponds to +2.42 SD, which regresses to +1.93 SD for IQ using an SAT-IQ correlation of +0.80.
I don’t think this is a valid way of doing this, for the same reason it wouldn’t be valid to say
a median height of 178 cm (from the LessWrong 2022 survey) corresponds to +1.85 SD, which regresses to +0.37 SD for IQ using a height-IQ correlation of +0.20.
Those are the real numbers for height, BTW.
These both seem valid to me! Now, if you have multiple predictors (like SAT and height), then things get messy because you have to consider their covariance and stuff.
That reasoning as applied to SAT score would only be valid if LW selected its members based on their SAT score, and that reasoning as applied to height would only be valid if LW selected its members based on height (though it looks like both Thomas Kwa and Yair Halberstadt have already beaten me to it).
Cool, you’ve convinced me, thanks.
Edit: well, sort of. I think it depends on what information you’re allowing yourself to know when building your statistical model. If you’re not letting yourself make guesses about how the LW population was selected, then I still think the SAT thing and the height thing are reasonable. However, if you’re actually trying to figure out an estimate of the right answer, you probably shouldn’t blind yourself quite that much.
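To make the selection point concrete, here is a minimal sketch (not from the thread) of how much the answer depends on the assumed selection mechanism. It treats the reported median as if it were the group mean, assumes SAT and IQ are standardized and bivariate normal with the quoted correlation of 0.80, and compares two idealized extremes. The function names are hypothetical, and neither extreme is claimed to describe how LessWrong readers are actually selected.

```python
RHO = 0.80    # SAT-IQ correlation quoted in the thread
SAT_Z = 2.42  # group's typical SAT score in SD units, quoted in the thread


def iq_if_selected_on_sat(sat_z, rho):
    """Assumption: membership depends on SAT only.

    Then E[IQ | SAT] = rho * SAT applies to the group directly,
    which is the regression step used in the quoted estimate.
    """
    return rho * sat_z


def iq_if_selected_on_iq(sat_z, rho):
    """Assumption: membership depends on IQ only.

    Then E[SAT | IQ] = rho * IQ holds inside the group, so the group's
    mean SAT is rho times its mean IQ, and we divide instead of multiply.
    """
    return sat_z / rho


for label, z in [
    ("selected on SAT", iq_if_selected_on_sat(SAT_Z, RHO)),
    ("selected on IQ", iq_if_selected_on_iq(SAT_Z, RHO)),
]:
    print(f"{label}: {z:+.2f} SD, IQ about {100 + 15 * z:.0f}")
```

The same observed SAT score is compatible with quite different IQ estimates under these two assumptions, which is why a guess about the selection mechanism matters.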
Eric Neyman is right. They are both valid!
In general, if we have two vectors X and Y which are jointly normally distributed, we can write the joint mean μ and the joint covariance matrix Σ as
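$$\mu = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} K_{XX} & K_{XY} \\ K_{YX} & K_{YY} \end{bmatrix}$$

The conditional distribution for Y given X is given by $Y \mid X \sim N(\mu_{Y|X}, K_{Y|X})$, defined by conditional mean

$$\mu_{Y|X} = \mu_Y + K_{YX} K_{XX}^{-1} (X - \mu_X)$$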
and conditional variance
$$K_{Y|X} = K_{YY} - K_{YX} K_{XX}^{-1} K_{XY}$$
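Our conditional distribution for the IQ of the median rationalist, given their SAT score, is

$$N\big(0 + 0.8 \cdot 1 \cdot (2.42 - 0),\; 1 - 0.8 \cdot 1 \cdot 0.8\big) = N(1.94,\, 0.36)$$

(That’s a mean of 129 and a standard deviation of 9 IQ points.)

Our conditional distribution for the IQ of the median rationalist, given their height, is

$$N\big(0 + 0.2 \cdot 1 \cdot (1.85 - 0),\; 1 - 0.2 \cdot 1 \cdot 0.2\big) = N(0.37,\, 0.96)$$

(That’s a mean of 106 and a standard deviation of 14.7 IQ points.)

Our conditional distribution for the IQ of the median rationalist, given their SAT score and height, is

$$N\left(0 + \begin{bmatrix} 0.8 & 0.2 \end{bmatrix} \begin{bmatrix} 1 & 0.16 \\ 0.16 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2.42 \\ 1.85 \end{bmatrix},\; 1 - \begin{bmatrix} 0.8 & 0.2 \end{bmatrix} \begin{bmatrix} 1 & 0.16 \\ 0.16 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0.8 \\ 0.2 \end{bmatrix}\right) = N(2.04,\, 0.35)$$

(That’s a mean of 131 and a standard deviation of 8.9 IQ points.)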
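For anyone who wants to check the arithmetic, the three calculations can be reproduced with a short standard-library sketch. The helper names (`solve`, `condition_on`, `to_iq_scale`) are made up for illustration. The correlations 0.8, 0.2 and 0.16 and the z-scores 2.42 and 1.85 are the ones quoted in this thread, and everything is assumed to be standardized (mean 0, variance 1) and jointly normal.

```python
import math


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def condition_on(k_xx, k_yx, x, mu_x=None, mu_y=0.0, k_yy=1.0):
    """Return (mean, variance) of a scalar Y given the observed vector X = x."""
    if mu_x is None:
        mu_x = [0.0] * len(x)
    # w = K_XX^{-1} K_XY; since K_XX is symmetric, w . v equals K_YX K_XX^{-1} v.
    w = solve(k_xx, k_yx)
    mean = mu_y + sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mu_x))
    var = k_yy - sum(wi * ki for wi, ki in zip(w, k_yx))
    return mean, var


def to_iq_scale(mean, var):
    """Convert SD units to IQ points (mean 100, SD 15)."""
    return 100 + 15 * mean, 15 * math.sqrt(var)


sat_only = condition_on([[1.0]], [0.8], [2.42])
height_only = condition_on([[1.0]], [0.2], [1.85])
both = condition_on([[1.0, 0.16], [0.16, 1.0]], [0.8, 0.2], [2.42, 1.85])

for label, (mean, var) in [("SAT", sat_only), ("height", height_only), ("SAT + height", both)]:
    iq_mean, iq_sd = to_iq_scale(mean, var)
    print(f"{label}: N({mean:.2f}, {var:.2f}) -> IQ {iq_mean:.0f} +/- {iq_sd:.1f}")
```

Using a linear solve instead of forming the inverse explicitly is just a convenience here; for a 2×2 matrix either approach gives the same numbers.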
Unfortunately, since men are taller than women, and rationalists are mostly male, we can’t use the height as-is when estimating the IQ of the median rationalist (maybe normalizing height within each sex would work?).