In theory, you can use measured correlation to rule out models that predict the measured correlation to be some other number. In practice this is not very useful because the space of all possible models is enormous. So what happens in practice is that we make some enormously strong assumptions that restrict the space of possible models to something manageable.
Such assumptions may include: that measured IQ scores consist of some genetic base plus some noise from other factors including environmental factors and measurement error. We might further assume that the inherited base is linear in contributions from genetic factors with unknown weights, and the noise is independent and normally distributed with zero mean and unknown variance parameter. I’ve emphasized some of the words indicating stronger assumptions.
You might think that these assumptions are wildly restrictive and unlikely to be true, and you would be correct. Simplified models are almost never true, but they may be useful nonetheless because we have bounded rationality. So there is now a hypothesis A: “The model is adequate for predicting reality”.
Now that you have a model with various parameters, you can use Bayesian updates to refine the distributions for those parameters (that is, the hypotheses “A and (specific parameter values)”) and also to weigh various alternative “assumption failure” hypotheses. In the given example, we would very quickly find overwhelming evidence for “the noise is not independent”, and consequently employ our limited capacity for evaluation on a different class of (probably more complex) models.
This hasn’t actually answered your original question “what does that tell me about the IQ of her twin sister Beth?”, because in the absence of a model it tells you essentially nothing. There exist joint distributions of twin IQ (I_1, I_2) that have a correlation coefficient of 0.86 and yield any distribution you like for I_1 given I_2 = 120. We can rule most of them out on more or less vague grounds of being “biologically implausible”, but not purely from a mathematical perspective.
But let’s continue anyway.
First, we need to know more about the circumstances in which we arrived at this situation, where we knew Alice’s IQ and not Beth’s. Is this event likely to have been dependent in any significant way upon their IQs, or the ordering thereof? Let’s assume not, because that’s simpler. E.g. we just happened to pick some twin pair out of the world and found out one of their IQs at random but not yet the other.
Then maybe we could use a model like the one I introduced, where the IQs I_1 and I_2 of twins are of the form
I_k = S + e_k,
where S is some shared “predisposition” which is normally distributed, and the noise terms e_k are independent and normally distributed with zero mean and common variance. Common genetics and (usually) common environment would influence S, while individual variations and measurement errors would be considered in the e_k.
Now, this model is almost certainly wrong in important ways. In particular the assumption of independent additivity doesn’t have any experimental evidence for it, and there doesn’t seem to be any reason to expect it to hold (especially for a curve-fitted statistic like IQ). Nonetheless, it’s worth investigating one of the simplest models.
There is some evidence that the distribution of IQ for twins is slightly different from that for the general population, but probably by less than 1 IQ point, so it’s fairly safe to assume that both I_1 and I_2 have mean close to 100 and standard deviation close to 15. In this simple model, the correlation coefficient of the population is just var(S) / 15^2, and so if the study was conducted well enough to accurately measure the population correlation coefficient, then we should conclude that the standard deviations are near 13.9 for S and 5.6 for e_k.
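As a quick check of that arithmetic, here is a minimal sketch of the variance split under I_k = S + e_k. The function name `split_variance` is just an illustrative label, and the inputs (standard deviation 15, correlation 0.86) are the values assumed in the text, not new data.

```python
import math

# Values assumed in the text: population SD of 15 and twin correlation 0.86.
SD_IQ = 15.0
R_TWIN = 0.86

def split_variance(sd_total, r):
    """Split total variance into shared (S) and individual (e_k) parts,
    assuming I_k = S + e_k with S and e_k independent."""
    var_total = sd_total ** 2
    var_shared = r * var_total          # r = var(S) / var(I)
    var_noise = var_total - var_shared  # var(I) = var(S) + var(e_k)
    return math.sqrt(var_shared), math.sqrt(var_noise)

sd_s, sd_e = split_variance(SD_IQ, R_TWIN)
print(f"sd(S) = {sd_s:.1f}, sd(e_k) = {sd_e:.1f}")
```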
Now we can look at the distribution of (unknown) S and e_1 that could result in I_1 = 120. Each of these is normally distributed, and so the conditional distribution of each component of the sum is also normal, with E[S | I_1 = 120] = 100 + 20 * var(S) / 15^2 and E[e_1 | I_1 = 120] = 20 * var(e_1) / 15^2.
So in this case, the conditional distribution for S will be centered on 117.2. This differs from the mean by a factor of 0.86 of the difference between I_1 and the mean, which is just the correlation coefficient r. The conditional variance for S is (1-r) times the unconditional variance, so the conditional standard deviation is √(1-r) times the unconditional standard deviation, or about 5.2.
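The conditioning step can be sketched with the standard formulas for jointly normal variables. The function name below is hypothetical, and the model assumptions (independent, additive, normal components) are the ones stated above, not something the code verifies. Note that 5.2 is a standard deviation: the conditional variance is var(S) * (1 - r).

```python
import math

def condition_shared_on_observed(mean, sd_total, r, observed):
    """Return the mean and standard deviation of S given I_1 = observed,
    assuming I_1 = S + e_1 with S and e_1 independent and normal."""
    var_total = sd_total ** 2
    var_s = r * var_total
    # cov(S, I_1) = var(S), so the regression slope is var(S) / var(I_1) = r.
    cond_mean = mean + (observed - mean) * var_s / var_total
    # The conditional variance shrinks by the factor (1 - r).
    cond_var = var_s * (1.0 - var_s / var_total)
    return cond_mean, math.sqrt(cond_var)

cond_mean, cond_sd = condition_shared_on_observed(100.0, 15.0, 0.86, 120.0)
print(f"S | I_1 = 120: mean {cond_mean:.1f}, sd {cond_sd:.1f}")
```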
Now you have enough information to calculate a conditional distribution for Beth. Her IQ would (under this model) be normally distributed with mean ≅ 117.2 and standard deviation 15 √(1 - r^2) ≅ 7.7.
Therefore, to the extent that you have credence in this model and the studies estimating those correlations, you could expect about a 70% chance for her IQ to be in the range 110 to 125.
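For anyone who wants to reproduce the range, here is a rough sketch using `statistics.NormalDist` from the Python standard library (3.8+). The helper name `sibling_distribution` is made up for illustration, and the result is only as good as the simple additive model and the assumed values of 100, 15 and 0.86.

```python
import math
from statistics import NormalDist

def sibling_distribution(mean, sd, r, observed):
    """Distribution of I_2 given I_1 = observed for a bivariate normal
    with equal means, equal SDs and correlation r (assumed model)."""
    cond_mean = mean + r * (observed - mean)
    cond_sd = sd * math.sqrt(1.0 - r ** 2)
    return NormalDist(cond_mean, cond_sd)

beth = sibling_distribution(100.0, 15.0, 0.86, 120.0)

# Probability mass inside the range quoted in the text.
p_range = beth.cdf(125) - beth.cdf(110)

# Central 70% interval, for comparison with that range.
low, high = beth.inv_cdf(0.15), beth.inv_cdf(0.85)

print(f"mean {beth.mean:.1f}, sd {beth.stdev:.2f}")
print(f"P(110 < IQ < 125) = {p_range:.2f}")
print(f"central 70% interval: {low:.1f} to {high:.1f}")
```

The same helper could be reused for other relatives by passing a different correlation, provided the same model is assumed to hold for them.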
Similar calculations for Carl lead to a lower and wider distribution with a 70% range more like 96 to 123.
The corresponding range for cousin Dominic’s distribution would be 88 to 118, almost the same as you might expect for a completely random person (85 to 115).