Scores typically have low predictive power. The R² of scores for EA is IIRC about 10%. By definition, they will never go above the heritability of EA, which is only about 40%. Unless you’re prepared to pump out tons of eggs, test them all, and pick the best, you are probably not going to do much to change the phenotype. Again, with larger sample sizes this will eventually change.
The power of a PGS is more strongly related to the R than to the R². So a PGS with an R² of 10% corresponds to an R of sqrt(10%) ≈ 0.32, which is sqrt(10%)/sqrt(40%) = half of the total selective power you could get from a PGS. (Well, except for the aspect where the EA PGS is not super causal. Though that is relatively limited to EA and doesn’t hold for e.g. IQ PGS.)
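The arithmetic in that reply can be checked with a few lines of Python. This is only a sketch: the function name `relative_selective_power` is made up for illustration, and the 10% and 40% inputs are simply the figures quoted in the thread.

```python
import math


def relative_selective_power(r2_pgs: float, h2: float) -> float:
    """Ratio of the PGS's correlation with the trait to the ceiling sqrt(h2).

    Hypothetical helper: it just encodes sqrt(R^2) / sqrt(h^2) from the comment.
    """
    return math.sqrt(r2_pgs) / math.sqrt(h2)


r2_pgs = 0.10  # R^2 of the EA PGS, as quoted in the thread
h2 = 0.40      # heritability of EA, as quoted in the thread

# Correlation between the PGS and the trait.
print(round(math.sqrt(r2_pgs), 2))

# On the R scale, the PGS already reaches about half of the ceiling.
print(round(relative_selective_power(r2_pgs, h2), 2))

# On the R^2 scale, the same PGS looks like only a quarter of the ceiling.
print(round(r2_pgs / h2, 2))
```

The last two lines are the point of the reply: the same score looks half as good as the ceiling in R terms but only a quarter as good in R² terms.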
Polygenic scores are, well, polygenic. They sum up many different genes. We don’t know what else they correlate with. You are in effect giving your child a pill with unknown side effects. Maybe in future that won’t be true. I don’t know how long that will take, but I suspect it will be long because there is a lot we don’t know.
This can be solved by looking at genetic correlations, which are frequently computed for polygenic scores.
I’m not sure what you mean by selective power. I suppose the natural question is “how many extra (e.g.) IQ points do I get for an extra standard deviation of a PGS?” In other words, you want the regression coefficient, where the dependent variable is on some meaningful scale. I stand by my comment, unless you can show a PGS where a 1 s.d. change currently does something big.
Genetic correlations: maybe, but we haven’t looked at genetic correlations for many things, and indeed we don’t have other polygenic scores to correlate them with for many things, and indeed we haven’t collected questions on big enough samples to create those polygenic scores for many things, so again, we aren’t there yet.
I stand by my comment, unless you can show a PGS where a 1 s.d. change currently does something big.
A 1SD change on a latent variable can have a big absolute risk effect for liability-threshold traits like schizophrenia depending on the pre-existing absolute risk / where one is on the latent spectrum.
(This is the nonlinearity of normal distributions and thin tails again—if the risk is ~0 SD, perhaps because there are 2 schizophrenic parents carrying a very high risk burden, then shifting a fraction of a SD drops the absolute risk down from 40% to 11% (pnorm(qnorm(0.4) - 1))*, because the density drops fast in the middle of the bell curve; if the risk is <-2SD because there is no family history of schizophrenia, then the exact same latent shift in SDs will drop the absolute risk from something like 0.8% to 0.5%, because you are already out in the thin tails where you can only go from ‘rare’ to ‘rare’. Same PGS+embryo-count, same latent shift, very different practical implications. This is also true of common dichotomous traits where everyone is at high risk, not merely specific families: for example, diabetes or heart disease. Since a quarter to a half of the population will get these, the latent risk is very high, and 1SD shift on it will have large practical consequences. Going from 30% risk of diabetes to 6% risk would make a big difference healthwise.)
* for selecting using solely the current 2020 SCZ PGS of 7%, out of 5 embryos you’d get ‘only’ −0.21 SDs at most, so for the double-SCZ-parent case, that’d drop from 40% to 32%. Nevertheless, considering how devastating schizophrenia can be, to themselves and everyone around them, I’d say that an absolute risk change of 8% is extremely valuable and ‘does something big’.
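The liability-threshold numbers above can be reproduced in Python, using `statistics.NormalDist` as a stand-in for R's `pnorm` and `qnorm`. This is a sketch: the helper name `shifted_risk` is hypothetical, and pairing the 0.8% baseline with the 0.21 SD shift is an assumption about which shift the comment has in mind.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def shifted_risk(baseline_risk: float, shift_sd: float) -> float:
    """Absolute risk after lowering latent liability by `shift_sd` SDs.

    Python equivalent of R's pnorm(qnorm(baseline_risk) - shift_sd).
    """
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(baseline_risk) - shift_sd)


# (baseline absolute risk, latent shift in SDs)
cases = [
    (0.40, 1.0),    # two affected parents, full 1 SD shift
    (0.40, 0.21),   # same family, shift quoted for 5 embryos and a 7% PGS
    (0.30, 1.0),    # common disease such as diabetes, 1 SD shift
    (0.008, 0.21),  # no family history (assumed pairing, see note above)
]

for baseline, shift in cases:
    new_risk = shifted_risk(baseline, shift)
    print(f"{baseline:.1%} baseline, -{shift} SD -> {new_risk:.1%}")
```

Running it shows the same latent shift producing very different absolute changes depending on where the baseline sits on the curve, which is the comment's argument.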
Genetic correlations: maybe, but we haven’t looked at genetic correlations for many things
Yes we have. We have genetic correlations for literally thousands of human traits. The UKBB alone lets you compute pairwise correlations over like 4k traits. And this also ignores that we especially have composite/index traits like longevity, SES, mental illness diagnoses, or self-reported health. It requires tremendous gymnastics to claim there is some hidden explosive correlation which somehow doesn’t show up in those global traits; obviously, even if selecting for IQ selected for some deadly disease that has entirely escaped measurement, that must be vastly outweighed by all the other deadly diseases it selects against, otherwise the net positive (genetic & phenotypic) correlation with all-cause mortality/longevity would not exist. It’d be nil, or the other direction.
(The genetic correlation argument is one of the first counterobjections everyone comes up with, but it requires an almost total ignorance of the genetic correlation literature to sustain. Which is why critics like Turley have resorted to either focusing solely on EDU because it has some negative correlates & high IGE they can hammer on while counting on a receptive audience which doesn’t know that EDU is extremely unrepresentative and a bait-and-switch for IQ; moving the goalposts about ‘efficacy’ even further; or just abandoning all the original counterarguments entirely and talking about “but we haven’t clinically validated embryo selection and it would take decades to do so and the PGSes might change”, which is both false (countless sibling comparisons prove they work, PGSes don’t change much over time, not for what people would select on) and a nifty catch-22 - you can’t ‘validate’ them if it’s been banned because they haven’t been validated...)
I’m not sure what you mean by selective power. I suppose the natural question is “how many extra (e.g.) IQ points do I get for an extra standard deviation of a PGS?” In other words, you want the regression coefficient, where the dependent variable is on some meaningful scale. I stand by my comment, unless you can show a PGS where a 1 s.d. change currently does something big.
The R value is equivalent to the standardized version of the regression coefficient (modulo some statistical details that don’t make a difference here). Therefore it will be linearly related to the regression coefficient, in whichever scale you choose. Meanwhile, the R² will be nonlinearly related to the regression coefficient, due to being a nonlinear function of R. See also Marco Del Giudice’s paper on the same topic: “Are we comparing apples or apples squared? The proportion of explained variance exaggerates differences between effects”.
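A small sketch of that relationship, assuming a simple one-predictor regression where the raw slope is b = R × SD(y) / SD(x). The helper `raw_slope` and the outcome SD of 15 (an IQ-style scale) are illustrative assumptions, not figures from the thread.

```python
import math


def raw_slope(r: float, sd_y: float, sd_x: float = 1.0) -> float:
    """Slope of y on x in a one-predictor regression: b = r * sd_y / sd_x."""
    return r * sd_y / sd_x


SD_Y = 15.0  # hypothetical outcome SD, chosen only for illustration

# Each step quadruples R^2, but the gain per 1 SD of PGS only doubles,
# because the slope is proportional to R, not to R^2.
for r2 in (0.025, 0.10, 0.40):
    r = math.sqrt(r2)
    gain = raw_slope(r, SD_Y)
    print(f"R2={r2:.3f}  R={r:.2f}  gain per 1 SD of PGS={gain:.1f}")
```

Whatever units the outcome is measured in, rescaling only multiplies the slope by a constant, so the slope stays linear in R and comparisons made in R² overstate the differences.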
Sure. But the most interesting dependent variable isn’t usually “how many standard deviations of Y will I gain”, it’s e.g. “how many years of education will I gain”. In any case, on either scale, is there a PGS where a 1 s.d. change does something big? You might say the most recent EA is a candidate. In one dataset a 1 s.d. increase causes (i.e. within-siblings) about a 4.5 percentage point increase in the probability of university attendance.
I agree that SD units are strictly speaking meaningless and something like this is relevant. However, I’m just saying that R² does not help over R with this, and in fact makes it worse, because R² is nonlinearly related to the meaningful quantities while R is linearly related to them.
I do not know how EA PGS relates to meaningful quantities, and to be honest I would not recommend selecting for EA PGS because (to paraphrase one of gwern’s articles) EA measures an input rather than an output (unlike intelligence PGS), and so it is more likely to contain bad stuff too. (IIRC EA PGS contributes to a bunch of mental illnesses, whereas intelligence PGS only contributes to autism and anorexia. And realistically GD too but I haven’t seen explicit data on it yet.)
So what is the “best” way to validate them, in your opinion? Is there anything better than sibling comparisons?
The only thing more valid than sibling comparisons is actually doing it. Actually doing it should add only an iota to your confidence in it being valid, because all it is is what siblings already are.