I would like to know whether I should delay having children to take advantage of polygenic screening technology. I imagine this could be valuable for other aspiring parents to know as well, and having the answer written up would (probably) be a positive externality. It might also be useful info for grant-makers in the space.
Related: Welcome Polygenically Screened Babies
EtA: If you’re interested in helping with related research in any way or just interested in being kept up-to-date with related research (at a low frequency / high-value), you can PM me.
TL;DR: If you’re female you should consider freezing your eggs, and if you’re male with a female partner you should consider talking to them about freezing their eggs. You should probably do this regardless of whether you want to wait for the technology to improve. The process will cost about $40k-50k for the first kid at today’s prices, and probably $10k/kid after that. The benefit will be at least a year or so of increased life expectancy per kid, a decrease in the risk of heart disease, diabetes, and various cancers on the order of 10%-40%, and possibly an IQ increase of somewhere between 0 and 10 points even if you don’t directly select for it (due to positive pleiotropy).
Here are some more details:
A BASIC PRIMER
So right now we have a bunch of Genome-Wide Association Studies (GWAS) that look at single letters in the genome and how strongly changes in those letters are associated with some trait of interest. These GWAS can usually explain 10-15% of the variance in a given trait, with some notable exceptions such as height, where we can explain >40% of the variance.
I think the two potential benefits of waiting to have kids would be seeing an improvement in the percentage of variance explainable by polygenic scores and having a broader set of traits from which to choose.
WHAT IS AVAILABLE NOW
The only company I know of that actually offers polygenic screening to the general public is Genomic Prediction. Their trait panel is entirely focused on common diseases like heart disease, cancer, diabetes and a couple of others. Let me first give a summary of the cost-effectiveness of this type of “disease reduction” screening.
The implied “variance explained” by the reductions shown in their genomic index is actually quite impressive for some of these diseases. Let’s use their original preprint from here: https://www.mdpi.com/2073-4425/11/6/648/htm
I used Carmi et al’s code from “Utility of polygenic embryo screening for disease depends on the selection strategy” to estimate the implied variance explained given those reductions and came up with predictors able to explain about 40-50% of variance for Type 2 Diabetes, Heart Attack and Coronary Artery Disease, and slightly lower for Hypertension and the others.
Those are very impressive numbers. Most stand-alone predictors explain less than 15% of variance. This implies that either Genomic Prediction’s numbers are wrong, or there’s something really amazing going on in genomic indexing: somehow selecting against multiple diseases is straight up better than selecting for a single disease, even if you only care about a single disease.
Part of this might just be a result of sample size: when your coronary artery disease predictor is trained on one population and your hypertension predictor is trained on another, there’s probably some kind of pooling effect going on. But given that most of the data for these predictors seems to come from UK Biobank, there’s also a more profound implication to the reductions shown in their panel: it seems likely that most of these clinically distinct diseases are all manifestations of some underlying “health factor”, and that health factor has a strong genetic basis. Some genetic variants increase your risk of many, many diseases. If that were not true, you would not see simultaneous reductions of this size across so many diseases. And my bet is there are reductions in diseases not even shown on the panel. What a crazy thing to discover while researching a LessWrong post reply. This is probably worth a whole post.
EDIT: I found a study that replicated the strong positive pleiotropy effect shown in Genomic Prediction’s index: https://www.researchgate.net/publication/323614487_Improving_genetic_prediction_by_leveraging_genetic_correlations_among_human_diseases_and_traits
This is actually incredible. My interpretation is that there’s not only a general factor g for intelligence across cognitive tasks, but also an h factor for health across multiple diseases.
Anyways, the implication for your question here is that current DISEASE predictors are already very, very strong. Explaining 40-50% of variance from a predictor is incredible. That’s probably getting close to the limit of heritability for some of these. So for heart disease, diabetes, and some types of cancer, we’re probably nearing the limit of what polygenic predictors can explain and there is not much point waiting for them to get better. Right now you could probably simultaneously decrease the risk of many of these diseases by 70-80% by selecting among 10 embryos.
WHAT ARE THE BENEFITS OF WAITING?
Disease predictors are not nearly as good for non-European populations. I believe the second-best predictors are for South Asian populations, followed by East Asian and then African. If you or your spouse trace your primary ancestry to one or more of those groups, it makes more sense to wait. Predictors for those of African ancestry in particular have substantial room for improvement.
The second caveat is about selecting for non-disease traits. This community has expressed particular interest in selecting for intelligence, though there are obviously other non-disease traits such as conscientiousness or mental energy that are also important.
There is substantial room for improvement in our intelligence predictors. Right now you could likely pay a PhD student <$10,000 to construct an intelligence predictor for you based on the third Educational Attainment study (EA3) that would probably explain about 20% of variance in intelligence. If you had 14 euploid embryos to choose from and 70% of those implant, you would expect your first child to have an IQ about 4-5 points higher than the average of you and your spouse/partner.
Steve Hsu, one of the leading researchers in this field, has estimated that we would be able to explain 30-40% of the variance in cognitive ability if the UK Biobank simply offered their existing intelligence test to the 90% of Biobank participants who haven’t taken it. That would raise the expected IQ gain from selection among 14 embryos to ~9.5 points, which would perhaps be worth waiting for, though it’s not clear when or even if UK Biobank will do that. And since most of the Biobank participants are European, the benefit might be somewhat smaller for other ethnicities. So if you and your spouse are both European and you used normal IVF with multiple rounds of egg extraction, improved predictors would give a gain of about 13 IQ points. And since you probably wouldn’t select exclusively for IQ (diseases are important too), I’d guess a more realistic gain would be about 10 points.
Also, paying that PhD student to make the intelligence predictor might get all research into the genetic roots of intelligence banned, so consider that a major possible downside. Though if it weren’t banned you could distribute it to anyone who wanted it and everyone doing IVF could have children 3-10 IQ points above their parents.
Then there’s the question of all these other important traits that we don’t even have predictors for, like conscientiousness, mental energy, performance while sleep-deprived and whatever else you value. I haven’t researched these other traits in much depth, but it seems like there’s a lot of other important stuff that falls into this bucket.
Here’s a GWAS looking at neuroticism that found 190 genes associated with the trait at 2.5*10^-5. https://www.nature.com/articles/s41598-021-82123-5#Sec2
Funny anecdote from the study: the associated genes were found to modulate behavioral response to cocaine. The authors don’t say what percentage of variance is explained by those 190 genes, but my guess is it’s in the neighborhood of 5%. So if you waited 5 years to have kids, these predictors of personality traits would almost certainly improve, probably to somewhere between 15% and 40%.
I can’t find a single GWAS on mental energy. Why has no one looked into that yet?
A similar improvement is likely to happen for many of the other predictors, particularly those for which people have already done GWAS.
Of course there’s one more question you’d have to answer even if you did have great predictors: which of these personality traits should be selected for and how strongly? All else held equal, more intelligence seems to pretty much always be better, and high disease risk seems to pretty much always be worse. Of course you can’t necessarily hold all else equal when selecting for a finite set of traits, but most of the literature I’ve read about pleiotropy suggests that unless you have extremely powerful selection techniques (i.e. iterated embryo selection, gene editing or whole genome synthesis), these trade-offs are unlikely to be a concern.
But with personality traits I don’t yet have a clear mental model of which traits should be selected for and how strongly. I think most parents mostly want to give their child a happy productive life more than anything else, and besides the no-brainers like reducing predisposition to depression and anxiety, it’s not entirely clear how to do that.
WHAT SHOULD I DO?
If you would be willing to pay ~$40k to substantially decrease your child’s risk of common diseases and increase their lifespan by ~1 year, you should consider freezing eggs and doing IVF. And if you’re not ready to have kids yet or you want to wait for polygenic predictors to improve, you should freeze your eggs (or talk with your partner about freezing their eggs).
Why freeze eggs? Well, unfortunately a woman’s production of chromosomally normal eggs gets substantially lower with age. The percentage of eggs that will be “euploid” (chromosomally normal) first increases in the late teens and early 20s before reaching its max around 25. It then slowly declines starting around 30, with the decline really accelerating after age 35. By the early 40s, 80%+ of eggs produced will be aneuploid. The more euploid eggs available for freezing, the bigger a gain you’ll get from polygenic screening.
A woman’s capacity to actually carry a pregnancy to term, on the other hand, lasts well into the post-menopausal period. The oldest mother to give birth via donor eggs was 74! So by freezing eggs, you can preserve fertility for as much as 40 years.
If you’re single and a guy, then there are not really many action items for you. Sperm quality doesn’t really seem to decline until about 40, at which point it drops off slowly. The only direct option here would be to get eggs from a donor bank, but if you do that you’d likely have to face the challenges of single parenting. Plus donor eggs cost a few tens of thousands, so it would be quite a bit more expensive.
HOW DO I ACTUALLY DO THIS?
If you’re seriously considering doing IVF for polygenic screening, the first step is comparing IVF clinics. Some IVF clinics are 3x the cost of others for essentially the same service. Some IVF clinics have poor cryopreservation practices and low implantation success rates. So choosing the right clinic will have a big effect on your cost/benefit analysis. Egg retrieval usually takes 3-6 visits from what I’ve heard, so it may actually be worth flying to another state (or perhaps even another country) to lower the price.
You then have to consider the IVF funnel to figure out how much it’s going to cost to achieve a certain reduction in disease risk/increase in healthspan. I really wish there was a tool for this because a lot of factors can substantially affect loss rates. But the basic gist is this: at each step in the IVF process, fewer eggs/embryos come out than go in. The three most important factors affecting the number of embryos you have to choose from are the IVF clinic, the genetic testing company, and the age of the mother.
Here are all the steps that have to be done.
Medication is taken to stimulate egg production
Eggs are extracted
Eggs are frozen and unfrozen at a later date (optional but necessary for polygenic screening)
Eggs are fertilized, turning them into embryos
Embryos grow to day 5 blastocysts
Day 5 blastocysts are biopsied for polygenic screening (and to see if they’re chromosomally normal)
The euploid embryo with the highest polygenic score is implanted
A baby is born
At every single one of these steps, fewer eggs/embryos come out than went in.
If you’re 23-28 you’ll probably get around 15 eggs per cycle of IVF. According to some random news articles I looked up, 40%-50% of those will grow to day 5 blastocysts (this might be higher if you go to a good clinic and/or don’t have fertility issues).
If you’re 23-28, about 80% of the embryos that reach this stage will be euploid, meaning they have the potential to implant and turn into a healthy child. The others will either result in miscarriage or have a condition like Down Syndrome if implanted.
When you choose an embryo to implant, there’s a roughly 70% chance it will lead to a live birth (lower if you have fertility issues).
So roughly 30% of eggs extracted will lead to a live birth (though it should be noted that the above numbers may not be accurate: my numbers might be wrong, and a bunch of factors influence the percentage).
That means you need 3-4x as many eggs extracted as you want to select from. At 15 eggs per IVF cycle in good conditions, that’s 2-3 rounds of egg extraction if you want 10 embryos to choose from (taking implantation rates into account).
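To make the funnel arithmetic concrete, here is a minimal sketch that just multiplies through the rough rates quoted above. The function names and the 45% blastocyst midpoint are my own choices for illustration, and the sketch leaves out losses the post gives no numbers for (freeze/thaw survival, fertilization), so treat it as optimistic.

```python
import math

def ivf_funnel(eggs_per_cycle=15, blastocyst_rate=0.45,
               euploid_rate=0.80, live_birth_rate=0.70):
    """Expected yield of one egg retrieval cycle, using the post's rough rates.

    All defaults are the approximate figures quoted in the text for a
    23-28 year old; 0.45 is the midpoint of the quoted 40%-50% range.
    """
    blastocysts = eggs_per_cycle * blastocyst_rate
    euploid = blastocysts * euploid_rate
    births = euploid * live_birth_rate
    return {
        "blastocysts": blastocysts,
        "euploid_embryos": euploid,
        "expected_live_births": births,
        "egg_to_birth_rate": births / eggs_per_cycle,
    }

def cycles_needed(target, yield_per_cycle):
    """Whole retrieval cycles needed to reach `target` at a given per-cycle yield."""
    return math.ceil(target / yield_per_cycle)

funnel = ivf_funnel()
for name, value in funnel.items():
    print(f"{name}: {value:.2f}")

# Cycles for 10 euploid embryos, and for 10 "implantation-adjusted" embryos
print(cycles_needed(10, funnel["euploid_embryos"]))
print(cycles_needed(10, funnel["expected_live_births"]))
```

With these inputs the egg-to-birth rate comes out around 25%, in the same ballpark as the rough figure in the text, and the two cycle counts bracket the 2-3 rounds mentioned above.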
I think egg freezing is about $6k/cycle with genetic testing included. So for 3 cycles, that’s about $20k. Then IVF itself is I think like $15k. So maybe $35k-45k all-in cost not including the cost of childbirth, which is stupidly expensive but usually covered by insurance.
It should be noted that there’s actually a pretty big gain from selecting from just 2 embryos. Going up to 10 increases the benefit by about 80%, but the gains are still pretty noticeable from any selection at all.
Anyways, I hope this was helpful. Let me know if you want me to write a more in-depth post about how to do IVF for polygenic selection.
There’s also Orchid (https://www.orchidhealth.com/). (And Genomic Prediction is now LifeView, https://www.lifeview.com/.)
So far as I know Orchid hasn’t actually brought a product to market yet, though they’re working on one.
I wonder if (or how likely it is that) you can sell the other embryos, and for how much.
I wonder, if you want, say, 4 children, whether there are economies of scale (say: will the top 4 embryos out of 30 embryos be better than the top 2 embryos out of 15?).
I’m very interested in this space. You seem to have identified an important need: reviewing IVF clinics. I haven’t researched whether that need is already fulfilled or not, but if not, I’d be interested in thinking of a potential business model to respond to this need (and if there isn’t one, doing it not-for-profit).
There are also related questions I’d like to research, such as tips on how to choose an independent sperm donor; how much regression to the mean there is (i.e. to know when marginal returns become too low to be worth it); a survey on the fraction of people that would donate gametes if asked to by a friend; whether, polygenic screening aside, there is still an economic case for freezing eggs early; etc. All this could be documented on a specific website.
Anyway, I don’t want to scope creep this project. I’d love it if you wrote a more in-depth post about how to do IVF for polygenic selection for LessWrong. Ex.: I’d like to know how egg quality decreases with age to know when it becomes more important, and what the health risks are.
But if anyone is interested in either a) contributing to those projects, or b) consuming the value of those projects, then I invite you to reach out to me at mathieu.roy.37@gmail.com.
This is not just an intellectual curiosity.
I assume you are imagining a comparison like a one-step procedure where you implant 4 out of a batch of 30, vs a two-step procedure where you must implant 2 out of a batch of 15 each step while discarding the previous batch. The answer turns out to be yes, in this case, but not because of economies of scale (I haven’t seen anyone actually quote discounts for biopsying & genotyping large batches), but option value (and mumble mumble Jensen’s inequality mumble convexity).
The expected ranks are the same because they are i.i.d., so it doesn’t matter on average if you pick 2-of-15 twice or 4-of-30 once. But that’s just the starting point: you have a lossy pipeline to feed them into before you get 4 children out the other end, and a lot of embryos will never implant or yield a live birth. So, in the two-step scenario, when you lose your top candidate and have to pick from the small within-step pool, you get forced back to the mean more than if you had pooled them all in a single step and could retry with the next-best globally. This makes it worse. (Imagine a more extreme scenario where you are picking 1-of-2 100 times vs picking 100-of-200 once. When you lose an embryo in a 1-of-2 step, your backup embryo is at −0.56SD, because just as the expected gain from max(2) is 0.56, the expected gain from the symmetrical min(2) is −0.56SD, and with only 2 candidates, if you lose the max, all you have left is the min. Whereas if you had pooled them, then when you lose your first embryo in 100-of-200, the expected rank of fallback candidate #101 is ~0SD, just like #100 was ~0SD.) The more finegrained irreversible decisions must be, the worse the ‘ratchet’ of losses is. (This is because there are only bad surprises; if there were good surprises, the logic works the other direction, which is why the value of a bunch of separate options > the value of the mean of those options.)
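A toy Monte Carlo sketch of the pooled-versus-staged comparison, for anyone who wants to play with the numbers. The assumptions are mine rather than anything stated in the comment: scores are i.i.d. standard normal, every transfer independently leads to a live birth with probability 0.7 (borrowed from the post's rough figure), and embryos are transferred in descending score order until the target number of births is reached. The function names are made up for this example.

```python
import random

def top_viable(batch, k):
    """Scores of the k best embryos in `batch` whose transfer succeeds."""
    return sorted((score for score, ok in batch if ok), reverse=True)[:k]

def pooled_vs_staged(n_total=30, n_batches=2, n_children=4,
                     p_success=0.7, trials=20000, seed=0):
    """Mean score (in SD) of children under pooled vs staged selection.

    Both strategies are run on the same simulated embryos in each trial,
    so the comparison is paired.
    """
    rng = random.Random(seed)
    per_batch = n_total // n_batches
    per_batch_children = n_children // n_batches
    pooled, staged = [], []
    for _ in range(trials):
        # Each embryo is (score, does a transfer of it lead to a birth?)
        embryos = [(rng.gauss(0.0, 1.0), rng.random() < p_success)
                   for _ in range(n_total)]
        # Pooled: rank all embryos together and take the best viable ones
        pooled += top_viable(embryos, n_children)
        # Staged: each batch is ranked on its own and then discarded
        for b in range(n_batches):
            batch = embryos[b * per_batch:(b + 1) * per_batch]
            staged += top_viable(batch, per_batch_children)
    return sum(pooled) / len(pooled), sum(staged) / len(staged)

pooled_mean, staged_mean = pooled_vs_staged()
print(f"pooled 4-of-30: {pooled_mean:.3f} SD")
print(f"staged 2-of-15 twice: {staged_mean:.3f} SD")
```

In this toy model pooling cannot do worse on a given set of embryos, since it keeps the best viable embryos overall; the simulation only gives a feel for the size of the gap. Calling `pooled_vs_staged(n_total=200, n_batches=100, n_children=100)` runs the more extreme 1-of-2 scenario from the comment.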
Considering the unpleasantness and difficulty of egg harvesting, and the ease of storage, it doesn’t make much sense to try to do things in many stages.
Regression to a (population) mean is not relevant because you are comparing sibling embryos which are distributed around the parental mean by construction.
For many traits, returns are pretty linear. Like IQ points, it doesn’t much matter where you are, the value of another point is pretty constant. So you can just ignore that entirely and operate on marginal gains. (eg if the parents are both 130 IQ and you calculate a marginal gain of +1 IQ point from selection & that makes it profitable, maybe their genotypic mean is actually 115 IQ, but it doesn’t matter, because the marginal gain will still be +1 IQ point over the batch mean and will still be profitable).
For binary traits, you can have diminishing returns based on where the trait value is. As I note in my other comment, if a family is at very high risk for a trait like schizophrenia, then selection can be extremely valuable beyond what the average PGS % would imply; the flip side of this is that then there must be families who have below-average risk and benefit a below-average amount. However, these diminishing returns are automatically incorporated in any index score (which is how one should be selecting). If the polygenic score for SCZ is already very low, then the index component for embryos which move the SCZ score even lower will be very small, because it reduces a tiny absolute risk by a tiny absolute amount, which has a tiny expected value, and the index will prefer embryos which move more important traits.
If you want to model such scenarios to imagine conditioning on a specific example/family history, it’s easy to just switch out the population fraction for the implied fraction of the ‘parental population’ in your liability-threshold code, and everything works as before. But I generally use the average because the average is the average.
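For readers who have not seen a liability-threshold calculation, here is a minimal sketch of one. It is not Gwern's or Carmi et al.'s actual code. It assumes liability is standard normal, the polygenic score explains a fraction `r2` of liability variance, half of that score variance is shared within a family and half varies between sibling embryos, and the embryo with the lowest score is implanted. All names and the example parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def risk_reduction(prevalence, r2, n_embryos, families=20000, seed=0):
    """Baseline risk, selected-embryo risk, and relative risk reduction.

    `prevalence` is the disease frequency in the reference population;
    swapping in a different value is how one would model a higher- or
    lower-risk group in this sketch.
    """
    nd = NormalDist()
    rng = random.Random(seed)
    threshold = nd.inv_cdf(1.0 - prevalence)  # liability cutoff for disease
    half_sd = math.sqrt(r2 / 2.0)             # SD of each half of the score
    resid_sd = math.sqrt(1.0 - r2)            # liability the score misses

    def risk(score):
        # P(score + residual liability > threshold)
        return 1.0 - nd.cdf((threshold - score) / resid_sd)

    baseline = selected = 0.0
    for _ in range(families):
        family_mean = rng.gauss(0.0, half_sd)  # shared by all sibling embryos
        scores = [family_mean + rng.gauss(0.0, half_sd)
                  for _ in range(n_embryos)]
        baseline += risk(scores[0])    # an arbitrary, unselected embryo
        selected += risk(min(scores))  # the lowest-score embryo
    baseline /= families
    selected /= families
    return baseline, selected, 1.0 - selected / baseline

base, sel, rrr = risk_reduction(prevalence=0.05, r2=0.10, n_embryos=5)
print(f"baseline {base:.3f}, selected {sel:.3f}, relative reduction {rrr:.0%}")
```

The risk for each embryo is computed analytically from its score, so the only simulation noise comes from sampling families.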
I guess I was using “economies of scale” very loosely; that’s kind of what I had in mind, but thank you for the details and explanations!
I find the mostly linear relationship between IQ and income to be surprising. If we use the odds of winning a Nobel as a proxy for “ability to make an important scientific discovery”, doesn’t the lack of average-IQ winners imply some kind of exponential relationship between IQ and scientific productivity?
If both the linear income relationship and the one described above are true, it implies an exponentially decreasing ability of high IQ people to capture the value they create.
Something like that. A straight line on log odds charts in SMPY, IIRC.
Oh definitely. This is a point Gensowski and others make: part of the reason that the income relationship does bend is that it’s hard to capture all your positive externalities even though the patenting rate etc increases. You can invent the transistor or antibiotics, but you won’t capture anywhere remotely close to 1% of the total surplus. Also part of the country-level story for why IQ looks so powerful: individuals don’t capture anything remotely approaching their full contributions (in the same way that people on the other end of the spectrum do not internalize anything remotely like the harm they do to everyone else), so the country-level correlations can be much stronger than individual-level ones.
Thanks for writing this up. I’d be interested in the more in-depth post.
There’s polygenic screening now. It doesn’t include eg IQ, but polygenic screening for IQ is unlikely to be very good any time in the near future. Probably polygenic screening for other things will improve at some rate, but regardless of how long you wait, it could always improve more if you wait longer, so there will never be a “right time”.
Even in the very unlikely scenario where your decision about child-rearing should depend on something about polygenic screening, I say do it now.
Polygenic predictors have improved since Gwern’s 2016 post on embryo selection. Using his R code for estimating gain given variance and standard deviation and taking the variance explained from the Educational Attainment 3 study, I find that selecting from 10 embryos would produce a gain of between 4 and 5 points for the top-scoring embryo (assuming no implantation loss). Accounting for implantation loss it would probably take 14 embryos or so to get the same benefit.
Gwern’s code: https://www.gwern.net/Embryo-selection#benefit
EA3 study: https://sci-hubtw.hkvisa.net/10.1038/s41588-018-0147-3
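For those who don't read R, here is a rough Python sketch of the same kind of calculation. It is not Gwern's code. It assumes the standard order-statistics model: the gain is the expected maximum of n standard normal draws, scaled by the within-family standard deviation of the polygenic score, with sibling embryos carrying half of the score's population variance. The function names and the `within_family_share` parameter are my own, and implantation loss is ignored.

```python
import math
from statistics import NormalDist

def expected_max_std_normal(n, grid=20001, lo=-8.0, hi=8.0):
    """E[max of n i.i.d. N(0,1) draws].

    Integrates x * n * pdf(x) * cdf(x)**(n - 1) with the trapezoid rule.
    """
    nd = NormalDist()
    step = (hi - lo) / (grid - 1)
    total = 0.0
    for i in range(grid):
        x = lo + i * step
        weight = 0.5 if i in (0, grid - 1) else 1.0
        total += weight * x * n * nd.pdf(x) * nd.cdf(x) ** (n - 1)
    return total * step

def expected_gain(n_embryos, variance_explained,
                  trait_sd=15.0, within_family_share=0.5):
    """Expected gain (in trait units) from implanting the top-scoring embryo."""
    score_sd = math.sqrt(variance_explained * within_family_share) * trait_sd
    return expected_max_std_normal(n_embryos) * score_sd

for r2 in (0.10, 0.35):
    for n in (2, 10, 30):
        print(f"r2={r2:.2f}, n={n}: +{expected_gain(n, r2):.1f} IQ points")
```

Under this model, a predictor explaining about 35% of variance gives a bit under 10 points from 10 embryos, which is consistent with the 9-10 point figure quoted for improved predictors.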
Steve Hsu thinks that if we were to offer UK Biobank’s IQ test to a million participants, we could get IQ predictors that would explain 30-40% of variance. That would work out to a gain of 9-10 IQ points from selecting among 10 embryos, and up to 14 points if you had about 30 to choose from. See “technical note” in this post: https://infoproc.blogspot.com/2021/09/kathryn-paige-harden-profile-in-new.html
Why not?
Embryos produced by the same couple won’t vary in IQ too much, and we only understand some of the variation in IQ, so we’re trying to predict small differences without being able to see what’s going on too clearly. Gwern predicts that if you had ten embryos to choose from, understood the SNP portion of IQ genetics perfectly, and picked the highest-IQ without selecting on any other factor, you could gain ~9 IQ points over natural conception.
Given our current understanding of IQ genetics, keeping the other two factors the same, you can gain ~3 points. But the vast majority of couples won’t get 10 embryos, and you may want to select for things other than IQ (eg not having deadly diseases). So in reality it’ll be less than that.
The only thing here that will get better in the future is our understanding of IQ genetics, but it doesn’t seem to be moving forward especially quickly, at some point we’ll exhaust the low- and medium- hanging fruits, and even if we do a great job there the gains will max out at somewhere less than 9 points.
Also, this is assuming someone decides to make polygenic screening for IQ available at some point, or someone puts in the work to make it easy for the average person to do despite being not officially available.
I am not an expert in this and would defer to Gwern or anyone who knows more.
Thanks!
And, hypothetically, generating lots of embryos to choose from? Or is that not in the cards?
Yes, generating lots of embryos will help, but the marginal returns decline extremely quickly because you’re taking the maximum of a Gaussian distribution.
Gain is proportional to sqrt(ln(number of embryos))
Going from 30 embryos (roughly the upper bound of what you can get with normal IVF) to 1000 will only increase gain 43%.
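A quick check of that figure, taking the sqrt(ln(n)) approximation above at face value (the function name is just for illustration):

```python
import math

def relative_gain(n_from, n_to):
    """Ratio of expected gains under the approximation gain ∝ sqrt(ln(n))."""
    return math.sqrt(math.log(n_to) / math.log(n_from))

increase = relative_gain(30, 1000) - 1
print(f"{increase:.0%}")  # prints 43%
```

Note that sqrt(ln(n)) is only an approximation for how the expected maximum of n Gaussian draws grows, so the percentage is itself approximate.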
43% is still pretty big, but it’s not going to radically change the world. For that you’re going to need massive parallel gene editing, iterated embryo selection or whole genome synthesis.
I did a calculation a while back and estimated it would cost about $200 million to synthesize an entire human genome at today’s prices. But there’s a bunch of other technical challenges you’d have to overcome.
Gene editing seems to be improving but there are still some weird issues with CRISPR where it seems to randomly chop off chromosomes sometimes.
There are a couple groups working towards in-vitro oogenesis, which is probably the most important step to making iterated embryo selection work. I think the main blocking step there is making the environment in which the eggs mature. You need a bunch of follicle cells and the only source for those currently is abortion tissue. There’s a couple groups trying to make them from stem cells instead: https://www.technologyreview.com/2021/10/28/1038172/conception-eggs-reproduction-vitro-gametogenesis/
So lots of stuff in the pipeline, but nothing seems to be imminent.
Thanks, this is an enlightening summary!
(In particular, trying to understand what you’re saying made me understand something basic about IES vs just embryo selection: with IES you’re actually doing many iterations of sexual reproduction with a population, so you can get a genotype composed of any alleles that are present in any of the starting embryos.)
Hmm, that seems like it shouldn’t be that hard of a problem to solve, but idk. I hope someone takes this on if that’s really a bottleneck.
If you read the Technology Review article I posted, there is a Japanese team that managed to do it in mice, but it took them 4 years. I don’t really know that much about the technical details, but just judging from the empirical results it does not appear to be an easy problem.
Oh, I assumed the problem was a social / logistic one, but now I’m assuming there’s also a scientific / technological one
I guess the social/logistic one is not wanting to create a solution that relies on a supply of abortion tissue, which leads to the scientific/technological one of how to create the follicle cells.
For a normal trait, the variance of the children of a fixed couple is approximately the population variance. I think that’s a lot.
How old are you, and how long do you propose delaying? There may be some good reasons to delay having children until late 20s or maybe even early 30s (especially if your economic or social-support structures are likely to improve). But also pretty strong evidence that earlier is better (especially the mother’s age matters statistically for success of pregnancy and health of the child).
Generally, if you’re in a position to consider “definitely want kids, unsure whether now or later”, the answer is “now”.
Children now aren’t necessarily mutually exclusive with children in the future. You’re not creating disutility by starting now and then “upping your game” when technology is more accessible!
Right, good point, not necessarily, but also we’re working with finite resources.
I am not sure that “earlier is better”. It’s true that the biology favours early parenthood. But the sociology goes the other way: it’s better to have children when you’re high-income and worldly-wise. So there might be a trade-off between e.g. health and wealth.
There’s a big literature on this, you could start with e.g. Powell, B., Steelman, L.C. and Carini, R.M., 2006. Advancing age, advantaged youth: Parental age and the transmission of resources to children. Social Forces, 84(3), pp.1359-1390; or for health, Myrskylä, M. and Fenelon, A., 2012. Maternal age and offspring adult health: evidence from the health and retirement study. Demography, 49(4), pp.1231-1257, which finds a U-shaped relationship. Be aware of possible confounds (e.g. educated people have kids later, but you deciding to have kids later won’t make you more educated per se).
Update: at a very quick glance the following looks useful: Mikko Myrskylä, Karri Silventoinen, Per Tynelius, Finn Rasmussen, Is Later Better or Worse? Association of Advanced Parental Age With Offspring Cognitive Ability Among Half a Million Young Swedish Men, American Journal of Epidemiology, Volume 177, Issue 7, 1 April 2013, Pages 649–655, https://doi.org/10.1093/aje/kws237
I suspect “it depends” is going to dominate here. I’d argue that for many people, parental health (including sleep resiliency) is most important for young kids, and income/wisdom is more important for tween/teen kids.
I’m also very unwilling to have opinions generally about “earlier” or “later” without reference points; specifics matter. For people in the Western Intellectual class (for whom a university degree and a white-collar job is the default), I’d recommend 23-27 as a good target, with delays of up to 5 years being quite reasonable, but not first-best for most.
Please do discount my opinion—my wife and I chose not to have kids. This is based on observations and discussions with quite a few people in my extended friends circle, some who had kids “early” (only one at 19, but lots in early 20s), and a lot who had kids “late” (29-35, and one at 43!). Each has specific joys and frustrations and on balance I hear minimal “I wish I’d...” from the earlier crowd.
Also, and importantly, there are almost always overriding factors that make optimizing the parents’ age at birth at best a secondary concern. Finding and agreeing with your partner, and having a stable/trusted situation such that you feel ABLE to commit to having kids, is far more important than the statistical benefits of any age band. Thus, my advice: “when you’ve decided to have kids with this person, my recommendation is not to delay much”.
I didn’t put my age because I was asking the question from everyone’s point of view, but I am 31 years old; thanks for your input!
Good point! (Although the parents don’t have to use their own gametes or wombs)
One issue nobody has raised yet is the effects of structural racism.
The GWAS studies used to create the polygenic risk scores generally have a very pronounced sampling bias towards people of European ancestry. See for example the GWAS Diversity Monitor, which is a dashboard meant to monitor the sampling practices used by GWAS studies. In addition to selecting people to sample by ethnicity, an accepted practice is to look at the genomes after sampling and try to identify and exclude “ethnic outliers”.
If you or your partner don’t have ethnicities that would make your genomes look typical among the samples used to train the scoring algorithm, it’s an open question whether any particular score instrument is going to be usefully predictive for you or your potential child. See for example Generalization and dilution of association results from European GWAS in populations of non-European ancestry: the PAGE study, which found that, while many GWAS hits generalize from a very restricted sample, a substantial fraction don’t. See also Current clinical use of polygenic scores will risk exacerbating health disparities, which discusses polygenic risk scores in particular, and their accuracy falloff when used on people who the score developers would have excluded from their training set.
Note also that even the papers complaining about this problem are still breaking down their results by very abstract discrete dimensions like “5 continental populations”, which sweep a lot of people under a very large rug. If you and your partner have different ethnicities, you get to be on the wrong end of fun lines like this one, from that last paper:
How long do you think it’ll take for that to be fixed?
It could already be fixed under different regulatory and scientific regimes, enough data exists in general. The statistics isn’t the hard part. (This is much of why UKBB is so amazing.) The barrier is datasharing for TWASes. Hence, much like the question of ‘how well do covid vaccines protect healthy people against infection’ or ‘does this anti-covid drug actually work’ or ‘can we develop a useful covid rapid at-home test’, it’ll take exactly as long to fix as everyone wants it to take, which can be arbitrarily long.
Relatively small amounts (ie roughly already existing or easily obtained) of data are necessary. From a statistical POV, a rising tide lifts all boats—the large GWASes have already done most of the work in prioritizing SNPs containing causal variants and providing highly informative priors on both the distribution of effects & specifically where to look. (Power-wise, it looks something like: if you have a GWAS n=200k in Europeans, you don’t need n=200k in East Asians to get an equivalent East Asian PGS, you only need like n<20k. Think of it as like layers of Swiss cheese: the European GWAS hits will be ambiguous within each block as to which SNP inside it is causal, but then the East Asian blocks slice it up differently and those 3-4 candidates will be split up across different blocks, and you only need to decide between a few candidates, as opposed to the original prior-less situation where you start with millions of candidates. And there are, across the world, much more than 20k genotyped East Asians etc.)
Is it not already sort of fixed?
We know how well PRS perform in other ancestries, right? It just means that PRS are a little bit less good, not that it doesn’t work today.
If you are rich enough to afford a full-time nanny and pay for surrogacy, then sure, freeze some gametes and have kids when you can get the genetic testing and have genetically superior offspring.
Raising kids, especially toddlers, is exhausting work best done by the young but mature. So, in the 25-35 age range.
Are you already committed to a specific person to have children with?
The reason I ask is that who you have children with will have a drastically larger impact on the quality of children you get vs even 100% accurate polygenic screening. If waiting 10 years gets you better polygenic screening but makes finding a good partner (genetics + character/culture etc...) somewhat less likely, then the tradeoff may not be worth it.
(It’s still smart to freeze eggs/sperm anyway)
Good point. (No I’m not. I’m also considering using gamete donors anyway.)
In my opinion, as of 2021, no.
Our polygenic scores are predictive, not causal. That is, we estimate them by maximizing predictive power in a non-causal setting. They do have some causal effects. For example, about half of the correlation between the educational attainment (EA) score and EA itself is causal. The other half is correlated environments, genetic nurture etc.
In future, sibling-based designs may lead to truly causal scores. I don’t know how long that will take.
Scores typically have low predictive power. The R² of scores for EA is IIRC about 10%. By definition, they will never go above the heritability of EA which is only about 40%. Unless you’re prepared to pump out tons of eggs, test them all, and pick the best, you are probably not going to do much to change the phenotype. Again, with larger sample sizes this will eventually change.
Polygenic scores are, well, polygenic. They sum up many different genes. We don’t know what else they correlate with. You are in effect giving your child a pill with unknown side effects.
Maybe in future that won’t be true. I don’t know how long that will take, but I suspect it will be long because there is a lot we don’t know.
Having children involves risks. Your child might end up with a terrible disease. Even with all the genetic testing in the world, your child might be hit by a bus and end up quadriplegic. Or they may mature into smart, healthy adults… and embrace values that you find deeply wrong.
That’s not to say you should be casual about those risks. It is to say that you have to accept some risk. Delaying childrearing because in future you might be able to avoid some of this risk suggests that you may not have internalized that. (I only say “suggests”. I don’t know you!)
If you do want to delay childrearing so you can screen using reliably causal polygenic scores with high predictive power and no risk of side effects, I think you should be prepared to wait 10 years (and maybe more). If so, consider freezing your/your partner’s eggs.
There are some good general reasons to delay childrearing:
Older parents tend to be richer
Older parents might be wiser
Both those things benefit children (pace Robert Plomin!) The benefits of older parents are visible in the data (within families, so controlling for obvious differences between early- and late-parenting families). But of course that depends how old, wise and rich you and your partner are now.
The power of a PGS is more strongly related to the R than to the R². So a PGS with an R² of 10% corresponds to R = sqrt(10%) = 0.32, which is sqrt(10%)/sqrt(40%) = half of the total selective power you could get from a PGS. (Well, except for the aspect where the EA PGS is not super causal. Though that is relatively limited to EA and doesn’t hold for e.g. IQ PGS.)
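The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `selective_power_fraction` is made up for illustration, and the 10% and 40% inputs are the figures quoted in this thread, not values taken from any particular study.

```python
import math


def selective_power_fraction(r2_pgs: float, heritability: float) -> float:
    """Ratio of the PGS's correlation (R) to the ceiling implied by heritability.

    Hypothetical helper: it treats sqrt(heritability) as the best correlation
    any PGS could reach, as the comment above does.
    """
    return math.sqrt(r2_pgs) / math.sqrt(heritability)


r = math.sqrt(0.10)                                # R for a PGS with R^2 = 10%
fraction = selective_power_fraction(0.10, 0.40)    # share of the ceiling reached

print(f"R = {r:.2f}")
print(f"fraction of maximum selective power = {fraction:.2f}")
```

On the R² scale the same score looks like a quarter of the ceiling (10% out of 40%), which is the gap between the two ways of reading it.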
This can be solved by looking at genetic correlations, which are frequently computed for polygenic scores.
I’m not sure what you mean by selective power. I suppose the natural question is “how many extra (e.g.) IQ points do I get for an extra standard deviation of a PGS?” In other words, you want the regression coefficient, where the dependent variable is on some meaningful scale. I stand by my comment, unless you can show a PGS where a 1 s.d. change currently does something big.
Genetic correlations: maybe, but we haven’t looked at genetic correlations for many things, and indeed we don’t have other polygenic scores to correlate them with for many things, and indeed we haven’t collected questions on big enough samples to create those polygenic scores for many things, so again, we aren’t there yet.
A 1SD change on a latent variable can have a big absolute risk effect for liability-threshold traits like schizophrenia depending on the pre-existing absolute risk / where one is on the latent spectrum.
(This is the nonlinearity of normal distributions and thin tails again: if the risk is ~0 SD, perhaps because there are 2 schizophrenic parents carrying a very high risk burden, then shifting a fraction of a SD drops the absolute risk down from 40% to 11% (pnorm(qnorm(0.4) - 1))*, because the density drops fast in the middle of the bell curve; if the risk is <-2SD because there is no family history of schizophrenia, then the exact same latent shift in SDs will drop the absolute risk from something like 0.8% to 0.5%, because you are already out in the thin tails where you can only go from ‘rare’ to ‘rare’. Same PGS+embryo-count, same latent shift, very different practical implications. This is also true of common dichotomous traits where everyone is at high risk, not merely specific families: for example, diabetes or heart disease. Since a quarter to a half of the population will get these, the latent risk is very high, and 1SD shift on it will have large practical consequences. Going from 30% risk of diabetes to 6% risk would make a big difference healthwise.)
* For selecting using solely the current 2020 SCZ PGS of 7%, out of 5 embryos you’d get ‘only’ −0.21 SDs at most, so for the double-SCZ-parent case, that’d drop from 40% to 32%. Nevertheless, considering how devastating schizophrenia can be, to patients and everyone around them, I’d say that an absolute risk change of 8% is extremely valuable and ‘does something big’.
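For readers who do not use R, the pnorm(qnorm(...)) calculation above can be reproduced with Python's standard library. This is a sketch under the liability-threshold assumption described in the comment; the function name `risk_after_shift` is hypothetical, and the inputs are the baseline risks and shifts quoted above.

```python
from statistics import NormalDist

# Standard normal distribution used for the latent liability scale.
_STD_NORMAL = NormalDist()


def risk_after_shift(baseline_risk: float, shift_sd: float) -> float:
    """Absolute risk after lowering latent liability by `shift_sd` SDs.

    Equivalent to R's pnorm(qnorm(baseline_risk) - shift_sd).
    """
    liability = _STD_NORMAL.inv_cdf(baseline_risk)   # qnorm
    return _STD_NORMAL.cdf(liability - shift_sd)     # pnorm


# Cases quoted in the comment: (baseline risk, latent shift in SDs).
cases = [
    (0.40, 1.0),    # two affected parents, full 1 SD shift
    (0.40, 0.21),   # same family, shift from picking among 5 embryos
    (0.30, 1.0),    # common disease such as diabetes, 1 SD shift
]
for baseline, shift in cases:
    print(f"{baseline:.0%} baseline, -{shift} SD -> {risk_after_shift(baseline, shift):.1%}")
```

The three cases should land close to the 11%, 32% and 6% figures given in the comment, which shows how much the payoff from the same latent shift depends on the starting risk.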
Yes we have. We have genetic correlations for literally thousands of human traits. The UKBB alone lets you compute pairwise correlations over like 4k traits. And this also ignores that we especially have composite/index traits like longevity, SES, mental illness diagnoses, or self-reported health. It requires tremendous gymnastics to claim there is some hidden explosive correlation which somehow doesn’t show up in those global traits; obviously, even if selecting for IQ selected for some deadly disease that has entirely escaped measurement, that must be vastly outweighed by all the other deadly diseases it selects against, otherwise the net positive (genetic & phenotypic) correlation with all-cause mortality/longevity would not exist. It’d be nil, or the other direction.
(The genetic correlation argument is one of the first counterobjections everyone comes up with, but it requires an almost total ignorance of the genetic correlation literature to sustain. Which is why critics like Turley have resorted to either focusing solely on EDU because it has some negative correlates & high IGE they can hammer on while counting on a receptive audience which doesn’t know that EDU is extremely unrepresentative and a bait-and-switch for IQ; moving the goalposts about ‘efficacy’ even further; or just abandoning all the original counterarguments entirely and talking about “but we haven’t clinically validated embryo selection and it would take decades to do so and the PGSes might change”, which is both false (countless sibling comparisons prove they work, PGSes don’t change much over time, not for what people would select on) and a nifty catch-22 - you can’t ‘validate’ them if it’s been banned because they haven’t been validated...)
So what is the “best” way to validate them, in your opinion? Is there anything better than sibling comparisons?
The only thing more valid than sibling comparisons is actually doing it. Actually doing it should add only an iota to your confidence in it being valid, because all it is is what siblings already are.
The R value is equivalent to the standardized version of the regression coefficient (modulo some statistical details that don’t make a difference here). Therefore it will be linearly related to the regression coefficient, in whichever scale you choose. Meanwhile, the R² will be nonlinearly related to the regression coefficient, due to being a nonlinear function of R. See also Marco Del Giudice’s paper on the same topic: “Are we comparing apples or apples squared? The proportion of explained variance exaggerates differences between effects”.
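A small simulation illustrates the point for the single-predictor case. It is a sketch, not anyone's actual analysis: the data are simulated, the 0.3 effect size and the ×15 rescaling are arbitrary assumptions, and all helper names are made up for the example.

```python
import math
import random


def mean(vs):
    return sum(vs) / len(vs)


def sd(vs):
    m = mean(vs)
    return math.sqrt(sum((v - m) ** 2 for v in vs) / len(vs))


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (one predictor)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def standardize(vs):
    m, s = mean(vs), sd(vs)
    return [(v - m) / s for v in vs]


random.seed(0)
# Simulated "PGS" (x) and "trait" (y); the 0.3 effect and the x15 scale are arbitrary.
xs = [random.gauss(0, 1) for _ in range(5000)]
ys = [15 * (0.3 * x + random.gauss(0, 1)) for x in xs]

r = ols_slope(standardize(xs), standardize(ys))   # standardized slope, equals Pearson r
raw_slope = ols_slope(xs, ys)                     # trait units per unit of score

print(f"standardized slope (R) = {r:.3f}")
print(f"R^2                    = {r * r:.3f}")
print(f"raw slope              = {raw_slope:.3f}")
print(f"R * sd(y) / sd(x)      = {r * sd(ys) / sd(xs):.3f}")
```

The raw slope is just R rescaled by sd(y)/sd(x), so doubling R doubles the gain per SD of the score, while R² would quadruple.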
Sure. But the most interesting dependent variable isn’t usually “how many standard deviations of Y will I gain”, it’s e.g. “how many years of education will I gain”. In any case, on either scale, is there a PGS where a 1 s.d. change does something big? You might say the most recent EA is a candidate. In one dataset a 1 s.d. increase causes (i.e. within-siblings) about a 4.5 percentage point increase in the probability of university attendance.
I agree that SD units are strictly speaking meaningless and something like this is relevant. However I’m just saying that R² does not help over R with this, and in fact makes it worse because R² is nonlinearly related to the meaningful quantities while R is linearly related to the meaningful quantities.
I do not know how EA PGS relates to meaningful quantities, and to be honest I would not recommend selecting for EA PGS because (to paraphrase one of gwern’s articles) EA measures an input rather than an output (unlike intelligence PGS), and so it is more likely to contain bad stuff too. (IIRC EA PGS contributes to a bunch of mental illnesses, whereas intelligence PGS only contributes to autism and anorexia. And realistically GD too but I haven’t seen explicit data on it yet.)
Selection of embryos based on polygenic scoring is a cute idea but is decades away at best.
Your ability to pick an appropriate partner to conceive a child with has been honed over hundreds of thousands of years (actually, more like ‘since sexual reproduction began’). However, it’s far from perfect and there are some gross genetic failings (existing and de novo) that genetic screening can recognise that aren’t obvious at the scale of day-to-day living.
How can it be decades away if a couple of random “transhumanist” couples are already doing it? Mass adoption might be decades away, but lesswrongers are weird people who are often interested in early-adopting new technologies (like cryptocurrency, cryonics, etc). https://www.geneticsandsociety.org/biopolitical-times/first-polygenic-risk-score-baby