I wonder whether, if you want, say, 4 children, there are economies of scale (say: will the top 4 embryos out of 30 be better than the top 2 embryos out of 15?).
I assume you are imagining a comparison between a one-step procedure where you implant 4 out of a batch of 30, and a two-step procedure where you must implant 2 out of a batch of 15 at each step while discarding the previous batch. The answer turns out to be yes in this case, not because of economies of scale (I haven’t seen anyone actually quote discounts for biopsying & genotyping large batches) but because of option value (and mumble mumble Jensen’s inequality mumble convexity).
The expected ranks are roughly the same because the embryos are i.i.d. and you keep the same top fraction either way, so on average it matters little whether you pick 2-of-15 twice or 4-of-30 once. But that’s just the starting point: you have a lossy pipeline to feed them into before you get 4 children out the other end, and a lot of embryos will never implant or yield a live birth. So, in the two-step scenario, when you lose your top candidate and have to pick from the small within-step pool, you get forced back to the mean more than if you had pooled them all in a single step and could retry with the next-best globally. This makes the two-step procedure worse. (Imagine a more extreme scenario where you are picking 1-of-2 100 times vs picking 100-of-200 once. When you lose an embryo in a 1-of-2 step, your backup embryo is at −0.56SD, because just as the expected gain from max(2) is +0.56SD, the expected gain from the symmetrical min(2) is −0.56SD, and with only 2 candidates, if you lose the max, all you have left is the min. Whereas if you had pooled them, then when you lose your first embryo in 100-of-200, the fallback candidate #101 is expected to be at ~0SD, just like #100 was.) The more fine-grained the irreversible decisions must be, the worse the ‘ratchet’ of losses is. (This is because there are only bad surprises; if there were good surprises, the logic would work in the other direction, which is why the value of a bunch of separate options > the value of the mean of those options.)
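The pooled-vs-staged argument above can be checked with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration rather than taken from the comment: scores are standard normal, each transfer yields a live birth independently with a hypothetical probability `P_SUCCESS`, embryos are transferred best-first, and the function name `births_from_batch` is made up. The last lines also check the ±0.56SD figure for max(2) and min(2).

```python
import random
from statistics import fmean

def births_from_batch(batch_size, births_wanted, p_success, rng):
    """Scores of embryos born from one batch, transferring best-first.

    Transfers stop after `births_wanted` live births or when the batch
    runs out; leftover embryos are discarded.
    """
    scores = sorted((rng.gauss(0, 1) for _ in range(batch_size)), reverse=True)
    born = []
    for score in scores:
        if len(born) == births_wanted:
            break
        # Assumption: a transfer succeeds independently of the embryo's score.
        if rng.random() < p_success:
            born.append(score)
    return born

rng = random.Random(0)
P_SUCCESS = 0.3      # hypothetical per-transfer live-birth rate
N_FAMILIES = 10_000  # simulated families per procedure

pooled, staged = [], []
for _family in range(N_FAMILIES):
    pooled += births_from_batch(30, 4, P_SUCCESS, rng)      # 4-of-30 once
    for _step in range(2):                                  # 2-of-15 twice
        staged += births_from_batch(15, 2, P_SUCCESS, rng)

pooled_mean = fmean(pooled)
staged_mean = fmean(staged)
print(f"pooled 4-of-30:     mean score of births = {pooled_mean:.3f} SD")
print(f"staged 2-of-15 x 2: mean score of births = {staged_mean:.3f} SD")

# The 1-of-2 example: expected max and min of two standard normals.
pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100_000)]
max2_mean = fmean(max(pair) for pair in pairs)
min2_mean = fmean(min(pair) for pair in pairs)
print(f"max(2) = {max2_mean:.3f} SD, min(2) = {min2_mean:.3f} SD")
```

Lowering `P_SUCCESS` makes the pipeline lossier, which should widen the gap between the two procedures, since the staged version runs out of good fallback candidates sooner.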
Considering the unpleasantness and difficulty of egg harvesting, and the ease of storage, it doesn’t make much sense to try to do things in many stages.
How much regression to the mean is there (ie. to know when marginal returns become too low to be worth it)?
Regression to a (population) mean is not relevant because you are comparing sibling embryos which are distributed around the parental mean by construction.
For many traits, returns are pretty linear. With IQ points, for example, it doesn’t much matter where you are: the value of another point is pretty constant. So you can just ignore that entirely and operate on marginal gains. (eg if the parents are both 130 IQ and you calculate a marginal gain of +1 IQ point from selection & that makes it profitable, maybe their genotypic mean is actually 115 IQ, but it doesn’t matter, because the marginal gain will still be +1 IQ point over the batch mean and will still be profitable.)
For binary traits, you can have diminishing returns based on where the trait value is. As I note in my other comment, if a family is at very high risk for a trait like schizophrenia, then selection can be extremely valuable beyond what the average PGS % would imply; the flip side of this is that then there must be families who have below-average risk and benefit a below-average amount. However, these diminishing returns are automatically incorporated in any index score (which is how one should be selecting). If the polygenic score for SCZ is already very low, then the index component for embryos which move the SCZ score even lower will be very small, because it reduces a tiny absolute risk by a tiny absolute amount, which has a tiny expected value, and the index will prefer embryos which move more important traits.
If you want to model such scenarios, conditioning on a specific example or family history, it’s easy: just switch out the population fraction for the implied fraction of the ‘parental population’ in your liability-threshold code, and everything works as before. But I generally use the average because the average is the average.
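As a rough illustration of swapping the baseline fraction in a liability-threshold calculation, here is a minimal sketch. It is not the code referred to above: the function name `risk_after_shift`, the three baseline fractions, and the 0.2SD liability shift are all hypothetical, and it makes the simplifying assumption that selection moves the mean liability without changing its variance.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability is modeled as standard normal

def risk_after_shift(prevalence, liability_shift_sd):
    """Absolute risk after lowering mean liability by `liability_shift_sd`.

    Disease occurs when liability exceeds the threshold implied by
    `prevalence`. Simplification: the shift moves the mean only and
    leaves the liability variance at 1.
    """
    threshold = STD_NORMAL.inv_cdf(1 - prevalence)
    return 1 - STD_NORMAL.cdf(threshold + liability_shift_sd)

SHIFT_SD = 0.2  # hypothetical reduction in liability from selection

# Illustrative baseline fractions, not estimates for any real family.
baselines = {
    "low-risk family": 0.001,
    "population average": 0.01,
    "high-risk family": 0.10,
}

reductions = {}
for label, prevalence in baselines.items():
    after = risk_after_shift(prevalence, SHIFT_SD)
    reductions[label] = prevalence - after
    print(f"{label}: {prevalence:.2%} -> {after:.2%} "
          f"(absolute reduction {reductions[label]:.2%})")
```

Only the `prevalence` argument changes between families; the same liability shift then buys a larger absolute risk reduction where the baseline is higher, which is the asymmetry an index score would pick up.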
I find the mostly linear relationship between IQ and income to be surprising. If we use the odds of winning a Nobel as a proxy for “ability to make an important scientific discovery”, doesn’t the lack of average-IQ winners imply some kind of exponential relationship between IQ and scientific productivity?
If both the linear income relationship and the one described above are true, it implies an exponentially decreasing ability of high IQ people to capture the value they create.
doesn’t the lack of average-IQ winners imply some kind of exponential relationship between IQ and scientific productivity?
Something like that. A straight line on log odds charts in SMPY, IIRC.
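To spell out why a straight line on a log-odds chart amounts to "some kind of exponential relationship": writing $p$ for the probability of the achievement, with $a$ and $b$ as a hypothetical intercept and slope (not SMPY estimates), the straight line is

```latex
\log\frac{p(\mathrm{IQ})}{1 - p(\mathrm{IQ})} = a + b\,\mathrm{IQ}
\quad\Longrightarrow\quad
\frac{p(\mathrm{IQ})}{1 - p(\mathrm{IQ})} = e^{a}\, e^{b\,\mathrm{IQ}}
```

So each additional point multiplies the odds by the constant factor $e^{b}$, and while $p$ is small the probability itself grows approximately exponentially.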
it implies an exponentially decreasing ability of high IQ people to capture the value they create.
Oh definitely. This is a point Gensowski and others make: part of the reason that the income relationship does bend is that it’s hard to capture all your positive externalities even though the patenting rate etc increases. You can invent the transistor or antibiotics, but you won’t capture anywhere remotely close to 1% of the total surplus. This is also part of the country-level story for why IQ looks so powerful: individuals don’t capture anything remotely approaching their full contributions (in the same way that people on the other end of the spectrum do not internalize anything remotely like the harm they do to everyone else), so the country-level correlations can be much stronger than individual-level ones.
I guess I was using “economies of scale” very loosely; that’s kind of what I had in mind, but thank you for the details and explanations!