It would be better than nothing. I am grinding one of my favorite axes more than I probably should. But those numbers make my case. My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all. And height is probably a very simple property, which may depend mainly on the intensity and duration of expression of a single growth program, minus interference from deficiencies or programs competing for resources.
“My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all.”
With sample sizes of thousands or low tens of thousands you’d get almost nothing. Going from 130k to 250k subjects took it from 0.13 to 0.29 (where the total contribution of all common additive effects is around 0.5).
Most of the top 9500 are false positives (the top 697 are genome-wide significant and contribute most of the variance explained). Larger sample sizes let you overcome noise and correctly weight the alleles with actual effects. The approach looks set to explain everything you can get (and the bulk of heritability for height and IQ) without whole genome sequencing for rare variants just by scaling up another order of magnitude.
It would be better than nothing. I am grinding one of my favorite axes more than I probably should. But those numbers make my case. My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all. And height is probably a very simple property, which may depend mainly on the intensity and duration of expression of a single growth program, minus interference from deficiencies or programs competing for resources.
“My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all.”
With sample sizes of thousands or low tens of thousands you’d get almost nothing. Going from 130k to 250k subjects took it from 0.13 to 0.29 (where the total contribution of all common additive effects is around 0.5).
Most of the top 9500 are false positives (the top 697 are genome-wide significant and contribute most of the variance explained). Larger sample sizes let you overcome noise and correctly weight the alleles with actual effects. The approach looks set to explain everything you can get (and the bulk of heritability for height and IQ) without whole genome sequencing for rare variants just by scaling up another order of magnitude.