One problem is that for that approach, you would need, say, standardized IQ tests and genomes for a large number of people, and then to identify genome properties correlated with high IQ.
First, all biologists everywhere are still obsessed with “one gene” answers. Even when they use big-data tools, they use them to come up with lists of genes, each of which they say has a measurable independent contribution to whatever it is they’re studying. This is looking for your keys under the lamppost. The effect of one gene allele depends on what alleles of other genes are present. But try to find anything in the literature acknowledging that. (Admittedly we have probably evolved for high independence of genes, so that we can reproduce through sex.)
Second, as soon as you start identifying genome properties associated with IQ, you’ll get accused of racism.
You can deal with epistasis using the techniques Hsu discusses and big datasets, and in any case additive variance terms account for most of the heritability even without doing that. There is much more about epistasis (and why it is of secondary importance for characterizing the variation) in the linked preprint.
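A toy illustration of why additive terms can absorb much of the variance even when the underlying biology is interactive. This is a minimal sketch and not the argument from the linked preprint: it assumes a made-up trait equal to the product of two independent genotype counts (pure gene-by-gene interaction, no noise) and asks how much variance a purely additive linear fit recovers. The function name `additive_share` and all parameters are hypothetical. With both allele frequencies at 0.5, the additive share works out to 1/1.25 = 0.8; in this toy model it falls as the alleles get rarer, so the number depends on the frequencies and on how the interaction is coded.

```python
import numpy as np

def additive_share(p1=0.5, p2=0.5, n=200_000, seed=0):
    """Fraction of trait variance captured by an additive fit when the
    trait is purely multiplicative in two loci (illustrative toy model)."""
    rng = np.random.default_rng(seed)
    # Genotypes coded as allele counts 0/1/2, drawn independently per locus.
    g1 = rng.binomial(2, p1, size=n).astype(float)
    g2 = rng.binomial(2, p2, size=n).astype(float)
    # Assumed toy trait: the effect of one locus depends entirely on the other.
    y = g1 * g2
    # Additive model: intercept plus one independent effect per locus.
    X = np.column_stack([np.ones(n), g1, g2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

if __name__ == "__main__":
    print("additive share at p = 0.5:", additive_share(0.5, 0.5))
    print("additive share at p = 0.1:", additive_share(0.1, 0.1))
```

Real traits involve many loci, noise, and linkage, none of which this sketch models; it only shows that "the effects interact" and "an additive model explains most of the variance" are not contradictory.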
First, all biologists everywhere are still obsessed with “one gene” answers. Even when they use big-data tools, they use them to come up with lists of genes, each of which they say has a measurable independent contribution to whatever it is they’re studying. This is looking for your keys under the lamppost. The effect of one gene allele depends on what alleles of other genes are present. But try to find anything in the literature acknowledging that. (Admittedly we have probably evolved for high independence of genes, so that we can reproduce through sex.)
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability.
It would be better than nothing. I am grinding one of my favorite axes more than I probably should. But those numbers make my case. My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all. And height is probably a very simple property, which may depend mainly on the intensity and duration of expression of a single growth program, minus interference from deficiencies or programs competing for resources.
“My intuition says it would be hard to mine a few million SNPs, pick the most strongly associated 9500, and have them account for less than .29 of the variance, even if there were no relationship at all.”
With sample sizes of thousands or low tens of thousands you’d get almost nothing. Going from 130k to 250k subjects took it from 0.13 to 0.29 (where the total contribution of all common additive effects is around 0.5).
Most of the top 9500 are false positives (the top 697 are genome-wide significant and contribute most of the variance explained). Larger sample sizes let you overcome noise and correctly weight the alleles with actual effects. The approach looks set to explain everything you can get (and the bulk of heritability for height and IQ), without whole-genome sequencing for rare variants, just by scaling up another order of magnitude.
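The disagreement above can be made concrete with a small simulation. This is a minimal sketch under assumed, much smaller numbers than the real study (hundreds of subjects, thousands of random SNPs), and the function name `null_selection_r2` is hypothetical. The phenotype is pure noise, so there is no relationship at all. Picking the most strongly associated SNPs and fitting them does "explain" a lot of variance in the same sample used for selection, which matches the intuition quoted above. In an independent sample the same predictor explains essentially nothing, and the quoted abstract says its figures come from testing in independent studies.

```python
import numpy as np

def null_selection_r2(n_train=500, n_test=500, n_snps=5000, top_k=100, seed=0):
    """Select the top_k SNPs most correlated with a pure-noise phenotype,
    fit them by least squares, and return (in-sample R^2, out-of-sample R^2)."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    # Random genotypes (allele counts 0/1/2) and a phenotype unrelated to them.
    G = rng.binomial(2, 0.5, size=(n, n_snps)).astype(float)
    y = rng.standard_normal(n)
    G_tr, G_te = G[:n_train], G[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]

    # Marginal correlation of each SNP with the phenotype, discovery sample only.
    Gc = G_tr - G_tr.mean(axis=0)
    yc = y_tr - y_tr.mean()
    corr = (Gc.T @ yc) / (np.linalg.norm(Gc, axis=0) * np.linalg.norm(yc))
    top = np.argsort(-np.abs(corr))[:top_k]

    # Joint least-squares fit on the selected SNPs.
    X_tr = np.column_stack([np.ones(n_train), G_tr[:, top]])
    X_te = np.column_stack([np.ones(n_test), G_te[:, top]])
    beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

    def r2(X, target):
        # 1 - residual variance / total variance; can be negative out of sample.
        resid = target - X @ beta
        return 1.0 - resid.var() / target.var()

    return r2(X_tr, y_tr), r2(X_te, y_te)

if __name__ == "__main__":
    in_r2, out_r2 = null_selection_r2()
    print("in-sample R^2:", in_r2)
    print("out-of-sample R^2:", out_r2)
```

The sketch says nothing about how many of the real top 9500 SNPs are true signals; it only shows why variance explained has to be measured in a sample that was not used to pick the SNPs.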
One problem is that for that approach, you would need, say, standardized IQ tests and genomes for a large number of people, and then to identify genome properties correlated with high IQ.
That’s just a matter of time till genome sequencing gets cheap enough. There will be a day when it makes sense for China to sequence the DNA of every citizen for health purposes. China also has standardized test scores for its population and no issues with racism that will prevent people from analysing the data.
? I see mentions of things like dominance and interaction all the time; the reason people tend to ignore them in practice seems to be that the techniques which assume additivity and independence work pretty well and explain a lot of the heritability. For example, on height, from the other day: “Defining the role of common variation in the genomic and biological architecture of adult human height”
Seems like an excellent start to me.