The problem could potentially be solved by conducting GWASes that identify the SNPs of things known to correlate with the proxy measure other than intelligence and then subtracting those SNPs, but like you mention later in your reply, the question is which approach is faster and/or cheaper. Unless there is some magic I don’t know about in GSEM, I can’t see a convincing reason why it would have intelligence SNPs buoy to the top of lists ranked by effect size, especially with the sample size we would likely end up working with (<1 million). If you don’t know which SNPs contribute to intelligence versus something else, applying a flat factor to each allele’s effect size would just increase the scale of the differences rather than help distill out intelligence SNPs. Considering that the main limitation of this project is the number of edits they want to make, minimizing the number of allele flips while maximizing the effect on intelligence is one of the major goals here (although I’ve already stated why I think this project is infeasible). Another important thing to consider is that the statistical significance of SNPs’ effects is diluted as the number of independent traits affecting the phenotype increases; if you’re only able to get 500,000 data points for the GWAS that uses SAT as the phenotype, you will most likely have the majority of causal intelligence SNPs falling below the genome-wide significance threshold of p < 5 × 10⁻⁸.
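To put rough numbers on that last point: a quick single-SNP power calculation under the standard two-sided z-test approximation. The per-SNP variance-explained values, the n = 500,000 sample size, and the assumed SAT–intelligence correlation of 0.8 are all illustrative guesses, not estimates:

```python
from math import sqrt
from statistics import NormalDist

def gwas_power(r2, n, alpha=5e-8):
    """Power to detect a SNP explaining a fraction r2 of phenotypic
    variance in a GWAS of n individuals (two-sided z-test approximation)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # ~5.45 for the 5e-8 threshold
    ncp = sqrt(n * r2)                   # expected z-statistic (non-centrality)
    return 1 - nd.cdf(z_crit - ncp) + nd.cdf(-z_crit - ncp)

# If the SAT correlates ~0.8 with intelligence, a crude path model scales
# each intelligence SNP's variance explained in the proxy GWAS by 0.8^2.
for r2 in (2e-4, 1e-4, 5e-5):
    direct = gwas_power(r2, 500_000)
    proxy = gwas_power(r2 * 0.8**2, 500_000)
    print(f"r2 = {r2:.0e}: power {direct:.2f} directly, {proxy:.2f} via the SAT proxy")
```

If most causal SNPs sit at the small end of this range, then under these assumptions the typical one does indeed fail to clear 5 × 10⁻⁸ at n = 500,000 once the proxy dilution is factored in.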
It’s also possible that optimizing people’s brains (or a group of embryos) for acing the SAT, to the point where they have a 100% chance of achieving this, brings us as close to a superintelligent human as we need until the next iteration of superintelligent human.
The tragedy of all of this is that it’s basically a money problem—if some billionaire could just unilaterally fund genome sequencing and IQ testing en masse and not get blocked by some government or other bureaucratic entity, all of this crap about building an accurate predictor would disappear and we’d only ever need to do this once.
The problem could potentially be solved by conducting GWASes that identify the SNPs of things known to correlate with the proxy measure other than intelligence and then subtracting those SNPs
More or less. If you have an impure measurement like ‘years of education’, which lumps in half intelligence and half other stuff (and you know this, even if you never have within-individual measurements of IQ, EDU, and the other stuff, because you can get precise genetic correlations from much smaller sample sizes by comparing PGSes & alternative methods like GCTA or cross-twin correlations), then you can correct the respective estimates of both intelligence and other-stuff, and you can pool with other GWASes on other traits/cohorts to estimate all of these simultaneously. This gets you an estimate of each latent trait’s effect size per allele, and you just rank and select.
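The correction step can be sketched in a few lines: treat each measured GWAS as loading on latent factors and solve per SNP for the latent effects. This is a toy stand-in for what Genomic SEM actually does (it fits the loadings to the genetic covariance matrix and propagates uncertainty); the loadings and per-SNP effect estimates below are fabricated:

```python
import numpy as np

# Rows: measured GWAS phenotypes (e.g. years-of-education, income);
# columns: latent factors (intelligence, 'other stuff').
# Loadings would really come from estimated genetic correlations.
loadings = np.array([[0.7, 0.7],    # EDU: ~half intelligence, half other
                     [0.3, 0.9]])   # income: mostly other stuff

# Marginal per-allele effect estimates for 3 SNPs in the two proxy GWASes.
beta_observed = np.array([[0.030, 0.020],
                          [0.018, 0.024],   # decent EDU hit, but via 'other'
                          [0.015, 0.001]])

# Solve loadings @ beta_latent = beta_observed per SNP.
beta_latent, *_ = np.linalg.lstsq(loadings, beta_observed.T, rcond=None)
beta_iq, beta_other = beta_latent

# Rank candidate edits by estimated effect on the intelligence factor alone.
order = np.argsort(-np.abs(beta_iq))
print("SNPs ranked by latent intelligence effect:", order.tolist())
```

Note that the second SNP, despite a respectable raw EDU effect, drops to the bottom once its effect is attributed to the non-intelligence factor, which is exactly the ‘rank and select’ point.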
you will most likely have the majority of causal intelligence SNPs falling below the genome-wide significance threshold of p < 5 × 10⁻⁸.
A statistical-significance threshold is irrelevant NHST mumbo-jumbo. What you care about is posterior probability of the causal variant’s effect being above the cost-safety threshold, whatever that may be, but which will have nothing at all to do with ‘genome-wide statistical significance’.
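A minimal sketch of that criterion, assuming a normal prior on true effect sizes (the prior scale tau would in practice come from the trait’s SNP-heritability; all numbers here are hypothetical):

```python
from math import sqrt
from statistics import NormalDist

def prob_effect_exceeds(beta_hat, se, tau, threshold):
    """Posterior P(true effect > threshold), with prior beta ~ N(0, tau^2)
    and sampling model beta_hat ~ N(beta, se^2)."""
    shrink = tau**2 / (tau**2 + se**2)   # empirical-Bayes shrinkage factor
    post_mean = shrink * beta_hat
    post_sd = sqrt(shrink) * se          # sqrt of the posterior variance
    return 1 - NormalDist(post_mean, post_sd).cdf(threshold)

# A SNP at z = 4.4 (p ~ 1e-5, well short of 'genome-wide significance')
# can still be a near-certain bet to exceed an edit-worthiness threshold:
print(prob_effect_exceeds(beta_hat=0.044, se=0.01, tau=0.03, threshold=0.02))  # ≈ 0.98
```

The ranking this produces need not agree with a p-value ranking at all, since the threshold is on effect size, not on evidence against a null of exactly zero.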
A statistical-significance threshold is irrelevant NHST mumbo-jumbo. What you care about is posterior probability of the causal variant’s effect being above the cost-safety threshold, whatever that may be, but which will have nothing at all to do with ‘genome-wide statistical significance’.
I’m aware of this, but if you’re just indiscriminately shoveling heaps of edits into someone’s genome based on a GWAS with too low a sample size to reveal causal SNPs for the desired trait, you’ll be editing a whole bunch of what are actually tags, a whole bunch of things that are related to independent traits other than intelligence, and a whole bunch of random irrelevant alleles that made it into your selection by random chance. This is a sure-fire way to make a therapy that has no chance of working, and if an indiscriminate shotgun approach like this is used in experiments, the combinatorics of the matter dictate that there are more possible sure-to-fail multiplex genome editing therapies than there are humans on Earth, let alone humans willing to be guinea pigs for an experiment like this. Having a statistical significance threshold at least imposes a bar for SNPs to pass, making the therapy less of an assured suicide mission.

EDIT: misinterpreted what other party was saying.
if you’re just indiscriminately shoveling heaps of edits into someone’s genome based on a GWAS with too low a sample size to reveal causal SNPs for the desired trait, you’ll be editing a whole bunch of what are actually tags,
What I said was “What you care about is posterior probability of the causal variant’s effect being above the cost-safety threshold”. If you are ‘indiscriminately shoveling’, then you apparently did it wrong.
a whole bunch of things that are related to independent traits other than intelligence,
Pretty much all SNPs are related to something or other. The question is what is the average effect. Given the known genetic correlations, if you pick the highest posterior probability ones for intelligence, then the average effect will be good.
(And in any case, one should be aiming for maximizing the gain across all traits as an index score.)
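A minimal sketch of that index-score selection under a fixed edit budget (the traits, per-allele effects, and weights are all invented for illustration):

```python
import numpy as np

# Hypothetical per-allele latent effects of 6 candidate edits on three
# traits, in standardized units.
effects = np.array([
    # IQ,  disease-risk, height
    [0.030, -0.010, 0.00],
    [0.025,  0.030, 0.01],   # helps IQ but raises disease risk
    [0.020, -0.005, 0.00],
    [0.015,  0.000, 0.02],
    [0.012, -0.020, 0.00],
    [0.005,  0.015, 0.00],
])

# How much a unit change in each trait is valued (disease risk negatively).
weights = np.array([1.0, -2.0, 0.1])

index_score = effects @ weights
edit_budget = 3                      # the binding constraint in this project
chosen = np.argsort(-index_score)[:edit_budget]
print("edits chosen:", sorted(chosen.tolist()))
```

Under these made-up weights the second edit is rejected despite its IQ effect, because its disease-risk cost dominates the index; that is the sense in which maximizing the index differs from maximizing intelligence alone.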
and a whole bunch of random irrelevant alleles that made it into your selection by random chance.
If they’re irrelevant, then there’s no problem.
This is a sure-fire way to make a therapy that has no chance of working,
No it’s not. If you’re using common SNPs which already exist, why would it ‘have no chance of working’? If some random SNP had some devastating effect on intelligence, then it would not be ranked high.