If you unintentionally insert a few base pairs into the promoter region of some gene, will the promoter just work a little less well, or will it break altogether?
Molecular Biologist here. Promoters (and any non-coding regulatory sequence for that matter) are extremely sensitive to point mutations. Since their sequence determines how well RNA polymerase binds to them, any change in the sequence of the binding motifs, or even in the distance between these motifs, has a major (generally negative) impact on transcription initiation efficiency. https://www.nature.com/articles/s41580-018-0028-8
In fact, there is a whole field of research based on randomizing certain parts of a promoter to create libraries of promoters with different properties/strengths.
A well-known library of bacterial promoters: https://parts.igem.org/Promoters/Catalog/Anderson
More info on promoter libraries: https://sci-hub.se/https://pubs.acs.org/doi/full/10.1021/acssynbio.8b00115
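To make the sensitivity point concrete, here is a minimal sketch of how motif binding is usually scored with a position weight matrix (PWM). The matrix values below are invented toy numbers for a TATAAT-like -10 box, not real promoter data; they are only meant to show why a single base change moves the score.

```python
# Minimal sketch: scoring a promoter motif with a position weight matrix (PWM).
# The matrix below is a made-up toy for a 6-bp "TATAAT"-like -10 box, not real data.

import math

# Toy position frequency matrix: keys = base, lists = motif positions.
pfm = {
    "A": [0.05, 0.85, 0.10, 0.80, 0.75, 0.05],
    "C": [0.05, 0.05, 0.05, 0.05, 0.10, 0.05],
    "G": [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    "T": [0.85, 0.05, 0.80, 0.10, 0.10, 0.85],
}
background = 0.25  # uniform background frequency for each base

def pwm_score(seq):
    """Sum of log-odds scores of a sequence against the toy PWM."""
    return sum(math.log2(pfm[base][i] / background) for i, base in enumerate(seq))

consensus = "TATAAT"
mutant = "TACAAT"  # single point mutation at the third position

print(pwm_score(consensus))  # high score: strong predicted binding
print(pwm_score(mutant))     # noticeably lower score from one base change
```

Because each position contributes its own log-odds term, a single substitution at a high-information position drops the total score substantially, and an insertion that shifts the motif misaligns every downstream position at once.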
Promoters (and any non-coding regulatory sequence for that matter) are extremely sensitive to point mutations.
A really important question here is whether the causal SNPs that affect polygenic traits tend to be located in these highly sensitive sequences. One hypothesis would be that regulatory sequences which are generally highly sensitive to mutations permit the occasional variant with a small effect, and these variants are a predominant influence on polygenic traits. This would be bad news for us, since even the best available editors have non-negligible indel rates at target sites.
Another question: there tend to be many enhancers per gene. Is losing one enhancer generally catastrophic for the expression of that gene?
Thanks for the comment. This is actually quite helpful, as the effects of off-target edits or indels in promoter and enhancer regions are one of the primary uncertainties we have regarding the feasibility of the proposal.
My prior for thinking that a few off-target edits or indels wouldn’t necessarily be catastrophic was a paper I read that looked at the total accumulation of random mutations in neurons over the lifespan. I believe that by age 40 the average neuron has accumulated about 1,500 mutations.
Regulatory regions make up about 2% of the genome, so the average neuron has about 30 mutations in regulatory regions by age 40. So if the mutations we introduce don’t increase that number very much, it will probably be OK.
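Just to make the arithmetic explicit (both inputs are the rough figures above, not measured values):

```python
# Back-of-envelope check of the numbers above.
somatic_mutations_per_neuron_by_40 = 1500   # rough estimate from the cited paper
regulatory_fraction_of_genome = 0.02        # ~2% of the genome is regulatory

expected_regulatory_hits = somatic_mutations_per_neuron_by_40 * regulatory_fraction_of_genome
print(expected_regulatory_hits)  # ~30 mutations in regulatory regions per neuron
```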
Now it’s possible that the types of errors introduced by random mutations are of a different kind than those introduced by indels and off-target edits from base and prime editors. A quick Google search suggests that most de novo mutations are single base pair changes rather than insertions or deletions. So perhaps indels WILL be an issue, at least for some editor variants.
I think the ideal approach to answer this question would be to use (or make) a computational model to predict the distribution of off-target edits and indels from editor variants, and another to predict binding affinity as a function of sequence, and then see how strongly such errors affect binding affinity. We could then compare those results to the effects on binding affinity from naturally occurring de novo mutations to see whether they are comparable in magnitude (a toy sketch of what I have in mind is below).
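As a very rough sketch of that comparison, assuming binding affinity can be approximated by a toy PWM score over a single 6-bp motif (all values invented), one can simulate single-base substitutions versus 1-bp deletions inside the motif and compare the resulting score-change distributions:

```python
# Rough sketch under heavy simplifying assumptions: affinity ~ toy PWM score over
# one 6-bp motif; point mutations = single-base substitutions inside the motif;
# indels = 1-bp deletions that shift the downstream sequence. No real data here;
# this only shows the shape of the analysis.

import math
import random

random.seed(0)
BASES = "ACGT"
MOTIF_LEN = 6
BG = 0.25

# Toy PWM for a 6-bp motif (keys = base, lists = positions); invented values.
pfm = {
    "A": [0.05, 0.85, 0.10, 0.80, 0.75, 0.05],
    "C": [0.05, 0.05, 0.05, 0.05, 0.10, 0.05],
    "G": [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    "T": [0.85, 0.05, 0.80, 0.10, 0.10, 0.85],
}

def score_window(seq, start):
    """Log-odds PWM score of the 6-bp window starting at `start`."""
    return sum(math.log2(pfm[seq[start + i]][i] / BG) for i in range(MOTIF_LEN))

def substitute(seq, pos):
    """Random single-base substitution at `pos`."""
    new_base = random.choice([b for b in BASES if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]

def delete(seq, pos):
    """1-bp deletion at `pos`; downstream bases shift left, pad with a random base."""
    return seq[:pos] + seq[pos + 1:] + random.choice(BASES)

# Reference sequence with the consensus motif embedded at a known position.
motif_start = 20
reference = ("".join(random.choice(BASES) for _ in range(20)) + "TATAAT"
             + "".join(random.choice(BASES) for _ in range(20)))
ref_score = score_window(reference, motif_start)

def score_changes(mutator, n=1000):
    """Distribution of affinity-score changes when `mutator` hits a random motif base."""
    changes = []
    for _ in range(n):
        pos = motif_start + random.randrange(MOTIF_LEN)
        mutated = mutator(reference, pos)
        changes.append(score_window(mutated, motif_start) - ref_score)
    return changes

snv_changes = score_changes(substitute)
indel_changes = score_changes(delete)

print("mean score change, substitutions:", sum(snv_changes) / len(snv_changes))
print("mean score change, 1-bp deletions:", sum(indel_changes) / len(indel_changes))
```

A real version would need an editor-specific error model and a learned sequence-to-expression model (e.g. trained on reporter-assay data) rather than a hand-written PWM, but the comparison would have the same structure.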
Perhaps others have already made such models. A quick search didn’t turn up anything, but I will continue looking.
The other option is to just test it empirically in cell cultures and then animal models.
If you have any other advice about how to approach this problem, I’d appreciate it.
Regulatory regions make up about 2% of the genome, so the average neuron has about 30 mutations in regulatory regions by age 40. So if the mutations we introduce don’t increase that number very much, it will probably be OK.
I agree that in the grand scheme of things it would probably not make much of a difference. Also, your 2% estimate is generous if you consider that in any differentiated human cell most genes are inactive; mutations in those genes would thus be harmless.