Stable population of asexual haploid bacteria considering only lethal mutations:
Let “G” be the genome string length in base pairs.
Let “M” be the mutations per base pair per division.
Let “numberOfDivisions” be the average number of divisions a bacterium undergoes before dying.
Let “survivalFraction” be the probability that division produces another viable bacterium.
survivalFraction = (1 - M)**G. (Assuming mutation events are independent.)
1 = numberOfDivisions x survivalFraction. (Assuming population size is stable.)
Then ln(1/numberOfDivisions) = G ln(1 - M).
G = -ln(numberOfDivisions) / ln(1 - M).
Using Taylor series for small M gives
ln(1 - M) = -M + higher order terms of M.
So G ≈ ln(numberOfDivisions) / M, i.e., G = O(1 / M).
This does not match the simulation observation that G = O(1 / M**2).
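As a quick numerical check of the derivation above, here is a minimal Python sketch. The function names and the sample values (numberOfDivisions = 2 and the listed mutation rates) are my own illustrative assumptions, not values from any simulation. It compares the exact expression for G with the small-M approximation and shows how G scales as M changes.

```python
import math

def genome_length_exact(number_of_divisions: float, m: float) -> float:
    """Solve 1 = numberOfDivisions * (1 - M)**G for G."""
    # log1p(-m) computes ln(1 - m) accurately for very small m.
    return -math.log(number_of_divisions) / math.log1p(-m)

def genome_length_approx(number_of_divisions: float, m: float) -> float:
    """Small-M approximation: G ~ ln(numberOfDivisions) / M."""
    return math.log(number_of_divisions) / m

if __name__ == "__main__":
    number_of_divisions = 2.0  # assumed value, for illustration only
    for m in (1e-4, 1e-6, 1e-8):  # assumed mutation rates per base pair
        exact = genome_length_exact(number_of_divisions, m)
        approx = genome_length_approx(number_of_divisions, m)
        print(f"M={m:g}  exact G={exact:.6g}  approx G={approx:.6g}")
```

Under these assumptions, lowering M by a factor of 100 raises G by roughly a factor of 100, not 10,000, which is the 1 / M scaling of the lethal-only model.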
Summarizing my thoughts:
1) For lethal mutations the rule, “one mutation, one death”, holds.
In life few mutations will be lethal. Even fewer in a sexual species with genetic redundancy. So the information content limits calculated by assuming only lethal mutations will not apply to the human genome.
2) Selection may not directly affect population size.
E.g., in sexual selection winners and losers are balanced so the total number of offspring is relatively constant. So minor harmful mutations may be removed with high efficiency without affecting total population size.
3) High selection pressure may drive the species gene pool high up a local fitness peak. However, being “too optimized” might hurt species survival by lowering variance and making the species more vulnerable to environmental variation, e.g., new pathogens. Or it may decrease the probability of a two-mutation adaptation that might have improved competitiveness against a different species. (Humans may eventually out-compete fruit flies.)
4) Working with selection, crossover and assortative mating remove the most harmful mutations quickly at a high “death” cost (worst case is one death per mutation removal) and remove less harmful mutations slowly at a low “death” cost. The “mutation harmfulness” vs. “mutation frequency” graph likely follows a power law. It should be possible to derive a “mutation removal efficiency” relationship for each “mutation fitness cost”. Such functions are likely different for each species and population structure.
5) Selection operates on traits. Traits usually depend on complex network interactions of genetic elements. Most genetic elements simultaneously affect many traits. Therefore most trait values will follow an inverted bathtub curve, i.e., low and high values are bad and the mid-range is good. (Body homeostasis requires stable temperature, pH, oxygen level, nutrient level, etc.) Evolution has favored robust systems with regulatory feedback to adjust for optimal trait values in the face of genetic, stochastic, and environmental variation.
(The “inverted bathtub curve” is essentially a one-state system. Multi-state regulatory systems are also common in biology and can be used to differentiate cells.)
6) Total genome information content is limited by the mutation rate and the number of bit errors that are removed by selection. (In the Shannon sense of a message being a string of symbols from a finite set and transmission between generations being a noisy communication channel.) I believe this numerical limit is highly dependent on species reproductive biology and population dynamics.
Increases in genome information content are not directly related to “evolutionary progress”. In evolution the genome “meaning” is more important than the genome “message”. Over evolutionary time the average “meaning” value of each bit may be increasing. Evolution of complex genetic regulatory systems increased the average “meaning” value per bit. Evolution of complex brains capable of “culture” increased the average “meaning” value per bit. (The information bits that give humans the ability to read are more valuable than the information bits in a book.)
7) The total information in a species genome can be far greater than the information contained in any individual genome. This is true for sexual bacteria colonies that exchange plasmids. It is also true for animal species, e.g., variation in immune system DNA that protects the species from pathogens. Variation is the fuel that selection burns for adaptation.