Natural Selection’s Speed Limit and Complexity Bound
Followup to: An Alien God, The Wonder of Evolution, Evolutions Are Stupid
Yesterday, I wrote:
Humans can do things that evolutions probably can’t do period over the expected lifetime of the universe. As the eminent biologist Cynthia Kenyon once put it at a dinner I had the honor of attending, “One grad student can do things in an hour that evolution could not do in a billion years.” According to biologists’ best current knowledge, evolutions have invented a fully rotating wheel on a grand total of three occasions.
But then, natural selection has not been running for a mere million years. It’s been running for 3.85 billion years. That’s enough to do something natural selection “could not do in a billion years” three times. Surely the cumulative power of natural selection is beyond human intelligence?
Not necessarily. There’s a limit on how much complexity an evolution can support against the degenerative pressure of copying errors.
(Warning: A simulation I wrote to verify the following arguments did not return the expected results. See addendum and comments.)
(Addendum 2: This discussion has now been summarized in the Less Wrong Wiki. I recommend reading that instead.)
The vast majority of mutations are either neutral or detrimental; here we are focusing on detrimental mutations. At equilibrium, the rate at which a detrimental mutation is introduced by copying errors, will equal the rate at which it is eliminated by selection.
A copying error introduces a single instantiation of the mutated gene. A death eliminates a single instantiation of the mutated gene. (We’ll ignore the possibility that it’s a homozygote, etc; a failure to mate also works, etc.) If the mutation is severely detrimental, it will be eliminated very quickly—the embryo might just fail to develop. But if the mutation only leads to a 0.01% probability of dying, it might spread to 10,000 people before one of them died. On average, one detrimental mutation leads to one death; the weaker the selection pressure against it, the more likely it is to spread. Again, at equilibrium, copying errors will introduce mutations at the same rate that selection eliminates them. One mutation, one death.
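The expected-spread arithmetic in the paragraph above can be checked in a line of Python (the function name is mine, not from the post):

```python
def expected_carriers(s):
    """Stylized 'one mutation, one death' accounting: a mutation giving
    its carrier an extra probability s of dying (or failing to mate)
    is expected to spread to roughly 1/s individuals before the single
    death that balances its introduction."""
    return 1 / s

# A 0.01% chance of dying -> the mutation may reach ~10,000 carriers.
expected_carriers(0.0001)
```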
This means that you need the same amount of selection pressure to keep a gene intact, whether it’s a relatively important gene or a relatively unimportant one. The more genes are around, the more selection pressure required. Under too much selection pressure—too many children eliminated in each generation—a species will die out.
We can quantify selection pressure as follows: Suppose that 2 parents give birth to an average of 16 children. On average all but 2 children must either die or fail to reproduce. Otherwise the species population very quickly goes to zero or infinity. From 16 possibilities, all but 2 are eliminated—we can call this 3 bits of selection pressure. Not bits like bytes on a hard drive, but mathematician’s bits, information-theoretical bits; one bit is the ability to eliminate half the possibilities. This is the speed limit on evolution.
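As a sanity check, the 3-bit figure above is just a base-2 logarithm:

```python
import math

# 2 parents have 16 children; all but 2 must die or fail to reproduce.
children_per_pair = 16
survivors = 2
# Eliminating all but 2 of 16 possibilities = log2(16/2) = 3 bits.
selection_bits = math.log2(children_per_pair / survivors)
```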
Among mammals, it’s safe to say that the selection pressure per generation is on the rough order of 1 bit. Yes, many mammals give birth to more than 4 children, but neither does selection perfectly eliminate all but the most fit organisms. The speed limit on evolution is an upper bound, not an average.
This 1 bit per generation has to be divided up among all the genetic variants being selected on, for the whole population. It’s not 1 bit per organism per generation, it’s 1 bit per gene pool per generation. Suppose there’s some amazingly beneficial mutation making the rounds, so that organisms with the mutation have 50% more offspring. And suppose there’s another less beneficial mutation, that only contributes 1% to fitness. Very often, an organism that lacks the 1% mutation, but has the 50% mutation, will outreproduce another who has the 1% mutation but not the 50% mutation.
There are limiting forces on variance; going from 10 to 20 children is harder than going from 1 to 2 children. There’s only so much selection to go around, and beneficial mutations compete to be promoted by it (metaphorically speaking). There’s an upper bound, a speed limit to evolution: If Nature kills off a grand total of half the children, then the gene pool of the next generation can acquire a grand total of 1 bit of information.
I am informed that this speed limit holds even with semi-isolated breeding subpopulations, sexual reproduction, chromosomal linkages, and other complications.
Let’s repeat that. It’s worth repeating. A mammalian gene pool can acquire at most 1 bit of information per generation.
Among mammals, the rate of DNA copying errors is roughly 10^-8 per base per generation. Copy a hundred million DNA bases, and on average, one will copy incorrectly. One mutation, one death; each non-junk base of DNA soaks up the same amount of selection pressure to counter the degenerative pressure of copying errors. It’s a truism among biologists that most selection pressure goes toward maintaining existing genetic information, rather than promoting new mutations.
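Under the post's rough accounting, the supportable genome size is simply the ratio of the selection budget to the per-base error rate (variable names are mine):

```python
mutation_rate = 1e-8    # copying errors per base per generation (mammals)
selection_budget = 1.0  # ~1 bit / ~1 counter-selected death per generation
# One mutation, one death: each meaningful base must soak up selection
# equal to its own copying-error rate, so the budget supports at most:
max_meaningful_bases = selection_budget / mutation_rate  # ~10^8 bases
```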
Natural selection probably hit its complexity bound no more than a hundred million generations after multicellular organisms got started. Since then, over the last 600 million years, evolutions have substituted new complexity for lost complexity, rather than accumulating adaptations. Anyone who doubts this should read George Williams’s classic “Adaptation and Natural Selection”, which treats the point at much greater length.
In material terms, a Homo sapiens genome contains roughly 3 billion bases. We can see, however, that mammalian selection pressures aren’t going to support 3 billion bases of useful information. This was realized on purely mathematical grounds before “junk DNA” was discovered, before the Genome Project announced that humans probably had only 20-25,000 protein-coding genes. Yes, there’s genetic information that doesn’t code for proteins—all sorts of regulatory regions and such. But it is an excellent bet that nearly all the DNA which appears to be junk, really is junk. Because, roughly speaking, an evolution isn’t going to support more than 10^8 meaningful bases with 1 bit of selection pressure and a 10^-8 error rate.
Each base is 2 bits. A byte is 8 bits. So the meaningful DNA specifying a human must fit into at most 25 megabytes.
(Pause.)
Yes. Really.
And the Human Genome Project gave the final confirmation. 25,000 genes plus regulatory regions will fit in 100,000,000 bases with lots of room to spare.
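The 25-megabyte figure is a straightforward unit conversion from the ~10^8-base bound:

```python
meaningful_bases = 10**8  # upper bound argued above
bits_per_base = 2         # 4 possible bases = log2(4) = 2 bits each
total_bits = meaningful_bases * bits_per_base
megabytes = total_bits / 8 / 10**6  # 8 bits per byte, 10^6 bytes per MB
# -> 25.0 megabytes
```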
Addendum: genetics.py, a simple Python program that simulates mutation and selection in a sexually reproducing population, is failing to match the result described above. Sexual recombination is random, each pair of parents has 4 children, and the top half of the population is selected each time. Wei Dai rewrote the program in C++ and reports that the supportable amount of genetic information increases as the inverse square of the mutation rate(?!), which, if generally true, would make it possible for the entire human genome to be meaningful.
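For readers who want to poke at the anomaly themselves, here is a minimal sketch of the kind of simulation described (my reconstruction, not the original genetics.py): random recombination, 4 children per pair, top half selected each generation, fitness counted as intact bases.

```python
import random

def evolve(pop_size=100, genome_len=200, mu=0.01, generations=50, seed=0):
    """Sexual population under mutation and truncation selection.

    Fitness = number of intact (1) bases; each base flips with
    probability mu per generation; recombination is free (each base
    drawn from either parent with probability 1/2).
    Returns the mean fraction of intact bases after the run.
    """
    rng = random.Random(seed)
    pop = [[1] * genome_len for _ in range(pop_size)]  # start fully adapted
    for _ in range(generations):
        rng.shuffle(pop)
        children = []
        for i in range(0, pop_size, 2):
            mom, dad = pop[i], pop[i + 1]
            for _ in range(4):  # each pair of parents has 4 children
                child = [mom[j] if rng.random() < 0.5 else dad[j]
                         for j in range(genome_len)]
                child = [b ^ 1 if rng.random() < mu else b for b in child]
                children.append(child)
        children.sort(key=sum, reverse=True)  # truncation selection:
        pop = children[:pop_size]             # keep the fittest half
    return sum(map(sum, pop)) / (pop_size * genome_len)
```

If the argument above were right, pushing genome_len well past selection_budget/mu should make fitness decay toward the supportable level; Wei Dai's inverse-square report suggests the sexual case behaves differently.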
In the above post, George Williams’s arguments date back to 1966, and the result that the human genome contains <25,000 protein-coding regions comes from the Genome Project. The argument that 2 parents having 16 children with 2 surviving implies a speed limit of 3 bits per generation was found here, and I understand that it dates back to Kimura’s work in the 1950s. However, the attempt to calculate a specific bound of 25 megabytes was my own.
It’s possible that the simulation contains a bug, or that I used unrealistic assumptions. If the entire human genome of 3 billion DNA bases could be meaningful, it’s not clear why it would contain <25,000 genes. Empirically, an average of O(1) bits of genetic information per generation seems to square well with observed evolutionary times; we don’t actually see species gaining thousands of bits per generation. There is also no reason to believe that a dog has greater morphological or biochemical complexity than a dinosaur. In short, only the math I tried to calculate myself should be regarded as having failed, not the beliefs that have wider currency in evolutionary biology. But until I understand what’s going on, I would suggest citing only George Williams’s arguments and the Genome Project result, not the specific mathematical calculation shown above.
A lot of our DNA was acquired in the days when our ancestors were not yet mammals.
“Surely the cumulative power of natural selection is beyond human intelligence?”
Even if it was, why would you want to use it? Evolution has thoroughly screwed over more human beings than every brutal dictator who ever lived, and that’s just humans, never mind the several billion extinct species which litter our planet’s history.
Nuclear technology has been used to kill hundreds of thousands of people, it’s still a useful form of energy.
So the meaningful DNA specifying a human must fit into at most 25 megabytes.
And that’s before compression :-)
“So the meaningful DNA specifying a human must fit into at most 25 megabytes.”
These are bits of entropy, not bits on a hard drive. It’s mathematically impossible to compress bits of entropy.
Eliezer, your argument seems to confuse two different senses of information. You first define “bit” as “the ability to eliminate half the possibilities”—in which case, yes, if every organism has O(1) children then the logical “speed limit on evolution” is O(1) bits per generation.
But you then conclude that “the meaningful DNA specifying a human must fit into at most 25 megabytes”—and more concretely, that “it is an excellent bet that nearly all the DNA which appears to be junk, really is junk.” I don’t think that follows at all.
The underlying question here seems to be this: suppose you’re writing a software application, and as you proceed, many bits of code are generated at random, many bits are logically determined by previous bits (albeit in a more-or-less “mindless” way), and at most K times you have the chance to fix a bit as you wish. (Bits can also be deleted as you go.) Should we then say that whatever application you end up with can have at most K bits of “meaningful information”?
Arguably from some God’s-eye view. But any mortal examining the code could see far more than K of the bits fulfilling a “functional role”—indeed, possibly even all of them. The reason is that the web of logical dependencies, by which the K “chosen” bits interacted with the random bits to produce the code we see, could in general be too complicated ever to work out within the lifetime of the universe. And crucially, when biologists talk about how many base pairs are “coding” and how many are “non-coding”, it’s clearly the pragmatic sense of “meaningful information” they have in mind rather than the Platonic one.
Indeed, it’s not even clear that God could produce a ~K-bit string from which the final application could be reliably reconstructed. The reason is that the application also depends on random bits, of which there are many more than K. Without assuming some conjecture about pseudorandom number generators, it seems the most God could do would be to give us a function mapping the random bits to K bits, such that by applying that function we’d end up most of the time with an application that did more-or-less the same thing. (This actually leads to some interesting CS questions, but I’ll spare you for now! :) )
To say something more concrete, without knowing much more than I do about biology, I wouldn’t venture a guess as to how much of the “junk DNA” is really junk. The analogy I prefer is the following: if I printed out the MS Word executable file, almost all of it would look like garbage to me, with only a few “coding regions” here and there (“It looks like you’re writing a letter. Would you like help?”). But while the remaining bits might indeed be garbage in some sense, they’re clearly not in the sense a biologist would mean.
Excluding the complex and subtle regulatory functions that non-coding DNA can possess strikes me as being extremely unwise.
There is no DNA in the maize genome that codes for striped kernels, because that color pattern is the result of transposons modulating gene expression. The behavior of one transposon is intricately linked to the total behavior of all transposons, and the genetic shifts they result in defy the simple mathematical rules of Mendelian inheritance. But more importantly, the behavior of transposons is deeply linked to the physical structure of the encoding regions they’re associated with.
Roughly half the genome of corn is made up of transposons. Is this ‘junk’ or not?
Aaronson, McCabe:
Actually, these mathematician’s bits are very close to bits on a hard drive. Genomes, so far as I know, have no ability to determine what the next base ought logically to be; there is no logical processing in a ribosome. Selection pressure has to support each physical DNA base against the degenerative pressure of copying errors. Unless changing the DNA base has no effect on the organism’s fitness (a neutral mutation), the “one mutation, one death” rule comes into play.
Now certainly, once the brain is constructed and patterned, there are billions of neurons, all of them playing a functional role, and once these neurons are exposed to the environment, the algorithmic complexity will begin to actually increase. But the core learning algorithms must still in principle be specifiable in 25 megabytes. There may not be junk neurons, but there is surely junk DNA.
Now, even junk DNA may help, in a certain sense, because the metabolic load of DNA is tiny, and the more junk DNA you have, the more crossover you can do with a smaller probability of swapping in the middle of a coding gene. This “function” of junk DNA does not depend on its information content, so it doesn’t have to be supported against the degenerative pressure of a per-base probability of copying error.
To sum up: The mathematician’s bits here are very close to bits on a hard drive, because every DNA base that matters has to be supported by “one mutation, one death” to overcome per-base copying errors.
However, mutation rates vary and can be selected. They aren’t simply a constraint.
Also, it’s been a long time since I’ve thought about this and I may be wrong, but aren’t you talking about 1 bit per linkage group and not one bit per genome? (And the size of linkage groups also varies and can be selected.)
Some virus genomes face severe constraints on size—they have a container they must fit into, say an icosahedral shape—and it would be a big step to increase that size. And some of those make proteins off both strands of DNA, and sometimes in more than one reading frame: 3 proteins from the same DNA sequence. Presumably each protein is less efficient than it might be if the DNA evolved to make it alone, but they do an adequate job of reproducing the virus.
Probably size constraints can usually be fudged better than that.
You can make mathematical theories about evolution, but they’re highly sensitive to their beginning assumptions. It’s too soon to say how far evolution has gone to produce genetic mechanisms that let evolution proceed more efficiently.
Eliezer, so long as an organism’s fitness depends on interactions between many different base pairs, the effect can be as if some of the base pairs are logically determined by others.
Also, unless I’m mistaken there are some logical operations that the genome can perform: copying, transpositions, reversals...
To illustrate, suppose (as apparently happens) a particular DNA stretch occurs over and over with variations: sometimes forwards and sometimes backwards, sometimes with 10% of the base pairs changed, sometimes chopped in half and sometimes appended to another stretch, etc. Should we then count all but one of these occurrences as “junk”? Of course, we could measure the number of bits using our knowledge of how the sequence actually arose (“first stretch X was copied Y times, then the copies were modified as follows...”). But the more such knowledge we bring in, the further we get from the biologist’s concept of information and the closer to the Platonic mathematical concept.
Tom:These are bits of entropy...mathematically impossible to compress
My bad, was thinking of the meaningful base pairs. Thanks for correcting me.
I interpret Eliezer to be saying that the Kolmogorov complexity of the human genome is roughly 25MB—the absolute smallest computer program that could output a viable human genome would be about that size. But this minimal program would use a ridiculous number of esoteric tricks to make itself that small. You’d have to multiply that number by a large factor (representing how compressible, in principle, modern applications are) to make a comparison to hard drive bits as they are actually used.
Eek I just noticed an unfortunate way that last comment could be read. I meant I was thinking of material bits of information when I should have thought of information-theoretical bits. I in no way interpret your “bits of entropy” to mean physical, non-meaningful base pairs!
OK, I came up with a concrete problem whose solution would (I think) tell us something about whether Eliezer’s argument can actually work, assuming a certain stylized set of primitive operations available to DNA: insertion, deletion, copying, and reversal. See here if you’re interested.
Eliezer, I see two potential flaws in your argument, let me try and explain:
1.) The copy error rate can’t directly translate, mathematically, into how often individuals in a species die out. We simply can’t know how often a mutation is neutral, good, or detrimental, in part because that depends on the specific genome involved. I imagine some genomes are simply more robust than others. But I believe the prevailing wisdom is that most mutations are neutral, simply because proteins are too physically big to be affected by small changes. Either way, I can’t see how anyone knows enough about this to be confident in coming up with specific mathematically calculated numbers.
2.) One bad mutation does NOT equal one death, as far as I see it. Greater intelligence leads to greater capability to cope with detrimental circumstances. Sickle-cell anemia is detrimental, but people live and reproduce with it, and have for generations. But it’s almost entirely detrimental, especially if your risk of malaria is low. It’s true, organisms with non-detrimental versions of the genes will gradually take over, but that doesn’t mean the detrimental versions can’t survive on their own, just with a lower population cap.
And not referring to you in saying this, Eliezer, but this whole “Most of the DNA is junk” mantra reeks of conventionalist thinking, a classic form of bias, and has always annoyed me when I saw it in science programs and news articles. Current science knows more about proteins than about any other aspect of the function of DNA, so it follows that people will focus on this and gloss over the importance of the other functions of DNA. If you know something very concrete about DNA—proteins, which are amazing enough in themselves—it’s very easy to justify the case that the rest is simply junk DNA. I doubt that; I think we just don’t know what it does yet on a mechanical level.
Scott, the mechanisms you’ve described indeed allow us to end up with more meaningful physical DNA than the amount of information in it. To give a concrete example, a protein-coding gene is copied, then mutates, and then there’s two highly structured proteins performing different chemical functions, which because of their similarity, evolved faster than counting the bases separately would seem to allow.
So the 1 bit/generation speed limit on evolution is not a speed limit on altered DNA bases—definitely not!
The problem is that these meaningful bases also have 10^-8 copying errors per base per generation, so the above does not bypass the total complexity bound on evolution. The total complexity bound on evolution is not some hyper-compressed Kolmogorov algorithmic complexity, it’s counting meaningful bases (bases such that if they are changed they degrade the fitness of the organism).
So: Mammalian evolutions can create 1 bit of algorithmic complexity per generation, possibly highly compressed; but the total number of meaningful DNA bases is limited to 100,000,000 bases or less, without compression in the reproductively transmitted genetic material, but potentially with all kinds of unpacking over the lifetime of a particular organism.
Quite a lot of mutations are so lethal that they abort embryonic development, yes. This is a severe problem with organisms drawn from a narrow gene pool, like humans and corn, and less so with others. It’s worth noting that, if we consider these mutations in the argument, we have to consider not only the children who are born and are weeded out, but all of the embryos conceived and lost as well.
Given how few conceptions actually make it to birth, and how many infants died young before the advent of modern medicine, humans didn’t lose two out of four, they lost more like two out of eight-to-twelve.
Eliezer, I’m a little skeptical of your statement that sexual reproduction/recombination won’t add information...
1. Single base pairs don’t even code for amino acids, much less proteins. 2. If we’re looking at how a mutation affects an organism’s ability to reproduce, we want to consider at least an entire protein, not just an amino acid. 3. There can be multiple genes that are neutral on their own, yet in combination are either very harmful or very beneficial.
Can you provide an argument as to why none of this affects the “speed limit” (not even by a constant factor?)
“To sum up: The mathematician’s bits here are very close to bits on a hard drive, because every DNA base that matters has to be supported by “one mutation, one death” to overcome per-base copying errors.”
There are only twenty amino acids plus a stop code for each codon, so the theoretical information bound is 4.4 bits/codon, not 6 bits, even for coding DNA. A common amino acid, such as leucine, only requires two base pairs to specify; the third base pair can freely mutate without any phenotypic effects at all.
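The arithmetic behind that bound is easy to verify:

```python
import math

# 64 codons map onto only 20 amino acids plus a stop signal, so one codon
# carries at most log2(21) ~ 4.39 bits, even though it occupies
# 3 bases * 2 bits = 6 bits of physical sequence.
codon_bits = math.log2(21)
bits_per_base = codon_bits / 3           # ~1.46 bits per coding base
redundancy_factor = 2.0 / bits_per_base  # ~1.37x more storable bases
```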
“Can you provide an argument as to why none of this affects the “speed limit” (not even by a constant factor?)”
For a full explanation, see an evolutionary biology textbook. But basically, the 1 bit/generation bound is information-theoretic; it applies, not just to any species, but to any self-reproducing organism, even one based on RNA or silicon. The specifics of how information is utilized, in our case DNA → mRNA → protein, don’t matter.
Even in the argument, it applies to organisms that lose half of their offspring to selection. It’s different for those that lose more, or less.
Among mammals, it’s safe to say that the selection pressure per generation is on the rough order of 1 bit. Yes, many mammals give birth to more than 4 children, but neither does selection perfectly eliminate all but the most fit organisms. The speed limit on evolution is an upper bound, not an average.
One bit per generation equates to a selection pressure which kills half of each generation before they reproduce according to the first part of your post. Then you say 1 bit per generation is the most mammalian reproduction can sustain. But, more than half of mammals (in many, perhaps most, species) die without reproducing. Wouldn’t this result in a higher rate of selection and, therefore, more functional DNA?
“But, more than half of mammals (in many, perhaps most, species) die without reproducing. Wouldn’t this result in a higher rate of selection and, therefore, more functional DNA?”
“Yes, many mammals give birth to more than 4 children, but neither does selection perfectly eliminate all but the most fit organisms. The speed limit on evolution is an upper bound, not an average.”
But mammals have many ways of weeding out harmful variations, from antler fights to spermatozoa competition. And that’s just if they have the four children. The provided 1 bit/generation figure isn’t an upper bound, either.
Life spends a lot of time in non-equilibrium states as well, and those are the states in which evolution can operate most quickly.
“But basically, the 1 bit/generation bound is information-theoretic; it applies, not just to any species, but to any self-reproducing organism, even one based on RNA or silicon. The specifics of how information is utilized, in our case DNA → mRNA → protein, don’t matter.”
OK, and I’m familiar with information theory (less so with evolutionary biology, but I understand the basics) but I’m thinking that the 1 bit/generation bound is—pardon the pun—a bit misleading, since:
A lot—I mean a lot—of crazy assumptions are made without any hard evidence to back them up. (E.g., the “mammals produce on average ~4 offspring, and when they produce more, it’s compensated for by selection’s inefficiencies.”)
I’m still not convinced that we’re measuring in the right units. Some mutations do absolutely nothing (for example, if a segment of DNA translating to a UAU codon mutated into one translating to UAC), and some make a ridiculously huge difference. This kind of redundancy, along with many other factors, makes me wonder if we need to change the 1 bit by some scaling factor...
David MacKay did a paper on this. Here’s a quote from the abstract:
G is the size of the genome in bits.
I’ve been enjoying your evolution posts and wanted to toss in my own thoughts and see what I can learn.
“Our first lemma is a rule sometimes paraphrased as “one mutation, one death”.”
Imagine that having a working copy of gene “E” is essential. Now suppose a mutation creates a broken gene “Ex”. Animals that are heterozygous with “E” and “Ex” are fine and pass on their genes. Only homozygous “Ex” “Ex” result in a “death” that removes 2 mutations.
Now imagine that a duplication event gives four copies of “E”. In this example an animal would only need one working gene out of the four possible copies. When the rare “Ex” “Ex” “Ex” “Ex” combination arises then the resulting “death” removes four mutations.
In fruit fly knock-out experiments, breaking one development gene often had no visible effect. Backup genes worked well enough. The backup gene could have multiple roles: first, it has a special function that improves the animal’s fitness; second, it works as a backup when the primary gene is disabled. The resulting system is robust, since the animal can thrive with many broken copies, and evolution is efficient, since a single “death” can remove four harmful mutations.
I’ve focused on protein-coding genes, but this concept also applies to short DNA segments that code for elements such as miRNAs. Imagine that the DNA segment is duplicated. Being short, it is rarely deactivated by a mutation. Over time a genome may acquire many working copies that code for that miRNA. Rarely an animal would inherit no working copies, and so a “death” would remove multiple chromosomes that “lacked” that DNA segment. On the other hand, too many copies might also be fatal. Chromosomes with too few or too many active copies would suffer a fitness penalty.
On a different note, imagine two stags. The first stag has lucked-out and inherited many alleles that improve its fitness. The second stag wasn’t so lucky and inherited many bad alleles. The first stag successfully mates and the second doesn’t. One “death” removed many inferior alleles.
Animals may have evolved sexual attraction based on traits that depend on the proper combined functioning of many genes. An unattractive mate might have many slightly harmful mutations. Thus one “death” based on sexual selection might remove many harmful mutations.
Evolution might be a little better than the “one mutation, one death” lemma implies. (I agree that evolution is an inefficient process.)
“This 1 bit per generation has to be divided up among all the genetic variants being selected on, for the whole population. It’s not 1 bit per organism per generation, it’s 1 bit per gene pool per generation.”
Suppose new allele “A” has fitness advantage 1.03 compared to the wild allele “a” and that another allele “B” on the same type chromosome has fitness advantage 1.02. Eventually the “A” and “B” alleles will be sufficiently common that a crossover creating a new chromosome “AB” with “A” and “B” alleles is likely (This crossover probability depends on the population sizes of “Ab” and “aB” chromosomes and the distance between the alleles). The new chromosome “AB” should have a fitness of 1.05 compared to the chromosome “ab”. Both “A” and “B” should then see an accelerated spread until the “ab” chromosomes are largely displaced. The rate would then diminish as “AB” displaced “Ab” and “aB” chromosomes. Thus multiple beneficial mutations of the same type chromosome should spread faster than the “single mutation” formula would indicate.
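This acceleration is easy to see with a deterministic one-locus recursion (a toy model of my own, not from the comment):

```python
def generations_to_spread(s, p0=0.01, target=0.99):
    """Generations for an allele with multiplicative fitness advantage s
    to rise from frequency p0 to target under deterministic selection."""
    p, gens = p0, 0
    while p < target:
        p = p * (1 + s) / (p * (1 + s) + (1 - p))  # standard replicator step
        gens += 1
    return gens

# A 3% allele and a 2% allele spread slowly on their own; once crossover
# assembles them on one chromosome, the combined ~5% advantage
# (1.03 * 1.02 - 1) spreads faster than either alone.
fast = generations_to_spread(1.03 * 1.02 - 1)
```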
Due to crossover, good “bits” would tend to accumulate on good chromosomes thereby increasing the fitness of the entire chromosome as described above. The highly fit good chromosome thus displaces chromosome with many bad “bits”. The good “bits” are no longer inherited independently and each “death” can now select multiple information “bits”.
We seem to view evolution from a similar perspective.
Information requires selection in order to be preserved. The DNA information in an animal genome could be ranked in “fitness” value, and the resulting graph would likely follow a power law. I.e., some DNA information is extremely important and likely to be preserved, while most of the DNA is relatively free to drift. In a species such as fruit flies with many offspring, selection can drive the species high up a local fitness peak. Much of the animal genome will be optimized. In a species such as humans with few offspring, there is much less selection pressure, and the species’ gene pool wanders further from local peaks. More of the human genome drifts. (E.g., human regulatory elements are less conserved than rodent regulatory elements.)
“But mammals have many ways of weeding out harmful variations, from antler fights to spermatozoa competition. And that’s just if they have the four children. The provided 1 bit/generation figure isn’t an upper bound, either.”
Read a biology textbook, darn it. The DNA contents of a sperm have negligible impact on the sperm’s ability to penetrate the egg. As for antler fights, it doesn’t matter how individuals are removed from the gene pool. They can only be removed at a certain rate, or else the species population goes to zero. Note that nonreproduction = death as far as evolution is concerned.
“Life spends a lot of time in non-equilibrium states as well, and those are the states in which evolution can operate most quickly.”
Yes, but they must be balanced by states where it operates more slowly. You can certainly have a situation where 1.5 bits are added in odd years and .5 bits in even years, but it’s a wash: you still get 1 bit/year long term.
“1. A lot—I mean a lot—of crazy assumptions are made without any hard evidence to back them up. (E.g., the “mammals produce on average ~4 offspring, and when they produce more, it’s compensated for by selection’s inefficiencies.”)”
The bit rate is O(log(offspring)), not O(offspring), so even if you produced 16 offspring, that’s only three bits/generation. How many offspring do you think we have? 8,589,934,592? (= 32 bits/generation)? Selection will have inefficiencies, so these are upper bounds.
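The arithmetic here, for anyone who wants to check it (assuming selection keeps roughly 2 survivors out of N offspring, so the bound is log2(N/2) bits):

```python
import math

# Upper bound on information extracted per generation if selection
# keeps ~2 survivors out of N offspring: log2(N/2) bits.
def max_bits_per_generation(offspring):
    return math.log2(offspring / 2)

print(max_bits_per_generation(4))               # 1.0
print(max_bits_per_generation(16))              # 3.0
print(max_bits_per_generation(8_589_934_592))   # 32.0
```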
“This kind of redundancy, along with many other factors, makes me wonder if we need to change the 1 bit by some scaling factor...”
The factor due to redundant coding sequences is 1.36 (1.4 bits/base instead of 2.0). This does increase the amount of storable information, because it makes the degenerative pressure (mutation) work less efficiently. Then again, it’s only a factor of 35%, so the conclusion is still basically the same.
OK, I posted the following update to my blog entry:
Rereading the last few paragraphs of Eliezer’s post, I see that he actually argues for his central claim—that the human genome can’t contain more than 25MB of “meaningful DNA”—on different (and much stronger) grounds than I thought! My apologies for not reading more carefully.
In particular, the argument has nothing to do with the number of generations since the dawn of time, and instead deals with the maximum number of DNA bases that can be simultaneously protected, in steady state, against copying errors. According to Eliezer, copying a DNA sequence involves a ~10^-8 probability of error per base pair, which — because only O(1) errors per generation can be corrected by natural selection — yields an upper bound of ~10^8 on the number of “meaningful” base pairs in a given genome.
However, while this argument is much better than my straw-man based on the number of generations, there’s still an interesting loophole. Even with a 10^-8 chance of copying errors, one could imagine a genome reliably encoding far more than 10^8 bits (in fact, arbitrarily many bits) by using an error-correcting code. I’m not talking about the “local” error-correction mechanisms that we know DNA has, but about something more global—by which, say, copying errors in any small set of genes could be completely compensated by other genes. The interesting question is whether natural selection could read the syndrome of such a code, and then correct it, using O(1) randomly-chosen insertions, deletions, transpositions, and reversals. I admit that this seems unlikely, and that even if it’s possible in principle, it’s probably irrelevant to real biology. For apparently there are examples where changing even a single base pair leads to horrible mutations. And on top of that, we can’t have the error-correcting code be too good, since otherwise we’ll suppress beneficial mutations!
Incidentally, Eliezer’s argument makes the falsifiable prediction that we shouldn’t find any organism, anywhere in nature, with more than 25MB of functional DNA. Does anyone know of a candidate counterexample? (I know there are organisms with far more than humans’ 3 billion base pairs, but I have no idea how many of the base pairs are functional.)
Lastly, in spite of everything above, I’d still like a solution to my “pseudorandom DNA sequence” problem. For if the answer were negative—if given any DNA sequence, one could efficiently reconstruct a nearly-optimal sequence of insertions, transpositions, etc. producing it—then even my original straw-man misconstrual of Eliezer’s argument could put up a decent fight! :-)
Defective sperm—which are more-than-normally likely to carry screwed-up DNA—are far less likely to reach the egg, and far less likely to penetrate it before a fully functional spermatozoan does. It’s a weeding-out process.
Of course it does! Just not to the maximum-bit-rate argument.
No, they mustn’t. They can theoretically be kept in a constant non-equilibrium.
Eliezer, sorry for spamming, but I think I finally understand what you were getting at.
Von Neumann showed in the 50′s that there’s no in-principle limit to how big a computer one can build: even if some fraction p of the bits get corrupted at every time step, as long as p is smaller than some threshold one can use a hierarchical error-correcting code to correct the errors faster than they happen. Today we know that the same is true even for quantum computers.
What you’re saying—correct me if I’m wrong—is that biological evolution never discovered this fact.
If true, this is a beautiful argument for one of two conclusions: either that (1) digital computers shouldn’t have as hard a time as one might think surpassing billions of years of evolution, or (2) 25MB is enough for pretty much anything!
Scott said: “25MB is enough for pretty much anything!”
Have people tried to measure the complexity of the ‘interpreter’ for the 25MB of ‘tape’ of DNA? Replication machinery is pretty complicated, possibly much more so than any genome.
Actually, Scott Aaronson, something you said in your second-to-last post made me think of another reason why the axiom “one mutation, one death” may not be true. It’s just an elaboration of the point I made earlier, but I thought I’d flesh it out a bit more.
The idea is that the more physically and mentally complex, and physically larger, a species gets, the more capable it is of coping with detrimental genes and still surviving to reproduce. When you’re physically bigger, and smarter, there are more ‘surplus’ resources to draw upon to help in survival. Example: there is a rare genetic disorder that causes some people to have no fingerprints. This means that their manual dexterity is greatly reduced because of the lack of friction in the fingers. And while detrimental, this is a historically prevalent condition that has not gone away just because it’s bad for an individual. You can learn to avoid situations where failure in manual dexterity could be fatal, etc.
I also believe it’s possible for long-standing sections of DNA to evolve and become more robust to mutation once they have “proven themselves”. Meaning, if a certain series of genes/DNA that serve a beneficial function are around long enough, they will become more refined, effective, and especially robust. However this is accomplished specifically (which of course I don’t know), I don’t see why it’s mechanically impossible. Thus, large sections of DNA could essentially be “subtracted” from the amount of DNA to be mutated per generation.
Any flaws in this logic?
“Defective sperm—which are more-than-normally likely to be carry screwed-up DNA—is far less likely to reach the egg,”
Then the DNA never gets collected by researchers and included in the 10^-8 mutations/generation/base pair figure. If the actual rate of mutations is higher, but the non-detected mutations are weeded out, you still get the exact same result as if the rate of mutations were lower with no weeding-out.
“Of course it does! Just not to the maximum-bit-rate argument.”
True.
“No, they mustn’t. They can theoretically be kept in a constant non-equilibrium.”
Yes, they can be- it doesn’t change the bit rate. Non-equilibria where the population is shrinking must be balanced by non-equilibria where the population is growing, or the population will go to zero or infinity.
A mammalian gene pool can acquire at most 1 bit of information per generation.
Eliezer,
That’s a very provocative, interestingly empirical, yet troublingly ambiguous statement. :)
I think it’s important to note that evolution is very effective (within certain constraints) in figuring out ways to optimize not only genes but also genomes—it seems probable that a large amount of said “bits” have been on the level of structural or mechanical optimizations.
These structural/mechanical optimizations might in turn involve mechanisms by which to use existing non-coding, “junk” DNA in various ways (which might, in some sense, effectively increase the “bit size” of a single adaption into the megabytes).
It may be telling that we haven’t seen, in three billion years and given all the other genetic complexity out there, any organisms evolve a mechanism to clean the junk out of its DNA.
At any rate, I think your argument is interesting, and the topic is simply fascinating, but I take your numbers with a grain of salt. No offense.
Wiseman, if it’s true that (1) copying DNA inherently incurs a 10^-8 probability of error per base pair, and that (2) evolution hasn’t invented any von-Neumann-type error-correction mechanism, then all the objections raised by you and others (and by me, earlier!) are irrelevant.
In particular, it doesn’t matter how capable a species is of coping with a few detrimental mutations. For if the mutation rate is higher than what natural selection can correct, the species will just keep on mutating, from one generation to the next, until the mutations finally do become detrimental.
Also, your idea that sections of DNA can become more robust once they’ve “proven themselves” violates the 10^-8 assumption—which I’d imagine (someone correct me if I’m wrong) comes from physics and chemistry, not because it’s favored by natural selection.
Aaronson: What you’re saying—correct me if I’m wrong—is that biological evolution never discovered this fact [error-correcting codes].
You’re not wrong. As you point out, it would halt all beneficial mutations as well. Plus there’d be some difficulty in crossover. Can evolution ever invent something like this? Maybe, or maybe it could just invent more efficient copying methods with 10^-10 error rate. And then a billion years later, much more complex organisms would abound. All of this is irrelevant, given that DNA is on the way out in much less than a million years, but it makes for fun speculation...
Aaronson: If true, this is a beautiful argument for one of two conclusions: either that (1) digital computers shouldn’t have as hard a time as one might think surpassing billions of years of evolution, or (2) 25MB is enough for pretty much anything!
In an amazing and completely unrelated coincidence, I work in the field of Artificial Intelligence. Did I mention DNA is on the way out in much less than a million years?
Cyan, MacKay’s paper talks about gaining bits as in bits on a hard drive, which goes as O(N^1/2) because of Price’s Equation and because variance in the sum of bits goes as the square root of the number of bits. I spent some time scribbling and didn’t prove that this never gains more than one bit of information relative to a fitness function, but I don’t see how eliminating half the organisms in a randomly mixed gene pool can make it gain more than one bit of information in the allele frequencies.
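The O(N^1/2) behavior of the “hard drive” bits is easy to see in a toy truncation-selection experiment; the population size and genome length below are arbitrary choices:

```python
import math
import random
import statistics

random.seed(0)
n, pop = 10_000, 2_000
# Each genome is n random bits; "fitness" is simply the number of 1 bits,
# which is binomial: mean n/2, standard deviation sqrt(n)/2.
scores = [bin(random.getrandbits(n)).count("1") for _ in range(pop)]
survivors = sorted(scores)[pop // 2:]   # eliminate the less-fit half

gain = statistics.mean(survivors) - statistics.mean(scores)
# Theory: keeping the top half of a roughly normal distribution shifts the
# mean by sqrt(2/pi) * sd, about 0.8 * sqrt(n)/2, i.e. O(sqrt(n)).
print(round(gain, 1), round(math.sqrt(2 / math.pi) * math.sqrt(n) / 2, 1))
```

One round of killing half the population moves the mean fitness by roughly 40 one-bits here, consistent with the O(N^1/2) scaling, even though the gain in allele-frequency information is on the order of one bit.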
I think that a subset of Fly’s objections may be valid, especially the ones about sexual selection concentrating harmful mutations in a small subset of the population. This could plausibly increase the number of bits by a significant factor. OTOH, 25M is an upper bound, so the actual number of bits could easily still be less.
Great point about evolution not discovering hierarchical error-correcting codes, Scott A. Chris Phoenix frequently makes similar points about molecular nanotechnology in response to its critics.
Regarding the earlier post’s point about evolution not being able to extrapolate the fitness benefits of a gene and save time in its penetration into a population, I will point out that in financial markets this ability is what creates bubbles. In evolutionary dynamics it can potentially produce runaway sexual selection, which is sort of like a bubble that never resolves itself in a price correction.
Scott A., I wasn’t suggesting DNA would magically not mutate after it had evolved towards sophistication, only that the system of genes/DNA that governs a function would become robust enough to be immune to the effects of the mutations.
Anyway, evolution does not have to “correct” these mutations; as long as the organism can survive with them, they have as much chance of mutating to a neutral, positive, or other equally detrimental state as they have of becoming worse. As a genome becomes larger and larger, it can cope with the same ratio of mutations it always has. The effects of the mutations don’t “add up” as is assumed by Eliezer; they affect the local region of DNA and its related function, and that’s it. If an organism happens to have a synergistically enhanced group of detrimental mutations, then yes, that one will die, but showing empirically that that would happen more often than not would, I think, be very difficult.
In any case, I still don’t see where the ~25 megabyte number comes from. Wouldn’t you need to know precisely how many mutations were detrimental to work that number out? And I’m assuming it’s reasonable to say we don’t have that information?
Wiseman, let M be the number of “functional” base pairs that get mutated per generation, and let C be the number of those base pairs that natural selection can affect per generation. Then if M>>C, the problem is that the base pairs will become mostly (though not entirely) random junk, regardless of what natural selection does. This is a point about random walks that has nothing to do with biology.
To illustrate, suppose we have an n-bit string. At every time step, we can change one of the bits to anything we want, but then two bits get chosen at random and inverted. Question: in “steady state,” how many of the bits can we ensure are 0?
I claim the answer is only 3n/4. For suppose pn of the bits are 1, and that we always pick a ‘1’ bit and change it to 0. Then the expected change in the number of ‘1’ bits after a single time step is
D(p) = p^2 (-3) + 2p(1-p) (-1) + (1-p)^2 (1) = 1 - 4p.
Setting D(p)=0 and solving yields p=1/4.
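A quick Monte Carlo check of this steady state, tracking only the count of ‘1’ bits (the string length and step count are arbitrary choices):

```python
import random

random.seed(1)
n, steps = 2_000, 500_000
ones = n // 2       # start from a uniformly random string: half the bits are 1
samples = []
for t in range(steps):
    if ones:        # our move: change one '1' bit back to 0
        ones -= 1
    for _ in range(2):   # then two uniformly random bits get inverted
        if random.random() < ones / n:
            ones -= 1    # the flip hit a '1' bit
        else:
            ones += 1    # the flip hit a '0' bit
    if t >= steps // 2:  # discard the first half as burn-in
        samples.append(ones)

p = sum(samples) / len(samples) / n
print(round(p, 3))       # settles near 1/4, so only ~3n/4 bits stay 0
```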
So we get an either/or behavior: either the mutation rate is small enough that we can keep the functional DNA pretty much exactly as it is, or else a huge fraction of the “functional” DNA becomes randomized. In other words, there’s no intermediate regime where the functional DNA keeps mutating around within some “interesting” region of configuration space, without spilling out into a huge region of configuration space.
Note that for the above argument, I’ve assumed that the “interesting” regions of DNA configuration space are necessarily small—and in particular, that they can’t correspond to Hamming balls of size c*n where c is a constant. This assumption is basically a restatement of our earlier observation that natural selection hasn’t discovered error-correcting codes. As such, it seems to me to be pretty secure biologically. But if this assumption fails then so does my argument.
“‘Life spends a lot of time in non-equilibrium states as well, and those are the states in which evolution can operate most quickly.’
Yes, but they must be balanced by states where it operates more slowly. You can certainly have a situation where 1.5 bits are added in odd years and .5 bits in even years, but it’s a wash: you still get 1 bit/year long term.”
This seems to contradict your earlier assertion that the 1 bit/generation rate is “an upper bound, not an average.” It seems to me to be more analogous to a roulette wheel or the Second Law of Thermodynamics (relax! I’m not about to make a creationist argument just ’cause I said that!), so a gene pool can certainly acquire more than 1.36 bits (or whatever the actual figure is) in some generations, but in the long run “the house always wins.”
“The factor due to redundant coding sequences is 1.36 (1.4 bits/base instead of 2.0). This does increase the amount of storable information, because it makes the degenerative pressure (mutation) work less efficiently. Then again, it’s only a factor of 35%, so the conclusion is still basically the same.”
Thank you. As long as everyone’s clear that the speed limit is O(1) bits/generation (over long stretches? on average?) and not necessarily precisely 1 bit no matter what, I’m happy.
Scott: “What you’re saying—correct me if I’m wrong—is that biological evolution never discovered [error-correcting codes]...[O]n top of that, we can’t have the error-correcting code be too good, since otherwise we’ll suppress beneficial mutations!”
Whoa—that’s really helpful. Scott, as usual, you’ve broken through all the troubling and confusing jargon (“equilibrium? durr...”) so that some poor schmuck like me can actually see the main point. Thanks. =)
But of course, evolution itself is a sort of crude error-correcting code—and one that discriminates between beneficial mutations and detrimental ones! So here’s my question: Can you actually do asymptotically better than natural selection by applying an error-correcting code that doesn’t hamper beneficial mutations? Or is natural selection (plus local error-correction of the form existing in DNA) optimal?
With sufficiently large selection populations, it’s not clear to me how anything could be better than natural selection, since natural selection is what the system is trying to beat. Any model of natural selection will necessarily contain inaccuracies.
So here’s my question: Can you actually do asymptotically better than natural selection by applying an error-correcting code that doesn’t hamper beneficial mutations?
In principle, yes. In a given generation, all we want is a mutation rate that’s nonzero, but below the rate that natural selection can correct. That way we can maintain a steady state indefinitely (if we’re indeed at a local optimum), but still give beneficial mutations a chance to take over.
Now with DNA, the mutation rate is fixed at ~10^-8. Since we need to be able to weed out bad mutations, this imposes an upper bound of ~10^8 on the number of functional base pairs. But there’s nothing special mathematically about the constant 10^-8 -- that (unless I’m mistaken) is just an unwelcome intruder from physics and chemistry. So by using an error-correcting code, could we make the “effective mutation rate” nonzero, but as far below 10^-8 as we wanted?
Indeed we could! Here’s my redesigned, biology-beating DNA that achieves this. Suppose we want to simulate a mutation rate ε<<10^-8, allowing us to maintain ~1/ε functional base pairs in a steady state. Then we simply stick those 1/ε base pairs (in unencoded form) into our DNA strand, and also stick in “parity-check pairs” from a good error-correcting code. These parity-check pairs let us correct as many mutations as we want, with only a tiny probability of failure.
Next we let the physics and chemistry of DNA do their work, and corrupt a 10^-8 fraction of the base pairs. And then, using exotic cellular machinery whose existence we get to assume, we read the error syndrome off the parity-check pairs, and use it to undo all but one mutation in the unencoded, functional pairs. But how do we decide which mutation gets left around for evolution’s sake? We just pick it at random! (If we need random bits, we can just extract them from the error syndrome—the cosmic rays or whatever it is that cause the physical mutations kindly provide us with a source of entropy.)
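For concreteness, here is the parity-check idea in miniature: a Hamming(7,4) code, whose 3-bit syndrome names the position of any single corrupted bit. A genome would need a much longer block code; this sketch only shows the syndrome-decoding principle described above.

```python
# Hamming(7,4): 4 data bits plus 3 parity checks. For a valid codeword the
# XOR of the positions (1..7) of all 1 bits is zero; after a single bit
# flip, that XOR (the syndrome) equals the position of the flipped bit.

def encode(d):  # d = [d1, d2, d3, d4]
    p1 = d[0] ^ d[1] ^ d[3]   # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]   # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]   # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]  # positions 1..7

def correct(c):
    s = 0
    for pos in range(1, 8):   # syndrome = XOR of positions holding a 1
        if c[pos - 1]:
            s ^= pos
    if s:                     # nonzero syndrome points at the error
        c[s - 1] ^= 1
    return c

word = encode([1, 0, 1, 1])
word[4] ^= 1                  # a "copying error" flips one bit
assert correct(word) == encode([1, 0, 1, 1])
print("single-bit error corrected")
```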
Fly: Imagine that having a working copy of gene “E” is essential. Now suppose a mutation creates a broken gene “Ex”. Animals that are heterozygous with “E” and “Ex” are fine and pass on their genes. Only homozygous “Ex” “Ex” result in a “death” that removes 2 mutations.
Now imagine that a duplication event gives four copies of “E”. In this example an animal would only need one working gene out of the four possible copies. When the rare “Ex” “Ex” “Ex” “Ex” combination arises then the resulting “death” removes four mutations.
Fly, you’ve just postulated four copies of the same gene, so that one death will remove four mutations. But these four copies will suffer mutations four times as often. Unless I’m missing something, this doesn’t increase the bound on how much non-redundant information can be supported by one death. :)
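A back-of-envelope equilibrium calculation makes the rebuttal above concrete (assuming the copies mutate independently and a “death” occurs only when all n copies are broken; the 10^-8 rate is from the thread, everything else is illustrative):

```python
# Mutation-selection balance for a gene kept in n redundant copies.
# Inflow of broken copies per genome per generation: n * mu.
# Each "death" (all n copies broken) removes n broken copies, so at
# equilibrium n * mu = (death fraction) * n, i.e. death fraction = q**n = mu.
mu = 1e-8   # per-copy, per-generation breakage rate

for n in (1, 2, 4):
    q = mu ** (1 / n)   # equilibrium frequency of a broken copy
    deaths = q ** n     # fraction of genomes culled per generation
    print(n, f"{q:.2e}", f"{deaths:.2e}")
```

Each death does remove n mutations, but the n copies also suffer mutations n times as often, so the death rate needed to hold equilibrium stays ~mu regardless of n: the redundancy buys no extra supportable non-redundant information per death.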
McCabe: The factor due to redundant coding sequences is 1.36 (1.4 bits/base instead of 2.0). This does increase the amount of storable information, because it makes the degenerative pressure (mutation) work less efficiently. Then again, it’s only a factor of 35%, so the conclusion is still basically the same.
This increases the potential number of semi-meaningful bases (bases such that some mutations have no effect but other mutations have detrimental effect) but cancels out the ability to store any increased information in such bases.
Eliezer, could you provide a link to this result? Something looks wrong about it.
Fisher’s fundamental theorem of natural selection says the rate of natural selection is directly proportional to the variance in additive fitness in the population. At first sight that looks incompatible with your result.
You mention a site with selection at 0.01%. This would take a very long time for selection to act, and it would require that there not be stronger selection on any nearby linked site. It seems implausible that this site would have been selected before, with the result that it should be a 50:50 chance whether each change is a small favorable or unfavorable one. Tiny selective effects are neutral for all practical purposes. But tiny unfavorable changes have only a tiny chance to spread in the same way that you have little chance to win big at the casino when you make a very long series of small bets with the odds a little bit against you. Tiny favorable changes have only a very small chance to spread because they’re usually lost by accident before they get a large enough stake.
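The gambler’s-ruin point about tiny favorable changes can be checked with a toy Wright-Fisher simulation. The population size, selection coefficient, and trial count below are arbitrary choices; diffusion theory predicts a fixation probability of roughly (1 - e^(-2s))/(1 - e^(-2Ns)), about 2s for small s:

```python
import random

random.seed(2)
N, s, trials = 100, 0.05, 1_000

def fixes(N, s):
    """Follow one new favorable mutant under Wright-Fisher sampling
    until it is either lost or fixed; return True on fixation."""
    k = 1                   # one initial copy of the favorable allele
    while 0 < k < N:
        # selection-weighted frequency of the favorable allele
        p = k * (1 + s) / (k * (1 + s) + (N - k))
        k = sum(1 for _ in range(N) if random.random() < p)
    return k == N

rate = sum(fixes(N, s) for _ in range(trials)) / trials
print(rate)   # near 2s = 0.10: most favorable mutations are lost by accident
```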
Your numbers are clearly correct when all mutations are dominant lethal ones. When the mutation rate is high enough that half the offspring get a dominant lethal mutation, and there are only twice as many offspring as parents, then the population can barely survive those mutations and any higher mutation rate would drive it extinct.
I’m not sure that reasoning applies to mutations that affect relative reproduction rather than absolute, though. When a mutation lets its bearer survive better than other individuals when competing with them, but they survive just fine when it isn’t around, that could be a different story.
Clearly there are limits to the rate of natural selection. It’s proportional to the variance in fitness, so anything that limits the variation in fitness limits the rate of evolution. Mutation and recombination create variation in fitness, and there’s some limit on the mutation rate because of mutations that reduce the absolute level of functioning of the organism. But the reasoning expressed in the post doesn’t look convincing to me.
“Now with DNA, the mutation rate is fixed at ~10^-8.”
Well no, it isn’t. Not to get too complicated, usually the mutation rate is lower than that, but occasionally things happen that bring the mutation rate rather higher. We have things like DNA repair mechanisms that are mutagenic and others that are less so, and when the former get turned on we get a burst of mutations.
“Since we need to be able to weed out bad mutations, this imposes an upper bound of ~10^8 on the number of functional base pairs.”
Definitely no more than 10^8 sites that would mutate into dominant lethals. For lesser deleterious mutations it gets murkier.
But there’s nothing special mathematically about the constant 10^-8 -- that (unless I’m mistaken) is just an unwelcome intruder from physics and chemistry. So by using an error-correcting code, could we make the “effective mutation rate” nonzero, but as far below 10^-8 as we wanted?
Yes, and it happens some.
Indeed we could! Here’s my redesigned, biology-beating DNA that achieves this. Suppose we want to simulate a mutation rate ε<<10^-8, allowing us to maintain ~1/ε functional base pairs in a steady state. Then we simply stick those 1/ε base pairs (in unencoded form) into our DNA strand, and also stick in “parity-check pairs” from a good error-correcting code. These parity-check pairs let us correct as many mutations as we want, with only a tiny probability of failure.
It’s been years since I’ve looked at this. I may have some of it wrong and it might have changed while I wasn’t looking. But one way we used to handle that was to keep track of which strand of DNA is the old known strand and which is the new one. Then if there’s a mismatch, you repair the new one instead of the old one.
If you have two copies of the DNA sequence and one of them is being replicated while the other waits, and there’s an error, you can copy DNA from the reserve copy and splice it into one or both of the new ones.
Since each DNA repair system might possibly do misrepair under some circumstance, and since they are potentially disruptive, it makes some sense that they would only be activated when needed.
If a species can deal with detrimental mutations for several generations, then that simply means that the species has more time to weed out the really bad mutations, making the “one mutation, one death” equation inadequate to describe the die-off rate based purely on the mutation rate. Yes, new mutations pop up all the time, but unless those mutations directly add on to the detrimental effects of previous mutations, the species will still survive another generation.
To add on to my other argument that we “know too little” to make hard mathematical calculations on how big a functional genome can be, we also shouldn’t work under the assumption that mutation rates are static. Wikipedia’s “Mutation rate” article states the rate varies from species to species, and there is even some disagreement as to what the human rate is. There is NO REASON why a species can’t evolve redundant, error-correcting copy mechanisms so the mutation rate is right at the sweet spot, providing variation but not so much as to cause extinction.
AGAIN, I still advocate that the original point Eliezer made can’t be proven until we know exactly how many mutations are detrimental. A neutral mutation simply doesn’t count, no matter how many generations you look forward, and beneficial mutations can counter detrimental ones.
A comment from Shtetl-Optimized discussion:
It’s actually a common misconception that biological systems should have mechanisms that allow a certain number of mutations for the purpose of accruing beneficial adaptations. From an evolutionary perspective, all genes should favor the highest possible fidelity copies. Any genes that have any lower copying fidelity will necessarily have fewer copies in future generations and thus lose the evolutionary game to a gene with higher copying fidelity.
Remember, folks, evolution doesn’t work for the good of the species, and there’s no Evolution Fairy who tries to ensure its own continued freedom of action. It’s just a statistical property of some genes winning out over others.
From an evolutionary perspective, all genes should favor the highest possible fidelity copies.
Hmm… Suppose there are two separated populations, identical except that one has a gene that makes the mutation rate negligibly low. Naturally the mutating population will acquire greater variation over time. If the environment shifts, the homogeneous population may be wiped out but part of the diverse population may survive. So in this case, lower-fidelity copying is more fit in the long run. This is highly contrived, of course.
Disagree. Any genome that has lower copy fidelity will only be removed from the gene pool if the errors in copying actually make the resultant organism unable to survive and reproduce; otherwise it’s irrelevant how similar the copied genes are to the original. If the copy error rate produces detrimental genes at a rate that will not cause the species to go extinct, it will allow any beneficial mutations to arise and spread themselves throughout the gene pool at ‘leisure’. As long as those positive genes are attached to a genome structure which produces mutations at a specific rate, that mutation-rate genome will continue to exist because it’s ‘carried’ by an otherwise healthy genome.
Sexual reproduction supports this concept very well. Fathers share only a portion of their actual genome with their offspring (effectively a very low copy fidelity from parent to offspring), and yet this is the most powerful type of reproduction because it allows for rapid adaptation to changing environments. However it arose, it’s here to stay.
Remember, folks, evolution doesn’t work for the good of the species, and there’s no Evolution Fairy who tries to ensure its own continued freedom of action. It’s just a statistical property of some genes winning out over others.
Right, but if the mutation rate for a given biochemistry is itself relatively immutable, then this might be a case where group selection actually works. In other words, one can imagine RNA, DNA, and other replicators fighting it out in the primordial soup, with the winning replicator being the one with the best mutation properties.
Eliezer: “Fly, you’ve just postulated four copies of the same gene, so that one death will remove four mutations. But these four copies will suffer mutations four times as often. Unless I’m missing something, this doesn’t increase the bound on how much non-redundant information can be supported by one death. :)”
Yeah, you are right. You only gain if the redundancy means that the fitness hit is sufficiently minor that more than four errors could be removed with a single death.
The “one death, one mutation” rule applies if the mutation immediately affects the first generation. However, having backup copies means that mutations are seldom all that damaging. Humans have two copies of the genome (except for us poor males who suffer from X-linked genetic diseases). A loss-of-function mutation in a gene may have minor fitness impact. If a mutation causes failure to implant or an early miscarriage, then it should have little effect on the number of offspring a woman produces. If the mutation has minor fitness impact, then the more efficient error correction that occurs through crossover, chromosome competition, and mate competition could come into play.
Redundancy might increase the amount non-redundant information supported by one death, but not in the manner I presented in that example.
PS
In some cases assortative mating could also act to segregate beneficial and harmful alleles and accelerate filtering.
I like that evolution inherently prioritizes error removal. The worst mutations are removed quickly at a high “death” cost. Less harmful mutations are removed more slowly and at a lower “death” cost (since multiple “errors” are removed with each death).
“On average all but 2 children must either die or fail to reproduce. Otherwise the species population very quickly goes to zero or infinity.”
An infinite population is of course non-existent, and not just as a mathematical impossibility. What you forget to take into account is that a growing population changes the conditions the population lives under, and so changes the selection pressure.
Furthermore, you consider the evolution of just a single species. But all species are considered to be descendants of the same LUCA (Last Universal Common Ancestor), and there is no mathematical reason to consider each species separately. Or is there? When would you split populations into species, each with its own independent evolutionary progress? Is evolution faster when there are more species? But doesn’t the same reasoning apply to the number of species: “At equilibrium, each new species means that another species dies out”?
The problem is: there is no equilibrium. Equilibrium is a simplified hypothetical state of evolution to make it easy to apply mathematics. As a first step, it is of course a good thing, because it is easy. But drawing conclusions from this simplified situation is a bit too fast. The next step should be to try and find some mathematics that applies to non-equilibrium states. Maybe then you can draw some conclusions about the real world.
“There’s a limit on how much complexity an evolution can support against the degenerative pressure of copying errors.”
It depends on which level you look at. You look at the complexity of each species separately. But why not take each chromosome separately, or each gene, or why not look at ecosystems? The thing is that evolution is not just a thing of species; evolution takes place at all those levels, and what happens at each level influences what happens on other levels. Again: don’t draw conclusions from a simplified model.
Taka, if you don’t draw conclusions from simplified models, then you can’t make any decisions ever.
So let me be more concrete, because every model is a simplification. What I mean to say is that the model used here is far too simple to draw conclusions from.
The central statement of this entry is “There’s a limit on how much complexity an evolution can support against the degenerative pressure of copying errors”.
In order to check the model, the statement should be quantified, so it can be matched with measurements. Maybe something like “the genome of a species can have maximally 50k genes”. That requires that the model should be enhanced.
If it was realized on purely mathematical grounds that selection pressure cannot support 3 billion bases of useful information—before the discovery of junk DNA and the 25k genes in the human genome—then surely there has been some development in the mathematical modeling of evolution since then.
“This increases the potential number of semi-meaningful bases (bases such that some mutations have no effect but other mutations have detrimental effect) but cancels out the ability to store any increased information in such bases.”
If 27% of all mutations have absolutely no effect, the “one mutation = one death” rule is broken, and so more information can be stored because the effective mutation rate is lower (this also means, of course, that the rate of beneficial mutations is lower). So it may be a 40 MB bound instead of a 25 MB bound, but it doesn’t change the basic conclusion.
“If the environment shifts, the homogeneous population may be wiped out but part of the diverse population may survive.”
If you start postulating group selection arguments, you won’t be able to understand evolution clearly. And the professional evolutionary biologists will think of you as a crackpot. And your dog will get sick and die.
“But all species are considered to be descendants of the same LUCA (Last Universal Common Ancestor), and there is no mathematical reason to consider each species separately.”
If the species have stopped interbreeding, deleterious mutations can accumulate in each species independently. Evolution is a mathematical process which does not care what happened ten million years ago.
“What you forget to take into account is that a growing population changes the conditions of the population, and changes selection pressure.”
Yes, that’s precisely the point. If you have a long period of weak selection pressure, the population will increase and selection pressure will increase. If you have a long period of strong selection pressure, the population will decrease (unless the species is driven to extinction). Hence, you can reliably predict an average selection pressure, because the two must balance each other out.
“The next step should be to try and find some mathematics that applies to non-equilibrium states. Maybe then you can draw some conclusions about the real world.”
This has probably already been done.
“The thing is that evolution is not just a thing of species, evolution takes places at all those levels”
I repeat: if you use group selection arguments, your dog will get sick and die.
Most of our DNA is shared with all eukaryotes, so it evolved before mammals existed.
MacKay’s paper talks about gaining bits as in bits on a hard drive
I don’t think MacKay’s paper even has a coherent concept of information at all. As far as I can tell, in MacKay’s model, if I give you a completely randomized 100 Mb hard drive, then I’ve just given you 50 Mb of useful information, because half of the bits are correct (we just don’t know which ones.) This is not a useful model.
Rolf,
If you look at equation 3 of MacKay’s paper, you’ll see that he defines information in terms of frequency of an allele in a population, so you’d have to provide a whole population of randomized hard drives, and if you did so, the population would have zero information.
First, there is the correct point that our mutation rate has been in steady decline—the first couple of billion years had a much higher rate of data encoding than the last couple of billion years.
Second, there is the point that a significant portion of pregnancies are failures—we could possibly double the rate of data encoding from that alone, presuming all of it goes toward improving genetic repair and similar functionality (reducing mutation rates of critical genes).
Third, multiple populations could encode multiple bits of data, if they are kept distinct except for a very small level of cross-breeding to keep both populations compliant. (That is, a low level of geographic isolation could, in sexually reproducing creatures, increase the number of gene pools to play with, although at a nonlinear rate—it wouldn’t be a huge increase over a bit per half of population lost.)
Fourth, and finally, not only did you forget the first two billion years of evolution, you forgot DNA transfer in its varying forms—which occurs occasionally in bacteria, whereby one can acquire the information encoded in another.
If you look at equation 3 of MacKay’s paper, you’ll see that he defines information in terms of frequency of an allele in a population
I apologize, my statement was ambiguous. The topic of Eliezer’s post is how much information is in an individual organism’s genome, since that’s what limits the complexity of a single organism, which is what I’m talking about.
Equation 3 addresses the holistic information of the species, which I find irrelevant to the topic at hand. Maybe Alice, Bob, and Charlie’s DNA could together have up to 75 MB of data in some holographic sense. Maybe a dog, cat, mouse, and anteater form a complex 100 MB system, but I don’t care.
Would you agree that the information-theoretic increase in the amount of adaptive data in a single organism is still limited by O(1) bits in Mackay’s model? If not, please let me know, because in that case I’m clearly missing something and would like to learn from my mistake.
FYI, God did not design humans, we are all naturally evolved. Evolution can and has indeed designed lots of fully rotating wheels and any other fancy contraptions, it just wound up using the clever approach of first evolving some grad students.
Anything that a human can do, natural selection can do, by definition. We’re nothing more special than cogs in the evolutionary machine, albeit special cases of cogs that work a lot better than previous generations.
Or did you think that human thought is some kind of deus ex machina?
OK, Let me make my point clearer, why we can’t calculate the actual complexity limit of working DNA:
1.) Not all mutations are bad. Accepted knowledge: most are simply neutral, a few are bad, and even a fewer are good.
2.) If the mutations are good or neutral, they should effectively be subtracted from the mutation rate, as they do not contribute to the “one mutation, one death” axiom, because good/neutral mutations do not increase death probability.
3.) The mutations will not accumulate either, over many generations, if they are good/neutral. If a mutation really is good or neutral, that’s EXACTLY what it is. It’s as if it never happened; it effectively doesn’t count in the “one mutation, one death” calculations.
4.) We do not know exactly how many mutations are good/bad/neutral. THUS we simply cannot come up with a specific upper boundary to the amount of working DNA in a genome.
Did Eliezer take this into account in the calculations in this article? Or am I missing something here?
Anything that a human can do, natural selection can do, by definition.
Ah, yes, the old “Einstein’s mother must have been one heck of a physicist” argument, or “Shakespeare only wrote what his parents and teachers taught him to write: Words.”
Even in the sense of Kolmogorov complexity / algorithmic information, humans can have complexity exceeding the complexity of natural selection because we are only a single one out of millions of species to have ever evolved.
And the things humans “do” are completely out of character for the things that natural selection actually does, as opposed to doing “by definition”.
If you can’t distinguish between human intelligence and natural selection, why bother distinguishing between any two phenomena at all?
Rolf,
I can’t really process this query until you relate the words you’ve used to the math MacKay uses, i.e., give me some equations. Also, Eliezer is pretty clearly talking about information in populations, not just single genomes. For example, he wrote, “This 1 bit per generation has to be divided up among all the genetic variants being selected on, for the whole population. It’s not 1 bit per organism per generation, it’s 1 bit per gene pool per generation.”
Eliezer,
I’ve thought hard about your reply, but it’s not clear to me what the distinction is between bits on a hard drive (or in a genome) and information-theoretic bits. One bit on a hard drive answers one yes-or-no question, just like an information-theoretic bit.
The third section of the paper is entitled “The maximum tolerable rate of mutation”. (MacKay left the note “This section needs checking over...” immediately under the title, so there’s room for doubt about his argument.) MacKay derives the rate of change of fitness in his models as a function of mutation rate. He concludes (as you did) that the maximum genome size scales as the inverse of the mutation rate, but only when mutation is the sole source of variation. He makes the claim that maximum genome size scales as the inverse of the square of the mutation rate when crossover is used.
It seems to me that this is a perfect example of your idea that one doesn’t really understand something until the equations are written down. MacKay has tried to do just that. Either his math is wrong, or the idea that truncation can only give on the order of one bit of selection pressure is just the wrong abstraction for the job.
(Just as a follow up, MacKay demonstrates that the key difference between mutation and crossover is that the full progeny (i.e., progeny before truncation) generated by mutation have a smaller average fitness than their parents, while the full progeny generated by crossover have average fitness equal to their parents’.)
Even if most mutations are neutral, that just says that most of the genome doesn’t contain any information. If you flip a base and it doesn’t make any difference, then you’ve just proved that it was junk DNA, right?
Hi Erik,
It’s not junk DNA, it merely has usefulness in many different configurations. Perhaps if the mutation would be to skip a base pair entirely, rather than just mis-copy it, it would be more likely to be detrimental.
“If you flip a base and it doesn’t make any difference, then you’ve just proved that it was junk-DNA, right?”
Not quite. Certain bases in the protein-coding sections of genes (i.e., definitely not junk DNA!) can be flipped without changing the resulting proteins. This can happen because there are 64 different codons, but only 20 different amino acids are used to build proteins, so the DNA code is not one-to-one.
It might be safer to say that if you delete the base and it makes no difference, then it was junk, but even this will run into problems...
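The 64-codons-to-20-amino-acids point can be made concrete with the standard codon table. A minimal sketch (the table string below is the standard NCBI translation table 1, with codons ordered so the first position varies slowest, bases in T, C, A, G order); the "roughly a quarter" figure it computes is my own uniform count over all single-base substitutions, not a claim about real mutational spectra:

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1); codons ordered with
# bases T, C, A, G and the first codon position varying slowest (TTT, TTC, ...).
BASES = "TCAG"
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}

degeneracy = Counter(CODON_TABLE.values())  # codons per amino acid ('*' = stop)
print(degeneracy["L"])                      # leucine is encoded by 6 codons

# Fraction of single-base substitutions that leave the amino acid unchanged.
synonymous = total = 0
for codon, aa in CODON_TABLE.items():
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                total += 1
                synonymous += CODON_TABLE[codon[:i] + b + codon[i + 1:]] == aa
print(synonymous / total)  # roughly a quarter of point mutations are silent
```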
Here are some mutation strategies that life uses that may be of value in programming towards AI-- (evolving software is part of the program- true?)
1)adaptive mutation (aka directed mutation)--
It has been observed that bacteria will mutate more quickly when under stress.
http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=11433357&dopt=AbstractPlus
http://newsinfo.iu.edu/news/page/normal/1160.html
http://www.iscid.org/boards/ubb-get_topic-f-1-t-000196.html
2)purposeful mutation (How long do you suppose that description will last?)
This type of mutation involves the human immune system (at least that’s what is being studied so far)
http://www.aecom.yu.edu/cellbiology/PDF/scharff/scharff2007_4.pdf
“Junk DNA” is an unfortunate misnomer. The actual term “heterochromatin” isn’t exactly a winner either. But it ain’t junk, and it is an area of intense current interest.
Don’t forget the role of epigenetics in the evolution of life forms either—this may turn out to be the biggest factor yet. (Wouldn’t that be cool if science had that much advance coming in the near future?)
I think MacKay’s “If variation is created by recombination, the population can gain O(G^0.5) bits per generation.” is correct. Here’s my way of thinking about it. Suppose we take two random bit strings of length G, each with G/2-G^0.5 zeros and G/2+G^0.5 ones, randomly mix them twice, then throw away the result that has fewer ones. What is the expected number of ones in the surviving mixed string? It’s G/2+G^0.5+O(G^0.5).
Or here’s another way to think about it. Parent A has 100 good (i.e., above average fitness) genes and 100 bad genes. Same with parent B. They reproduce sexually and have 4 children, two with 110 good genes and 90 bad genes, the other two (who do not survive to further reproduce) with 90 good genes and 110 bad genes. Now in one generation they’ve managed to eliminate 10 bad genes instead of just 1.
This seems to imply that the human genome may have much more than 25 MB of information.
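The mixing argument above is easy to check numerically. Below is a minimal Monte Carlo sketch (my own construction, not from MacKay's paper, and simplified to uniformly random parents): fitness is the number of ones, two uniform recombinants are drawn from the same two parents, and the better one is kept. If the gain over the parental mean scales as O(G^0.5), quadrupling G should roughly double it:

```python
import random

def recombination_gain(G, trials=3000, seed=0):
    """Mean gain in ones of the better of two uniform recombinants
    over the parental mean (fitness = popcount of a G-bit genome)."""
    rng = random.Random(seed)
    ones = lambda x: bin(x).count("1")
    total = 0.0
    for _ in range(trials):
        a, b = rng.getrandbits(G), rng.getrandbits(G)
        kids = []
        for _ in range(2):
            mask = rng.getrandbits(G)  # which parent donates each locus
            kids.append(ones((a & mask) | (b & ~mask)))
        total += max(kids) - (ones(a) + ones(b)) / 2
    return total / trials

small, big = recombination_gain(1600), recombination_gain(6400)
print(small, big)  # quadrupling G roughly doubles the gain: O(G**0.5) scaling
```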
Wiseman: “It’s not junk DNA, it merely has usefulness in many different configurations.”
Perhaps it was unfortunate to use the term junk DNA. What I was thinking of was more along the lines of information content. If a given base is useful in several configurations, it contains that much less information.
If base X has to be e.g. guanine to have its effect, that is one out of four possible states, i.e. 2 bits of information. If it could be either guanine or thymine, then it only contains one bit.
It may be that the actual human genome uses more than 5*10^7 bases to encode 10^8 bits of information. The total information would still have to fit in 25 MB after you got rid of the redundancy, because of the limitations of evolution’s information-upholding abilities. (Assuming the rest of Eliezer’s calculations.)
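The per-base accounting in this comment, as a one-line formula (a sketch of the arithmetic, nothing more):

```python
import math

def bits_per_base(allowed_states, total_states=4):
    """Information carried by a base that must be one of `allowed_states`
    of the `total_states` possible nucleotides to have its effect."""
    return math.log2(total_states / allowed_states)

print(bits_per_base(1))  # 2.0 bits: must be exactly one nucleotide (e.g. guanine)
print(bits_per_base(2))  # 1.0 bit: guanine or thymine both work
print(bits_per_base(4))  # 0.0 bits: any base works, so no information
```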
How can we define the information content of the mutation responsible for Huntington’s Disease? It occurs in a non-coding section, it’s technically a collection of similar mutations, and it seems to have something to do with the physical structure of the chromosome rather than coding in the simple sense.
A point to note is that the corrupting pressure on the genome from adverse point mutations in protein-coding DNA regions is partly counterbalanced by selection that happens already before birth, in the form of miscarriages (late and early) and cell death or cell inefficiency during earlier stages of germline development, even before fertilization.
Even if the value of 1 bit per generation holds true for the addition of new ‘relevant’ information, the above acts as an additional positive factor that only works to negate the degrading effects of random mutations. Obviously this doesn’t matter when talking about post-birth (meaning, say, post-one-week-after-conception, assuming a conservative 50% miscarriage rate after that point) relevant DNA information, which thus might still be capped around 25MB. However, it’s far from obvious how and where exactly this information is embedded in the DNA.
It would seem that more relevant than protein data itself is the data that affects how and in which situations proteins express. This hints that post-birth relevant information is stored elsewhere, in regulating sections and perhaps in the ‘junk’ DNA regions. And what comes to understanding these regions, our knowledge is flimsy. The information might actually be encoded in these regions in a manner that allows for error-correction schemes not quite unlike the Von Neumann point made earlier in the comments, about computer memory error correction.
I think it’s not fitting to state that post-birth relevant information is “the meaningful DNA specifying a human”, without stretching the meaning. After all, what good is a program without understanding the interpreter and having a platform to run it on?
To sum up: Adverse point mutation pressure occurring in protein-coding regions is at least partially offset in early stages of the germ line, where the quantities selected upon are huge. Point mutations occurring in other regions have implications and mechanisms which are not nearly well enough understood. Thus I don’t see solid ground for the quantitative conclusions made in this article, and only some ground for the qualitative conclusions.
Wei Dai, being able to send 10 bits each with a 60% probability of being correct, is not the same as being able to transmit 6 bits of mathematical information. It would be if you knew which 6 bits would be correct, but you don’t. I’m not sure how to bridge the disconnect between variance going as the square root of a randomized genome, and the obvious argument that eliminating half the population is only going to get you 1 bit of mathematical information. It would probably be obvious if I’d spent more time on maxentropy methods.
Petteri, I hadn’t known that the DNA in a gamete was executed to construct the gamete itself—I thought the gamete was constructed by the parent, and the genetic material packaged inside. At least in mammals, fertilized embryos that begin execution of their packaged DNA are rare enough to be considered conserved. Not so?
Petteri, I think I can explain. First throw away the lemma that says one death can’t remove more than one gene. That’s a red herring.
Imagine a population of asexual bacteria. Imagine that they have a collection of sites that will kill them immediately if any one of those sites mutates. If a cell averages 1 such mutation per generation, it will on average produce one live and one dead daughter cell per generation. It will not survive. This is an absolute limit on the mutation rate, for that kind of mutation.
Now suppose that it has a lot of sites that result in loss of function when mutated, but that don’t simply kill the cell. Imagine that it averages 1 such mutation per generation. So assuming a poisson distribution, after the first generation all but about 40% will have lost a capability. More than 20% will have lost two or more capabilities. The population as a whole will degrade and no amount of selection can prevent it. Of course each mutation can be reversed about as easily as it happened. If the mutations are all independent then at equilibrium about half of the nonessential functions would work even without selection. But they aren’t independent. On average there are usually around a hundred different ways to make an enzyme stop working with one mutation, and in each case only one way to reverse it. So without selection each function would be destroyed 50 times over. And again the mutations are happening faster than they could be selected against. If the population is size N and you get less than N/2 perfect individuals in the next generation you can’t keep up.
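The "all but about 40%" and "more than 20%" figures follow from the Poisson distribution with mean 1 that this paragraph assumes; a quick check:

```python
import math

def poisson_pmf(k, lam=1.0):
    """Probability of exactly k events under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

p0 = poisson_pmf(0)              # escape with no loss-of-function mutation
p_two_or_more = 1 - p0 - poisson_pmf(1)

print(round(p0, 3))              # 0.368 -- "all but about 40%" survive unscathed
print(round(p_two_or_more, 3))   # 0.264 -- "more than 20%" lose two or more
```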
Mutations with very small effect are insignificant. Things that are selected at 0.01% have never been made common due to that selection and it doesn’t matter that they won’t be maintained by selection. Mutations with small disadvantage are not so important and unless there are a whole lot of them they can be mostly ignored in a population that’s still evolving. Most of them will disappear pretty fast, like gamblers who have a very low reserve in a casino where the odds are a little bit against them. The majority of them disappear the first generation, many of the rest disappear the second generation, the ones that hang on awhile are the ones that accidentally got a fair number of instances early, by random chance. Like flipping 20 heads in a row with a coin that’s slightly biased toward tails. It happens, but rarely. And those will usually be wiped out when a favorable mutation gets established. About the time there are hundreds of mutations that provide a 1% disadvantage but haven’t been removed, a single mutation with a 1% advantage shows up and pushes them all out over the next thousand generations or so. Mutations that persist with a 1% disadvantage are rare; mutations that persist against a 2% relative advantage are rarer. The presence of advantageous mutations can overcome large numbers of disadvantageous mutations provided the disadvantageous ones don’t come fast enough to seriously slow their spread.
When a favorable mutation spreads, it reduces the variation in fitness in the population. It spreads mostly by replacing the ones with lower fitness. About the time it approaches fixation, most of the others are gone. If there are ten different favorable mutations spreading at the same time probably no one of them will be fixed; the others will get their share. That doesn’t increase the speed of evolution much at all. The big thing is that selection has decreased the variability and so decreased the speed for further selection. The descendants of the mutant continue to mutate and build up variability to replace what was lost, but that’s a slow process.
SEXUALITY CHANGES THIS ALL AROUND.
Given sexuality and chromosome assortment but no recombination, a species with 100 chromosomes can evolve much faster than an asexual bacterial population! Each individual chromosome can be selected, mostly independently of the others. MacKay basically says this should go by the square root of the number of linkage groups. Here’s my explanation—if the different mutations present affect fitness in orthogonal ways, then their total effect on fitness is likely to be something like sqrt(a^2 + b^2 + c^2, etc.). It comes from them being orthogonal.
Recombination breaks up the linkages within chromosomes and lets things go even faster.
So ignoring other effects, the smaller the chromosomes and the smaller the linkage groups, the higher the limit on evolution speed. But the limits on lethal (or crippling) mutations are still there. If you have too high a chance to get a dominant lethal mutation, it doesn’t matter how many chromosomes you’ve split the genome over, it’s still lethal.
So ideally it might be proper to be extra careful not to mutate DNA sequences that have been maintained unchanged for a long time, and less so with things that are less important. I don’t know how much that happens.
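The sqrt(a^2 + b^2 + c^2) point above is just the statistics of independent effects adding in variance: with L independent linkage groups of equal effect, the fitness spread that selection can act on grows as sqrt(L). A toy illustration of that scaling (my own sketch, with each group contributing +1 or -1):

```python
import random
import statistics

def fitness_spread(linkage_groups, trials=5000, seed=1):
    """Std dev of total fitness when each linkage group independently
    contributes +1 or -1 to fitness (equal-sized orthogonal effects)."""
    rng = random.Random(seed)
    totals = [sum(rng.choice((-1, 1)) for _ in range(linkage_groups))
              for _ in range(trials)]
    return statistics.pstdev(totals)

s100, s400 = fitness_spread(100), fitness_spread(400)
print(s100, s400)  # four times the linkage groups, about twice the spread
```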
I’m not sure I correctly understood your comment, so I hope I’m not responding to something you didn’t actually say.
I agree there’s a distinction between human intelligence (and any other human trait), and natural selection, but I believe the distinction is that the former is a subset of the latter, rather than a separate and later entity.
In other words, I’m not claiming that nothing can ever create something better than itself, but that in the case of humanity’s work it is still a part of natural selection (and hence natural selection can do anything humans do by way of the human activity that is part of natural selection).
Am I violating some existing convention on the usage of the phrase “natural selection”? It seems to me that it’s purely the same natural selection that evolves some wheels out of a bunch of organic and other materials, which also evolves some wheels out of human neurons and a bunch of organic and inorganic materials.
Is there a non-arbitrary point at which one stops counting the behavior of evolved systems as natural selection? (And calling human artifice the stopping point doesn’t automatically make artificial evolution a non-subset of natural evolution.)
Eliezer, you are mistaken that eliminating half of the population gives only 1 bit of mathematical information. If you have a population of N individuals, there are C(N,N/2) = N!/((N/2)!*(N/2)!) different ways to eliminate half of them. See http://en.wikipedia.org/wiki/Combination. Therefore it takes log_2(C(N,N/2)) (which is O(N)) bits to specify how to eliminate half of the population.
So, it appears that with sexual recombination, the maximum number of bits a species can gain per generation is min(O(G^0.5), O(N)).
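Wei Dai's counting argument can be verified directly: the number of bits needed to specify which half of N individuals is eliminated is log2 C(N, N/2), which grows linearly in N:

```python
import math

def bits_to_pick_half(N):
    """log2 of C(N, N/2): bits needed to specify which half of N
    individuals gets eliminated."""
    return math.log2(math.comb(N, N // 2))

for N in (10, 100, 1000):
    print(N, round(bits_to_pick_half(N), 1))
# grows as O(N) -- about N - 0.5*log2(N) bits, not O(1)
```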
Oh, sending 10 bits each with a 60% probability of being correct actually lets you send a total of just 0.29 bits of information. Each bit in this communications channel gives the receiver only 0.029 bits of information, using a formula from http://en.wikipedia.org/wiki/Binary_symmetric_channel. But, the amount of information actually received still grows linearly with the number of bits put into the channel.
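The 0.029 figure is the capacity of a binary symmetric channel with crossover probability 0.4 (each bit correct with probability 0.6), computed from the binary entropy function:

```python
import math

def binary_entropy(p):
    """Entropy in bits of a biased coin with heads-probability p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(error_prob):
    """Capacity in bits per use of a binary symmetric channel."""
    return 1 - binary_entropy(error_prob)

c = bsc_capacity(0.4)   # each transmitted bit is correct with probability 0.6
print(round(c, 3))       # 0.029 bits per transmitted bit
print(round(10 * c, 2))  # 0.29 bits total from ten such bits
```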
“Wei Dai, being able to send 10 bits each with a 60% probability of being correct, is not the same as being able to transmit 6 bits of mathematical information. It would be if you knew which 6 bits would be correct, but you don’t.”
“Given sexuality and chromosome assortment but no recombination, a species with 100 chromosomes can evolve much faster than an asexual bacterial population!”
No, it can’t. Suppose that you want to maintain the genome against a mutation pressure of one hundred bits per generation (one base flip/chromosome, to make it simple). Each member of the population, on average, will still have fifty good chromosomes. But you have to select on the individual level, and you can’t let only those organisms with no bad chromosomes reproduce: the chances of such an organism existing are astronomical. You would have to stop, say, anyone with more than forty bad chromosomes from reproducing, so maybe the next generation you’d have an average of 37 or so. But then you add fifty more the next generation… the species will quickly die out, because you can’t remove them as fast as they’re being introduced.
“Each individual chromosome can be selected, mostly independently of the others.”
All the chromosomes are packaged together in a single individual. Reproduction occurs at the individual level; you can’t reproduce some chromosomes and not others.
Tom McCabe, having 100 chromosomes with no recombination lets you maintain the genome against a mutation pressure of 10 bits per generation, not 100. (See my earlier comment.) But that’s still much better than 1 bit per generation, which is what you get with no sex.
Wei Dai, that’s the amount of Bayesian information a human observer could extract from a message deliberately encoded into eliminating a very precisely chosen half of the population. It’s not the amount of information that’s going to end up in the global allele frequencies of a sexually reshuffled gene pool.
I found the MacKay logic disturbingly persuasive. So I wrote a Python program to test the hypothesis; it’s now attached to the main article. Increasing the genome size did not increase the supported bits as the square root of the genome size, though I do not totally understand the behavior of this program. The number of supportable bits does seem to go as the log of the number of children per parent. However, one bit of selection working against a 0.1 probability of mutation seems to support 20 bits of information instead of 10 bits, and I’m not really sure why.
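For readers who want to experiment, here is a minimal toy model in the same spirit as the attached simulation (this is my own sketch, not the attached script; genome length 100, two children per parent, and the other parameters are arbitrary choices). It reproduces at least the qualitative claim that a higher copying-error rate supports fewer maintained bits:

```python
import random

def equilibrium_fitness(G=100, N=100, mut=0.01, children=2,
                        generations=150, seed=0):
    """Toy mutation-selection balance: genomes are length-G bit lists,
    fitness = number of ones, each parent yields `children` mutated
    copies (each bit flips with probability `mut`), and truncation
    selection keeps the best N. Returns mean fitness averaged over
    the last 30 generations."""
    rng = random.Random(seed)
    pop = [[1] * G for _ in range(N)]  # start at the fitness optimum
    tail = []
    for gen in range(generations):
        offspring = [[bit ^ (rng.random() < mut) for bit in genome]
                     for genome in pop for _ in range(children)]
        offspring.sort(key=sum, reverse=True)
        pop = offspring[:N]
        if gen >= generations - 30:
            tail.append(sum(map(sum, pop)) / N)
    return sum(tail) / len(tail)

low = equilibrium_fitness(mut=0.01)
high = equilibrium_fitness(mut=0.04)
print(low, high)  # the higher copying-error rate maintains fewer bits
```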
The discussion I’m reading is interesting from a computational perspective. From the biology perspective there is a basic problem in the premise, and this could be due to unexamined bias.
The “reductionist” model for biology is no longer considered workable. Here is a website that includes a discussion of this, with good links to other articles.
http://www.psrast.org/strohmnewgen.htm
The “reductionist” model is no longer considered valuable in medicine either. Check this link:
http://medicine.plosjournals.org/perlserv/?request=get-document&doi=10.1371%2Fjournal.pmed.0030208
In other words, making any conclusion about life or evolution or the nature of man based on how many bits of information are held in a strand of DNA does not square with current thought.
For example, the speed limit on mutation does not fit well with the observed “adaptive mutations”, wherein a life form will increase its number of mutations in response to a stress so as to survive. (The increased mutations do not lead to death, but survival.)
Eliezer, I have to admit I’m not studied enough in the field, and I’ve not read papers on this in particular. The initial cell bodies of a gamete of course come from the parent cell.
However, after that, they have to keep on living. And they do this using their own genetic material to support and repair themselves, manufacturing new cell bodies, enzymes and other constituent proteins as old bodies deteriorate. All ova in the ovaries are present at the birth of the female, and thus have to be able to maintain their function at least up to menopause. This makes them some of the longest-living cells besides neurons, while being haploid at that. Of course, as far as protein synthesis goes, I don’t think this feat requires the use of any great percentage of all genes, but it’s something anyway.
On the other hand, in the germ line there is a development phase from a fertilized ovum to the new gonads, where the germ-line DNA is carried across multiple divisions in a single line. This accumulates more of said mutations, while selecting only against the most destructive mutations, which express themselves in a relatively short period of time (the germ-line cells are stem cells up to when gonad cells start to actually differentiate, and the differentiation obviously begins from more or less healthy stem cells at that point). Still, on the whole, this increases the corruptive pressure.
However, what I find more interesting, is how the point mutations affect regulatory areas and relevant ‘junk’ DNA. But we just don’t know enough of the mechanics there.
This is my relevant contribution.
Other commenters have made interesting points on how small-adverse-effect mutations don’t really spread out quickly in the population, and how, when selection actually happens post-birth, it often tends to be the result of the combined work of many such adverse mutations, not just one. In this case, one death removes, on average, more than just one adverse mutation from the pool. I haven’t delved deeper into the subject, so I can’t say if this contradicts the initial assumption of “one mutation, one death” or not, although to me it seems it does. Why wouldn’t it? My apologies if this is an ignorant question; I would do the maths now myself, but it’s late and I need the sleep.
Found factual errors in my previous post after doing a last round of research. Should’ve done it before submitting…
The ova are not in a haploid state for the lifetime of the female, but in a diploid state, arrested at prophase of meiosis I. I couldn’t find out how much cellular activity they have at this stage, but anyway, there is thus still one DNA replication only a short period before possible fertilization. This renders much of my above argumentation mostly null. Indeed, it could be that the DNA of ova and spermatozoa isn’t expressed at all in the haploid cell. This would leave the pre-birth selection pressure only to spontaneous zygote abortions (which have other explanations than mere point mutations as well).
Meh, not thinking straight anymore. There are no DNA replications at meiosis, although recombination does happen there. But still, if there’s no cellular activity during the arrested prophase, the DNA isn’t tested anyway. Now I go to sleep, hoping I won’t make any more mistakes.
Cyan,
> I can’t really process this query until you relate the words you’ve used to the math MacKay uses
On page 1, MacKay posits x as the bit-sequence of an individual. Pick an individual at random. The question at hand is whether the Shannon entropy of x, for that individual, decreases at a rate of O(1) per generation.
This would be one way to quantify the information-theoretic adaptive complexity of an individual’s DNA.
In contrast, if for some odd reason you wanted to measure the total information-theoretic adaptive complexity of the entire species, then as the population N → ∞, the total amount of information maxes out in one generation (since anyone with a calculator and a lot of spare time who had access to the entire population could, more or less, deduce the entire environment after one generation).
Eliezer, the simulation is a great idea. I’ve used it to test the following hypothesis: given a sufficiently large population and genome size, the number of useful bits that a sexual species can maintain against a mutation probability (per base) of m is O(1/m^2). The competing hypothesis is the one given in your opening post, namely that it’s O(1/m).
To do this, I set Number=1000, Genome=1000, Beneficial=0, and let Mutation range from 0.03 to 0.14 in steps of 0.01. Then I plotted Fitness (which in the program equals the number of useful bits in the genome) against both 1/Mutation and 1/Mutation^2. I think the results [1] are pretty clear: when 1/Mutation^2 is small compared to Number and Genome, Fitness is linear in 1/Mutation^2, not 1/Mutation.
Oh, I rewrote the simulation code in C++ to make it run faster. It’s available at http://www.weidai.com/fitness/adhoc.cpp.
[1] http://www.weidai.com/fitness/fitness.htm, first plot is Fitness vs 1/Mutation, second is Fitness vs 1/Mutation^2.
No, No, No, No!!
First of all, I strongly object to the use of the word ‘information’ in this article. There is no limit on the amount of information in the genome if by information we mean anything like Kolmogorov complexity. Any extra random string of bases that gets tacked on to the DNA sequence is information by any reasonable definition. If you mean something like Shannon entropy, then things are just totally unclear, because you need to tell us what distribution you are measuring the information of.
As far as the actual conclusions go, your post is too unclear and nonrigorous to really make sense of, so I’m going by the paper by MacKay that a commenter linked to; if your arguments substantially differ, please clarify.
The first problem in MacKay’s analysis is his model of fitness as distance to one particular master genome. This is a good model if the question is how much information god could convey, by appropriate selection of physical laws, to someone who got to see only the bitstring of human DNA and no other facts about the universe, but it doesn’t seem particularly related to what we usually mean when we informally talk about an organism’s complexity. What we want to know is not how quickly convergence could be driven to a particular pre-selected bitstring, but how quickly complex functionality could be evolved.
The second problem in MacKay’s analysis is that he assumes sexual mating occurs at random. It is easy to give a counterexample to his bound in a society where people choose mates wholly on the basis of their fitness (i.e. in his model distance from the ideal bitstring).
I’m not going to go any further because I have other things to do now, but while it’s an interesting little mathematical toy, many of the other simplifying assumptions seem questionable as well. In particular, what I would primarily derive from this work is the importance of meta-systems in evolution.
Yup, if we all mated at random it would be tough to evolve. Hence we choose mates based on our estimation of their evolutionary fitness. Similarly with easy to modify control sequences affecting gene traits.
To make the simulation really compelling it has to include some sort of assortative mating.
“Given sexuality and chromosome assortment but no recombination, a species with 100 chromosomes can evolve much faster than an asexual bacterial population!”
No, it can’t. Suppose that you want to maintain the genome against a mutation pressure of one hundred bits per generation (one base flip/chromosome, to make it simple). Each member of the population, on average, will still have fifty good chromosomes. But you have to select on the individual level, and you can’t let only those organisms with no bad chromosomes reproduce: the chances of such an organism existing are astronomical.
A sexual population can’t handle more lethal mutations than an asexual one. Since the mutation is lethal, before a cell carrying it can have sex and reproduce, it dies.
And just as an asexual population can’t handle a mutation rate that swamps it (say, 1 very deleterious mutation per cell per generation) a sexual population with 100 chromosomes can’t handle 100 per cell per generation.
But still the sexual population can do better. Here’s why:
Since back-mutation is rare, an asexual population gets onto a slippery slope when it loses all its nonmutants. If the time comes when there are no individuals without an unfavorable mutation, then the exact same mechanism which got them there will get them to a time when there are no individuals without two unfavorable mutations and so on. (See “Muller’s Ratchet”.)
But a sexual population can maintain an equilibrium just fine where the average individual has 8 or 10 unfavorable mutations, provided those individuals are still viable.
Consider the simple example—one individual has one mutation, a second individual has another unlinked one. They mate and produce 4 offspring, and the average result would be 1 wild-type, 2 with one mutation and 1 with 2 mutations. More room for selection than just 2 cells with a mutation each.
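That four-offspring example can be checked by exact enumeration; here is a minimal Python sketch (the two hypothetical loci "A" and "B" and free assortment between them are assumptions of the toy example, not anyone's actual code):

```python
from itertools import product
from collections import Counter

# Parent 1 carries a mutation at locus A only; parent 2 at locus B only.
# With free assortment, each offspring inherits each locus independently,
# 50/50 from either parent.
parent1 = {"A": 1, "B": 0}  # 1 = mutant allele, 0 = wild-type
parent2 = {"A": 0, "B": 1}

counts = Counter()
for choice_a, choice_b in product([parent1, parent2], repeat=2):
    # choice_a donates locus A, choice_b donates locus B
    counts[choice_a["A"] + choice_b["B"]] += 1

# Of the 4 equally likely combinations: one wild-type, two single-mutants,
# one double-mutant -- more spread for selection to act on.
print(dict(sorted(counts.items())))  # {0: 1, 1: 2, 2: 1}
```

The wild-type and double-mutant offspring are exactly what clonal reproduction cannot produce from these two parents, which is the commenter's point.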
Sexual populations depend on the average cells, and the distribution of mutations is dependably binomial. Asexual populations depend on the wild-type to outcompete everything else, and when the superior wild-type gets to be too small a fraction they become undependable.
logicnazi,
Wei Dai’s post doesn’t make sense except in the context of MacKay’s paper. If you’ve read the paper thoroughly, it should be pretty clear what he’s talking about.
The fact that the MacKay’s fitness function is the distance to a “master” genome has nothing to do with how much information god could convey to someone. It’s just a way to model constant environmental conditions, like the sort of thing that has kept modern sharks around since they evolved 100 million years ago.
His model about as simplified as one could get and still have descent with modification and selection, but it’s entirely adequate for the limited purpose for which I brought it up in this conversation. That is, it contains the minimal set of features that a system needs to be covered by Eliezer’s dictum of 1 bit gained per generation; therefore we can use it to test the assertion just by running the model. This is just what Wei Dai has done—nothing more and nothing less. In particular, the model has no notion of complexity, but it doesn’t need one for us to test Eliezer’s assertion.
michael vassar,
“To make the simulation really compelling it has to include some sort of assortative mating.”
Meh. Assortative mating can decrease or increase the variance of the progeny, depending on whether the sorting is by similarity or dissimilarity, respectively. I’m happy with random mating as a first step.
“The second problem in MacKay’s analysis is that he assumes sexual mating occurs at random. It is easy to give a counterexample to his bound in a society where people choose mates wholly on the basis of their fitness (i.e. in his model distance from the ideal bitstring).”
Logicnazi, Eliezer’s model may do that somewhat and would be easy to adapt to do what you want.
I don’t know Python, so I may be wrong when I suppose that the children in his model come out sorted. The next generation, they receive a random number of mutations, typically 10, and then the ones that are side by side are mated and assorted. The number of new mutations they get is random, but the number of old mutations they already had may be ordered—maybe the ones that were most fit at that point are the ones that mate together, and the ones that were least fit, and so on. To do complete sexual selection based on perfect knowledge, you could simply sort the parents by their mutations, and then it ought to work your way. One simple extra step.
The point of MacKay’s toy model is that it displays results people were already arguing about. He designs it to show the results he wants. So he does truncation selection—probabilistic selection would be slower. He has his genes assort with no linkage. In all the most basic ways he makes the choice that should result in the most effective selection. He does not do sexual selection based on fitness, probably because readers would call foul. It isn’t clear that humans can judge fitness with 100% reliability, much less rotifers or yeast cells. Sometimes mating might be closer to random.
Wei, did you run at only 300 generations and if so, did you check to see if the fitness had reached equilibrium? I noticed it was declining but rather slowly even after 2000 generations with a genome that size.
(Will try running my own sims tomorrow, got to complete today’s post today.)
I only ran 300 generations, but I just redid them with 5000 generations (which took a few hours), and the results aren’t much different. See plots at http://www.weidai.com/fitness/plot3.png and http://www.weidai.com/fitness/plot4.png.
I also reran the simulations with Number=100. Fitness is lower at all values of Mutation (by about 1⁄3), but it’s still linear in 1/Mutation^2, not 1/Mutation. The relationship between Fitness and Number is not clear to me at this point. As Eliezer said, the combinatorial argument I gave isn’t really relevant.
Also, with Number=1000, Genome=1000, Mutate=0.005, Fitness stabilizes at around 947. So, extrapolating from this, when 1/Mutate^2 is much larger than Genome, which is the case for human beings, almost the entire genome can be maintained against mutations. It doesn’t look like this line of inquiry gives us a reason to believe that most of human DNA is really junk.
Eliezer, I’ve noticed an apparent paradox in information theory that may or may not be related to your “disconnect between variance going as the square root of a randomized genome, and the obvious argument that eliminating half the population is only going to get you 1 bit of mathematical information.” It may be of interest in either case, so I’ll state it here.
Suppose Alice is taking a test consisting of M true/false questions. She has no clue how to answer them, so it seems that the best she can do is guess randomly and get an expected score of M/2. Fortunately her friend Bob has broken into the teacher’s office and stolen the answer key, but unfortunately he can’t send her more than 1 bit of information before the test ends. What can they do, assuming they planned ahead of time?
The naive answer would be to have Bob send Alice the answer to one of the questions, which raises the expected score to 1+(M-1)/2.
A better solution is to have Bob tell Alice whether “true” answers outnumber “false” answers. If the answers are uniformly distributed, the variance of the number of “true” answers is M/4, which means Alice can get an expected score of M/2+sqrt(M)/2 if she answers all “true” or all “false” according to what Bob tells her. So here’s the paradox: how did Alice get sqrt(M)/2 more answers correct when Bob only sent her 1 bit of information?
(What if the teacher knew this might happen and made the number of “true” answers exactly equal to the number of “false” answers? Alice and Bob should have established a common random bit string R of length M ahead of time. Then Bob can send Alice a bit indicating whether she should answer according to R or the complement of R, with the same expected outcome.)
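A quick Monte Carlo makes the Θ(√M) gain from Bob's single bit concrete (a sketch; M = 400, the trial count, and the function name are arbitrary choices for illustration):

```python
import random

def majority_strategy_score(M, trials=20000, seed=0):
    """Average score when Alice answers all-true or all-false,
    whichever side Bob's one-bit message says is the majority."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        num_true = sum(rng.random() < 0.5 for _ in range(M))
        total += max(num_true, M - num_true)  # Bob's bit picks the bigger side
    return total / trials

M = 400
avg = majority_strategy_score(M)
# Expected score is M/2 plus roughly sqrt(M/(2*pi)) (about 8 here): the
# one-bit message buys Theta(sqrt(M)) extra correct answers, not just 1.
print(avg)
```

The resolution of the paradox is that the bit is chosen *after* seeing all M answers, so it correlates with every question at once; it carries 1 bit of entropy but shifts the expected score by Θ(√M).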
Wei, the result of my own program makes no sense to me. It wasn’t predicted by any of our prior arguments. MacKay says that the supportable information should go as the square root of the genome size, not that supportable information should go as the inverse square of the mutation rate. We’re not getting a result that fits even with what MacKay said, let alone with what Williams said; and, I should point out, we’re also not getting a result that fits with there being <25,000 protein-coding regions in the human genome.
Maybe you can’t sort and truncate the population and have to use probabilistic reproduction proportional to fitness? If so, that would indicate the intuitive argument from “4 children, 2 survivors” is wrong. But in my own experiments fitness did seem roughly proportional to log children, and not proportional to (square root) genome size, which was the only part I had thought to predict.
The big puzzle here is the inverse square of the mutation rate. The example of improvement in a starting population with a randomized genome of maximum variance, which can’t be used to send a strongly informative message, doesn’t explain the maintenance of nearly all information in a genome.
Are there any professional evolutionary theorists in the audience? Help!
Eliezer, MacKay actually does predict what we have observed in the simulations. Specifically, equation 28 predicts it if you let δf=f instead of δf=f-1/2. You need to make that change to the equation because in your simulation with Beneficial=0, mutations only flip 1 bits to 0, whereas in MacKay’s model mutations also flip 0 bits to 1 with equal probability.
Wei, MacKay says in a footnote that the size of the maximally supported genome goes as 1/m^2, but as I understand his logic, this is because of larger genomes creating a factor of sqrt(G) improvement in how much information can be supported against a given mutation rate. Haven’t had time yet to examine equation 28 in detail.
Stable population of asexual haploid bacteria considering only lethal mutations:
Let “G” be the genome string length in base pairs.
Let “M” be the mutations per base pair per division.
Let “numberOfDivisions” be the average number of divisions a bacterium undergoes before dying.
Let “survivalFraction” be the probability that division produces another viable bacterium.
survivalFraction = (1 - M)**G. (Assuming mutation events are independent.)
1 = numberOfDivisions x survivalFraction. (Assuming population size is stable.)
Then ln(1/numberOfDivisions) = G ln(1 - M).
G = -ln(numberOfDivisions) / ln(1 - M).
Using Taylor series for small M gives
ln(1 - M) = -M + higher order terms of M.
So G = ln(numberOfDivisions) / M.
Which does not match the simulation observation that G = O(1 / M**2).
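As a sanity check on the algebra above, the exact expression and its Taylor approximation agree closely for small M (a sketch; numberOfDivisions = 1000 and M = 1e-6 are arbitrary illustrative values):

```python
import math

def exact_genome_length(number_of_divisions, M):
    # Solve 1 = numberOfDivisions * (1 - M)**G for G:
    return -math.log(number_of_divisions) / math.log(1.0 - M)

def approx_genome_length(number_of_divisions, M):
    # Using the Taylor expansion ln(1 - M) ~ -M for small M:
    return math.log(number_of_divisions) / M

n, M = 1000.0, 1e-6
exact = exact_genome_length(n, M)
approx = approx_genome_length(n, M)
print(exact, approx)  # agree to better than 1e-5 relative error
```

So under this lethal-mutations-only model the stable genome length really is O(1/M), which is exactly why the O(1/M**2) simulation result is surprising.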
Summarizing my thoughts:
1) For lethal mutations the rule, “one mutation, one death”, holds.
In life few mutations will be lethal. Even fewer in a sexual species with genetic redundancy. So the information content limits calculated by assuming only lethal mutations will not apply to the human genome.
2) Selection may not directly affect population size.
E.g., in sexual selection winners and losers are balanced so the total number of offspring is relatively constant. So minor harmful mutations may be removed with high efficiency without affecting total population size.
3) High selection pressure may drive the species’ gene pool high up a local fitness peak. However, being “too optimized” might hurt species survival by lowering variance and making the species more vulnerable to environmental variation, e.g., new pathogens. Or it may decrease the probability of a two-mutation adaptation that might have improved competitiveness against a different species. (Humans may eventually out-compete fruit flies.)
4) Working with selection, crossover and assortative mating remove the most harmful mutations quickly at a high “death” cost (worst case is one death per mutation removal) and remove less harmful mutations slowly at a low “death” cost. The “mutation harmfulness” vs. “mutation frequency” graph likely follows a power law. It should be possible to derive a “mutation removal efficiency” relationship for each “mutation fitness cost”. Such functions are likely different for each species and population structure.
5) Selection operates on traits. Traits usually depend on complex network interactions of genetic elements. Most genetic elements simultaneously affect many traits. Therefore most trait values will follow an inverted bathtub curve, i.e., low and high values are bad and the mid-range is good. (Body homeostasis requires stable temperature, pH, oxygen level, nutrient level, etc.) Evolution has favored robust systems with regulatory feedback to adjust for optimal trait values in the face of genetic, stochastic, and environmental variation.
(The “bath tub curve” is essentially a one-state system. Multi-state regulatory systems are also common in biology and can be used to differentiate cells.)
6) Total genome information content is limited by the mutation rate and the number of bit errors that are removed by selection. (In the Shannon sense of a message being a string of symbols from a finite set and transmission between generations being a noisy communication channel.) I believe this numerical limit is highly dependent on species reproductive biology and population dynamics.
Increases in genome information content are not directly related to “evolutionary progress”. In evolution the genome “meaning” is more important than the genome “message”. Over evolutionary time the average “meaning” value of each bit may be increasing. Evolution of complex genetic regulatory systems increased the average “meaning” value per bit. Evolution of complex brains capable of “culture” increased the average “meaning” value per bit. (The information bits that give humans the ability to read are more valuable than the information bits in a book.)
7) The total information in a species’ genome can be far greater than the information contained in any individual genome. This is true for sexual bacteria colonies that exchange plasmids. It is also true for animal species, e.g., variation in immune-system DNA that protects the species from pathogens. Variation is the fuel that selection burns for adaptation.
Eliezer, I just noticed that you’ve updated the main post again. The paper by Worden that you link to makes the mistake of assuming no crossing or even chromosomal assortment, as you can see from the following quotes. It’s not surprising that sex doesn’t help under those assumptions.
(begin quote)
Next consider what happens to one of the haploid genotypes j in one generation. Through random mating, it gets paired with another haploid genotype k, with probability q; then the pair have a probability of surviving sigmajk.
…
(b) Crossing: Similarly, in a realistic model of crossing, we can show that it always decreases the diploid genotype information Jµ. This is not quite the same as proving that crossing always decreases Iµ, but is a powerful plausibility argument that it does so. In that case, crossing will not violate the limit.
(end quote)
As for not observing species gaining thousands of bits per generation, that might be due to the rarity of beneficial mutations. A dog not apparently having greater morphological or biochemical complexity than a dinosaur can also be explained in many other ways.
If you have the time, I think it would be useful to make another post on this topic, since most people who read the original article will probably not see the detailed discussions in the comments or even notice the Addendum. You really should cite MacKay. His paper does provide a theoretical explanation for what happens in the simulations, if you look at the equations and how they are derived.
Wei, I need to find enough time to go over the math with a fine-toothed comb, both in Worden’s paper and MacKay’s. Worden says his result holds for sexual reproduction, and I’m not sure the simulation disproves that; in my own experiments, sustainable information did go as log(children). Rather than Worden being wrong, it may be that one bit of mathematical information per generation suffices to cancel out an amount of mutation at equilibrium which goes as the square root of the number of mutations—for reasons similar to the ones you gave above. One death, many mutations, as some previous commenter remarked. In other words, the only math mistake would have been neither in Worden’s paper nor MacKay’s, but in the single calculation I tried to do myself—which seems plausible, as they underwent peer review and I didn’t. (Actually, I guess I just did, sort of.)
When I’m sure I’ve got the math right this time, and worked out which commenters get the credit for correcting me, I can do another post. Meanwhile I inserted a quickie warning so that no additional readers would be misled.
Eliezer: “If the entire human genome of 3 billion DNA bases could be meaningful, it’s not clear why it would contain <25,000 genes”
I wouldn’t say we know enough about biological mechanics to say we necessarily need more protein-coding DNA than protein-regulating DNA. If you think about it, collagen the protein is used in everything from skin, tendons, ligaments, muscles, fascia, etc. But you can’t code for all of those uses of collagen just by HAVING the collagen code in the DNA; you need regulating code to instruct when/where/how to use it.
Also, as I explained earlier, it seems doubtful that you could ever calculate the maximum sustainable DNA that actually codes, unless you know how many mutations are detrimental. You might be able to come up with a mathematical relationship, but not an absolute amount that would rule all that human DNA is junk.
MacKay comments on Kimura’s and Worden’s work and its relation to his own on page 12 of the paper. In particular, he notes that in Worden’s model, fitness isn’t defined as a relative quality involving competition with other individuals in the population; rather, one’s genotype determines the probability of having children absolutely. MacKay says that this is how Worden proves a speed limit of one bit per generation even with sexual reproduction, but he doesn’t do any math on the point.
“The big puzzle here is the inverse square of the mutation rate. The example of improvement in a starting population with a randomized genome of maximum variance, which can’t be used to send a strongly informative message, doesn’t explain the maintenance of nearly all information in a genome.”
(hacks program for asexual reproduction)
I’ve found that, assuming asexual reproduction, the genome’s useful information really does scale nice and linearly with the inverse of the mutation rate. The amount of maintainable information decreases significantly (by a factor of three or so, in the original test data).
If you take a population of organisms, and you divide it arbitrarily into 2 groups, and you show the 2 groups to God and ask, “Which one of these groups is, on average, more fit?”, and God tells you, then you have been given 1 bit of information.
But if you take a population of organisms, and ask God to divide it into 2 groups, one consisting of organisms of above-average fitness, and one consisting of organisms of below-average fitness, that gives you a lot more than 1 bit. It takes n lg(n) bits to sort the population; then you subtract out the information needed to sort each half, so you gain n lg(n) − 2(n/2)lg(n/2) = n[lg(n) − lg(n/2)] = n lg(2) = n bits.
If you do tournament selection, you have n/2 tournaments, each of which gives you 1 bit, so you get n/2 bits per generation.
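The n-bit count survives if you use exact log-factorials instead of the n lg(n) approximation, since the difference is just lg of a central binomial coefficient (a sketch; n = 1024 is an arbitrary power of two):

```python
import math

def lg_factorial(n):
    # log2(n!) computed via lgamma; accurate enough for this check
    return math.lgamma(n + 1) / math.log(2)

n = 1024
# Bits to sort the whole population, minus bits to sort each half separately.
# This equals lg C(n, n/2), which is n minus a small O(lg n) correction.
bits_gained = lg_factorial(n) - 2 * lg_factorial(n // 2)
print(bits_gained)
```

For n = 1024 the result is a little under 1024 (about n − 0.5·lg(πn/2)), so "roughly n bits per generation" holds up.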
I was curious about the remark that simulation results differed from theoretical ones, so I tried some test runs. I think the difference is due to sexual reproduction.
Eliezer’s code uses random mating. I modified it to use asexual reproduction or assortative mating to see what difference that made.
Asexual reproduction:
mutation rate 0.1 gave 6 bits preserved
0.05 preserved 12-13 bits
0.025 preserved 27
increasing population size from 100 to 1000 bumped this to 28
decreasing the beneficial mutation rate brought it down to 27 again
so the actual preserved information is fairly consistently 0.6 times the theoretical value, with some sort of caveat about larger populations catching beneficial mutations.
Random mating:
mutation rate 0.1 gave 20 bits preserved (already twice the theoretical value)
Assortative mating:
mutation rate 0.1 gave 25-26 bits preserved
0.05 preserved 66 bits
increasing population size from 100 to 1000 bumped this to 73
So sexual reproduction helps in a big way, especially if mating is assortative and/or the population is large. Why? At least part of the explanation, as I understand it, is that it lets several bad mutations be shuffled into one victim. I don’t know the mathematics here, but there’s a book with the memorable title ‘Mendel’s Demon’ that I read some years ago, which proposed this (in addition to the usual explanation of fast adaptation to parasites) as an explanation for the existence of sex in the first place. These results would seem to support the theory.
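A toy version of that asexual-vs-sexual comparison can be sketched directly (this is not Eliezer's original script; the population size, genome length, mutation rate, truncation selection, and free recombination are all arbitrary modeling assumptions):

```python
import random

def run(sexual, N=100, G=100, mut=0.05, gens=300, seed=1):
    """Toy model: fitness = number of 1-bits; truncation selection keeps the
    fitter half each generation; deleterious mutation flips 1 -> 0 per bit."""
    rng = random.Random(seed)
    pop = [[1] * G for _ in range(N)]  # start from a perfect genome
    for _ in range(gens):
        pop.sort(key=sum, reverse=True)
        survivors = pop[: N // 2]
        children = []
        while len(children) < N:
            a = rng.choice(survivors)
            if sexual:
                b = rng.choice(survivors)
                # free recombination: each bit from either parent, 50/50
                child = [a[i] if rng.random() < 0.5 else b[i] for i in range(G)]
            else:
                child = a[:]  # clonal copy
            child = [bit if rng.random() > mut else 0 for bit in child]
            children.append(child)
        pop = children
    return sum(map(sum, pop)) / N  # mean preserved bits per individual

asexual_bits = run(sexual=False)
sexual_bits = run(sexual=True)
print(asexual_bits, sexual_bits)
```

With these settings the sexual population ends up maintaining noticeably more 1-bits than the asexual one, consistent with the runs reported above: recombination keeps regenerating lightly loaded genomes, so one selective death can carry away several bad mutations at once. Exact numbers depend on the seed and parameters.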
According to http://www.technologyreview.com/view/513781/moores-law-and-the-origin-of-life/?utm_content=bufferc6744&utm_source=buffer&utm_medium=twitter&utm_campaign=Buffer and http://arxiv.org/abs/1304.3381, this rate has been increasing with time, or else the Earth is younger than life.