I agree with that—I mean its main graph has only 5 datapoints.
Still—the general idea (even if poorly executed) is interesting and could be roughly correct—but showing it in the way they intend to will require much more sophisticated computable measures of biological complexity. Machine learning techniques—acting as general compressors—could eventually help with that.
But any measure of biological complexity you could care to generate can increase or decrease over evolutionary time. Looking at modern organisms doesn’t help you.
But any measure of biological complexity you could care to generate can increase or decrease over evolutionary time.
Only at high frequencies. But at a more general level we have strong reasons to believe that the basic form of the argument is correct—that the overall complexity of the terrestrial biome has generally increased over the course of history from the origin of life up to today. Computational models of evolution more than suggest this—it is almost a given.
The problem, of course, is in actually quantifying the biome's complexity, using say KC-type (Kolmogorov complexity) measures, which require sophisticated compression. In fact, the true KC measure is reached only in the limit of perfect compression, which incidentally corresponds to perfect understanding of the data! But with more sophisticated compression we could perhaps approach or estimate that limit.
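To make the compression point concrete, here is a rough sketch of how a general-purpose compressor gives an upper bound on K-complexity. The toy sequences and the helper name `compressed_size` are my own illustrative assumptions, and zlib/lzma are nowhere near the "perfect compression" limit, so treat the numbers as loose upper bounds only.

```python
import lzma
import random
import zlib


def compressed_size(data: bytes, method: str = "lzma") -> int:
    """Return the compressed length in bytes: a crude upper bound on K-complexity.

    `compressed_size` is a hypothetical helper for illustration only.
    """
    if method == "zlib":
        return len(zlib.compress(data, 9))
    if method == "lzma":
        return len(lzma.compress(data, preset=9))
    raise ValueError(f"unknown method: {method}")


# Toy "genomes" (assumed data, not real sequences).
rng = random.Random(0)
repetitive = b"ACGT" * 5000                                 # pure duplication
noisy = bytes(rng.choice(b"ACGT") for _ in range(20000))   # noise-like variation

for name, seq in [("repetitive", repetitive), ("noisy", noisy)]:
    print(name, len(seq), compressed_size(seq, "zlib"), compressed_size(seq, "lzma"))
```

The duplicated sequence shrinks to almost nothing, while the noise-like one stays near its entropy of about 2 bits per base. A better compressor tightens the bound but can never push it below the true value.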
A useful approximate measure would need to consider the full set of DNA in existence across the biome at a certain point in time. Duplications and related transformations are obviously compressible, whereas handling noise-like variation is more of a challenge. One way to handle it is to consider random draws from the implied species-defining distribution. For a species with lots of high-variance, noisy (junk) sequences, the high-variance sections then become highly compressible, because one only has to specify the aggregate distribution (such that draws from that distribution would implement the phenotype). In the limit, a sequence which is completely unused and under no selection pressure wouldn’t contribute anything to the K-complexity.
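One way to sketch the "specify the distribution, not each draw" idea is to score each aligned site by how far its distribution across the population departs from uniform. This is a stand-in proxy I am assuming for illustration (per-site information, log2(4) minus Shannon entropy), not a real K-complexity estimate, and the function names and toy population are hypothetical.

```python
import math
from collections import Counter


def site_information_bits(column: str, alphabet: str = "ACGT") -> float:
    """Bits needed to describe one site's distribution relative to uniform.

    A fully conserved site costs log2(4) = 2 bits; a site that is uniform
    across the population (no apparent selection) costs 0 bits.
    """
    total = len(column)
    counts = Counter(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(len(alphabet)) - entropy


def population_information_bits(sequences: list[str]) -> float:
    """Sum the per-site information over equal-length, aligned sequences."""
    columns = zip(*sequences)
    return sum(site_information_bits("".join(col)) for col in columns)


# Toy population (assumed data): three conserved sites, one unconstrained site.
population = ["ACGA", "ACGC", "ACGG", "ACGT"]
print(population_information_bits(population))
```

Here the three conserved sites contribute 2 bits each and the unconstrained fourth site contributes nothing, which matches the limit case above. With realistic sample sizes the entropy estimate is biased, and this ignores correlations between sites, so it only gestures at the real measure.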