Of course, the description of the genome presented here is an oversimplification. The genome and the cell are much more complicated than a simple Turing machine. However, I want to emphasize three main ideas:
First, saying that the biological machinery is more complex doesn’t mean that you cannot evaluate the complexity of an organism from its genome. It just means that you need to make your model more complex and take into account the necessary parameters, such as RNA interference, epigenetics, and cellular activity.
The technical difficulty of taking one organism’s DNA and transferring it to another (changing the program) really only says something about our current technology, not about any theoretical impossibility of doing so. I believe that these kinds of experiments will be entirely possible in a metazoan cell within the next decade or two. Time will tell.
Second, in a deep sense, every aspect of an organism is encoded in the genome. The organelles and membrane structure, the biochemical activity, the systems biology, the development of a multicellular organism, even the epigenetics itself: it all eventually traces back to specific genomic sequences. These can be regulatory sequences, epigenetics-related sequences, protein-coding genes, RNA genes, etc. Even though you cannot directly associate every cellular function with what you currently know about the genome, that does not mean the information isn’t there. It has to be. The reason is that the “programmer of the genome”, namely evolution, works primarily by changing the genome. The genetic code is the information that lasts across the ages, and it has to encode all the programming necessary to make an organism, including the entire biochemistry and epigenetics of the cell and the organism.
Third, if you want to estimate the true complexity of the genome, you need to appreciate the algorithms it actually encodes. The genome encodes thousands of proteins that form a computational system of their own. For example, it encodes proteins that can sense the environment, compute over a huge number of possible states by forming large networks of protein interactions and signaling, and eventually change the entire biochemistry of the cell, and even the genome itself. So the real complexity of the cell is not hidden in the number of proteins or RNAs it encodes. The true complexity lies in the regulatory system that determines which protein to express when, and how to react to environmental cues.
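To make the idea of a regulatory system that "computes" a little more concrete, here is a minimal sketch of a Boolean network in Python. Everything in it is an assumption made for illustration: the three gene names, the update rules, and the synchronous update scheme are hypothetical and do not describe any real circuit. The point is only that the same fixed rules (the "genome") settle into different behaviors depending on an environmental cue.

```python
# Toy Boolean regulatory network. Gene names and rules are hypothetical,
# chosen only to illustrate the idea; this is not a model of a real circuit.
GENES = ("sensor", "activator", "repressor")


def step(state, signal):
    """Apply one synchronous update of the (assumed) regulatory rules."""
    sensor, activator, repressor = state
    return (
        signal,                    # sensor simply follows the environmental cue
        sensor and not repressor,  # activator needs the sensor, is blocked by the repressor
        activator,                 # repressor is switched on by the activator
    )


def attractor(state, signal):
    """Iterate until a state repeats and return the cycle the network settles into."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, signal)
    return seen[seen.index(state):]


if __name__ == "__main__":
    start = (False, False, False)
    for cue in (False, True):
        cycle = attractor(start, cue)
        print(f"cue={cue}: settles into a cycle of {len(cycle)} state(s)")
        for s in cycle:
            print("   ", dict(zip(GENES, s)))
```

With the cue absent, this toy network stays switched off; with the cue present, the negative feedback loop makes it oscillate. The rules never change, only the input does, which is the sense in which the encoded regulation, rather than the list of parts, carries the complexity.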
In a multicellular setting it is even more intricate: each cell (with the same genome) has a different function, determined by the states of protein networks that responded to complicated signaling and internal states during development. The genome patterns a remarkably complex organism without actually encoding all the developmental stages, just the mechanisms that can compute and respond to the environment.
Another example is the immune system: instead of encoding inside the genome all the possible responses to all the possible pathogens (effectively infinite possibilities), the genome encodes a learning system that can gradually learn to react to different threats when needed. So again, the genome encodes another computational system of its own.
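The difference between storing every response and encoding a system that finds responses can be sketched in a few lines of Python. This is a deliberately crude toy under stated assumptions: the segment libraries, the string "antigen", and the position-matching score are all hypothetical and are not meant to model real immune receptors. It only shows the generate-and-select pattern, where a short description (three small lists and a selection rule) stands in for a much larger set of possible responses.

```python
import random

# Hypothetical segment libraries over a toy alphabet; not real gene segments.
V_SEGMENTS = ["AC", "GT", "TT", "CA"]
D_SEGMENTS = ["G", "A", "C"]
J_SEGMENTS = ["TG", "AA", "CC", "GA"]

# Number of distinct receptors the three small libraries can produce.
repertoire_size = len(V_SEGMENTS) * len(D_SEGMENTS) * len(J_SEGMENTS)


def random_receptor(rng):
    """Assemble one receptor by picking one segment from each library."""
    return rng.choice(V_SEGMENTS) + rng.choice(D_SEGMENTS) + rng.choice(J_SEGMENTS)


def affinity(receptor, antigen):
    """Toy score: number of positions where receptor and antigen agree."""
    return sum(r == a for r, a in zip(receptor, antigen))


def respond(antigen, pool_size=200, seed=0):
    """Generate a random pool of receptors and select the best-matching one."""
    rng = random.Random(seed)
    pool = [random_receptor(rng) for _ in range(pool_size)]
    return max(pool, key=lambda r: affinity(r, antigen))


if __name__ == "__main__":
    print("distinct receptors available:", repertoire_size)
    for antigen in ("ACGTG", "TTCGA", "CAAAA"):
        best = respond(antigen)
        print(antigen, "->", best, "affinity", affinity(best, antigen))
```

Nothing about any particular antigen is written into the code. The answer to each threat is found when the threat shows up, which is the design choice the immune system example is pointing at.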
One final example is the animal brain: the genome does not encode all the information and processing power of the brain. Instead, it encodes the basic developmental and cell-biological processes needed to develop a brain, which can then learn, calculate, and react to changing environments.
So, if you want to infer the complexity of an organism from its DNA, you have to account for the way this DNA compresses its algorithms into systems that can learn, calculate, and create an infinite number of possible outcomes by themselves. By analogy, it is like trying to understand the complexity of a computer program that writes new programs, each of which can respond to the environment and change itself accordingly, with all of these programs interconnected into one big system. To add more complexity, these programs run simultaneously, change one another, and eventually even change the basic code that encoded them in the first place. This is how complex the DNA program is.