There seems to be a culture clash between computer scientists and biologists on this matter. DNA bit length as a back-of-the-envelope complexity estimate for a heavily compressed AGI source seems obvious to me, and, it seems, to Larry Page. Biologists are quick to jump to the particulars of protein synthesis and ignore the question of extra information, because biologists don’t really deal with information-theoretic existence proofs.
It really doesn’t help the matter that Kurzweil threw out his estimate when talking about getting at AGI by specifically emulating the human brain, instead of just trying to develop a general human-equivalent AI using code suitable for the computation platform used. This seems to steer most people into thinking that Kurzweil was thinking of using the DNA as literal source code instead of just a complexity yardstick.
Myers seems to have pretty much gone into his creationist-bashing attack mode on this, so I don’t have very high hopes for any meaningful dialogue from him.
I’m still not sure what people are trying to say with this. Because the Kolmogorov complexity of the human brain given the language of the genetic code and physics is low, therefore X? What is that X precisely?
Because of the additive constant in Kolmogorov complexity, which could be anything from 0 to 3^^^3 or higher, I think it only gives us weak evidence for the amount of code we should expect it to take to code an AI on a computer. It is even weaker evidence for the amount of code it would take to code for it with limited resources. E.g. the laws of physics are simple and little information is taken from the womb, but to create an intelligence from them might require a quantum computer the size of a human head to decompress the compressed code. There might be shortcuts to do it, but they might be of vastly greater complexity.
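To spell out the formal point behind the “additive constant” worry: the invariance theorem says, roughly, that for any two universal description languages U and V there is a constant c_{UV}, depending only on the two languages and not on the object x being described, such that

$$K_U(x) \le K_V(x) + c_{UV} \quad \text{for every } x.$$

Nothing in the theorem bounds c_{UV}: “DNA plus developmental chemistry and physics” and “code on a conventional computer” are very different description languages, so the constant between them could in principle be enormous.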
We tend to ignore additive constants when talking about complexity classes, because human-designed algorithms tend not to have huge additive constants. Although I have come across some in my time, such as this…
We have something like this going on:
discrete DNA code → lots of messy chemistry and biology → human intelligence
and we’re comparing it to :
discrete computer code → computer → human intelligence
Kurzweil is arguing that the size of the DNA code can tell us about the max size of the computer code needed to run an intelligent brain simulation (or a human-level AI), and PZ Myers is basically saying “no, ’cause that chemistry and biology is really really messy”.
Now, I agree that the computer code and the DNA code are very very different (“a huge amount of enzymes interacting with each other in 3D real time” isn’t the kind of thing you easily simulate on a computer), and the additive constant for converting one into the other is likely to be pretty darn big.
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code. The things about intelligence that are the closest to biology (interfacing with the real world, how one neuron functions) are also the kind of things that we can already do quite well with computer programs.
There are some things that are “natural” to code in Prolog but not natural in Fortran. So a short program in Prolog might require a long program in Fortran to do the same thing, and for different programs it might be the other way around. I don’t see any reason to think that it’s easier to encode intelligence in DNA than it is in computer code.
(Now, Kurzweil may be overstating his case when he talks about “compressed” DNA, because to be fair you should compare that to compressed (or compiled) computer code, which corresponds to much more actual source code. I still think the size of the DNA is a very reasonable upper limit, especially when you consider that the DNA was coded by a bloody idiot whose main design pattern is “copy-and-paste”, resulting in the bloated code we know.)
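For concreteness, here is a minimal sketch (in Python) of the back-of-the-envelope arithmetic behind the numbers being thrown around. The genome length and two bits per base are standard figures; the compression factor is purely an assumed value, chosen to land near the ~50 MB figure that usually gets quoted.

```python
# Back-of-the-envelope sketch of the "DNA size as an upper bound" arithmetic.
# Genome length and bits-per-base are standard figures; the compression
# factor is an assumption, not a measurement.

BASE_PAIRS = 3.2e9        # approximate length of the human genome
BITS_PER_BASE = 2         # four nucleotides -> 2 bits each

raw_megabytes = BASE_PAIRS * BITS_PER_BASE / 8 / 1e6
print(f"raw genome:        ~{raw_megabytes:.0f} MB")          # ~800 MB

ASSUMED_COMPRESSION_FACTOR = 16   # assumed, reflecting heavy repetition in the genome
compressed_megabytes = raw_megabytes / ASSUMED_COMPRESSION_FACTOR
print(f"after compression: ~{compressed_megabytes:.0f} MB")   # ~50 MB
```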
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code.
Do you have any reason to expect it to be the same? Do we have any reason at all? I’m not arguing that it will take more than 50 MB of code, I’m arguing that the DNA value is not informative.
The things about intelligence that are the closest to biology (interfacing with the real world, how one neuron functions) are also the kind of things that we can already do quite well with computer programs.
We are far less good at doing the equivalent of changing neural structure or adding new neurons in computer programs (for one, we don’t know why or how neurogenesis works).
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code.
Do you have any reason to expect it to be the same? Do we have any reason at all?
If I know a certain concept X requires 12 seconds of speech to express in English, and I don’t know anything about Swahili beyond the fact that it’s a human language, my first guess will be that concept X requires 12 seconds of speech to express in Swahili.
I would also expect compressed versions of translations of the same book in various languages to be roughly the same size.
So, even with very little information, a first estimate (with a big error margin) would be that it takes as many bits to “encode” intelligence in DNA as it does in computer code.
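A crude way to sanity-check the translation intuition would be something like the sketch below. The sentences are rough renderings of the same line in each language (chosen only for illustration), and at this tiny scale the compressor’s own overhead dominates, so it is merely suggestive; compressing whole parallel books would be the fairer test.

```python
# Crude check: compress roughly-equivalent sentences in several languages
# and compare sizes. Real parallel corpora would make this meaningful;
# these short strings are only an illustration.
import zlib

translations = {
    "English": "All human beings are born free and equal in dignity and rights.",
    "French":  "Tous les êtres humains naissent libres et égaux en dignité et en droits.",
    "Swahili": "Watu wote wamezaliwa huru, hadhi na haki zao ni sawa.",
}

for language, sentence in translations.items():
    compressed = zlib.compress(sentence.encode("utf-8"), 9)
    print(f"{language:8s} raw = {len(sentence.encode('utf-8')):3d} bytes, "
          f"compressed = {len(compressed):3d} bytes")
```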
In addition, the fact that some intelligence-related abilities such as multiplying large numbers are easy to express in computer code, but rare in nature would make me revise that estimate towards “code as more expressive than DNA for some intelligence-related stuff”.
In addition, knowledge about the history of evolution would make me suspect that large chunks of the human genome are not required for intelligence, either because they aren’t expressed, or because they only concern traits that have no impact on our intelligence beyond the fact of keeping us alive. That would also make me revise my estimate downwards for the code size needed for intelligence.
None of those are very strong reasons, but they are reasons nonetheless!
If I know a certain concept X requires 12 seconds of speech to express in English, and I don’t know anything about Swahili beyond the fact that it’s a human language, my first guess will be that concept X requires 12 seconds of speech to express in Swahili.
You’d be very wrong for a lot of technical language, unless they just imported the English words wholesale. For example, “Algorithmic Information Theory” expresses a concept well, but I’m guessing it would be hard to explain in Swahili.
Even given that, you can expect the languages of humans to all have roughly the same length because they are generated by roughly the same hardware and have roughly the same concerns, e.g. things to do with humans.
To give a more realistic translation problem, how long would you expect it to take to express/explain a random English sentence in C code, or vice versa?
Selecting a random English sentence will introduce a bias towards concepts that are easy to express in English.