It seems to me like PZ Myers really doesn’t understand information theory. He’s attacking Kurzweil and calling him a kook, initially due to a relatively straightforward complexity estimate.
And I’m pretty confident that Myers is wrong on this, unless there is another information-rich source of inheritance besides DNA, which Myers knows about but Kurzweil and I do not.
This looks to me like a popular science blogger doing huge PR damage to everything singularity related, and being wrong about it, even if he is later convinced of this point.
I don’t see how to avoid this short of just holding back all claims which seem exceptional, and which some ‘reasonable’ person might fail to understand and see as a sign of cultishness. If we can’t make claims as basic as the design of the brain being in the genome, then we may as well just remain silent.
But then we wouldn’t find out if we’re wrong, and we’re rationalists.
For instance, you can’t measure the number of transistors in an Intel CPU and then announce, “A-ha! We now understand what a small amount of information is actually required to create all those operating systems and computer games and Microsoft Word, and it is much, much smaller than everyone is assuming.”
This analogy made me cringe. Myers is disagreeing with the claim that human DNA completely encodes the structure and functioning of the human brain: the hardware and software, roughly. Looking at the complexity of the hardware and making claims about the complexity of the software, as he does here, is completely irrelevant to his disagreement. It serves only to obscure the actual point under debate, and demonstrates that he has no idea what he’s talking about.
There seems to be a culture clash between computer scientists and biologists over this matter. DNA bit length as a back-of-the-envelope complexity estimate for a heavily compressed AGI source seems obvious to me, and, it seems, to Larry Page. Biologists are quick to jump to the particulars of protein synthesis and ignore the question of extra information, because biologists don’t really deal with information-theoretical existence proofs.
It really doesn’t help the matter that Kurzweil threw out his estimate when talking about getting at AGI by specifically emulating the human brain, instead of just trying to develop a general human-equivalent AI using code suited to the computation platform. This seems to steer most people into thinking that Kurzweil meant to use the DNA as literal source code instead of just as a complexity yardstick.
Myers seems to have pretty much gone into his creationist-bashing attack mode on this, so I don’t have very high hopes for any meaningful dialogue from him.
I’m still not sure what people are trying to say with this. Because the Kolmogorov complexity of the human brain, given the language of the genetic code and physics, is low, therefore X? What is that X, precisely?
Because of Kolmogorov complexity’s additive constant, which could be anything from 0 to 3^^^3 or higher, I think it only gives us weak evidence for the amount of code we should expect it to take to code an AI on a computer. It is even weaker evidence for the amount of code it would take to code for it with limited resources. E.g. the laws of physics are simple and little information is taken from the womb, but to create an intelligence from them might require a quantum computer the size of a human head to decompress the compressed code. There might be shortcuts, but they might be of vastly greater complexity.
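(For readers who want the theorem being invoked here: this is the invariance theorem of Kolmogorov complexity. For any two universal machines U and V there is a constant c_{U,V}, independent of the string x, such that

$$K_U(x) \le K_V(x) + c_{U,V}.$$

The constant is essentially the length of an interpreter for V written for U, and nothing in the theorem bounds its size; that unbounded constant is what the comment above is leaning on.)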
We tend to ignore additive constants when talking about complexity classes, because human-designed algorithms tend not to have huge additive constants. Although I have come across some in my time, such as this…
We have something like this going on:
discrete DNA code → lots of messy chemistry and biology → human intelligence
and we’re comparing it to:
discrete computer code → computer → human intelligence
Kurzweil is arguing that the size of the DNA code can tell us about the max size of the computer code needed to run an intelligent brain simulation (or a human-level AI), and PZ Myers is basically saying “no, ’cause that chemistry and biology is really really messy”.
Now, I agree that the computer code and the DNA code are very very different (“a huge amount of enzymes interacting with each other in 3D real time” isn’t the kind of thing you easily simulate on a computer), and the additive constant for converting one into the other is likely to be pretty darn big.
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code. The things about intelligence that are the closest to biology (interfacing with the real world, how one neuron functions) are also the kind of things that we can already do quite well with computer programs.
There are some things that are “natural” to code in Prolog, but not natural in Fortran. So a short program in Prolog might require a long program in Fortran to do the same thing, and for different programs it might be the other way around. I don’t see any reason to think that it’s easier to encode intelligence in DNA than it is in computer code.
(Now, Kurzweil may be overstating his case when he talks about “compressed” DNA, because to be fair you should compare that to compressed (or compiled) computer code, which translates to much more actual code. I still think the size of the DNA is a very reasonable upper limit, especially when you consider that the DNA was coded by a bloody idiot whose main design pattern is “copy-and-paste”, resulting in the bloated code we know.)
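(For concreteness, here is the back-of-the-envelope arithmetic behind the numbers thrown around in this thread, sketched in Python; the 50 MB “compressed” figure is Kurzweil’s claim, not something computed here:)

```python
# Back-of-the-envelope: raw information content of the human genome.
base_pairs = 3.2e9       # approximate length of the human genome
bits_per_base = 2        # 4 bases (A, C, G, T) -> log2(4) = 2 bits each
raw_bits = base_pairs * bits_per_base
print(f"raw genome: {raw_bits / 8 / 1e6:.0f} MB")  # ~800 MB

# Kurzweil's claim (assumed here, not derived): compressing out the
# redundancy (e.g. long repeated sequences) leaves on the order of 50 MB.
print("claimed compressed size: ~50 MB")
```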
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code.
Do you have any reason to expect it to be the same? Do we have any reason at all? I’m not arguing that it will take more than 50 MB of code; I’m arguing that the DNA value is not informative.
The things about intelligence that are the closest to biology (interfacing with the real world, how one neuron functions) are also the kind of things that we can already do quite well with computer programs.
We are far less good at doing the equivalent of changing neural structure or adding new neurons in computer programs (we don’t know why or how neurogenesis works, for one).
But I also don’t see a reason for intelligence to be easier to express with messy biology and chemistry than with computer code.
Do you have any reason to expect it to be the same? Do we have any reason at all?
If I know a certain concept X requires 12 seconds of speech to express in English, and I don’t know anything about Swahili beyond the fact that it’s a human language, my first guess will be that concept X requires 12 seconds of speech to express in Swahili.
I would also expect compressed versions of translations of the same book into various languages to be roughly the same size.
So, even with very little information, a first estimate (with a big error margin) would be that it takes as many bits to “encode” intelligence in DNA as it does in computer code.
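(A toy way to poke at the book-compression intuition above, sketched in Python with zlib; the two strings are stand-ins for “the same passage in two languages”, and a real test would obviously need whole books:)

```python
import zlib

# Toy stand-ins: the same sentence in English and French, repeated so
# the compressor has something to work with.
english = ("All human beings are born free and equal in dignity and rights. "
           "They are endowed with reason and conscience.") * 20
french = ("Tous les etres humains naissent libres et egaux en dignite et en "
          "droits. Ils sont doues de raison et de conscience.") * 20

for name, text in [("English", english), ("French", french)]:
    packed = zlib.compress(text.encode("utf-8"), 9)
    print(f"{name}: {len(text)} chars -> {len(packed)} bytes compressed")
```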
In addition, the fact that some intelligence-related abilities, such as multiplying large numbers, are easy to express in computer code but rare in nature would make me revise that estimate towards “code is more expressive than DNA for some intelligence-related stuff”.
In addition, knowledge about the history of evolution would make me suspect that large chunks of the human genome are not required for intelligence, either because they aren’t expressed, or because they only concern traits that have no impact on our intelligence beyond the fact of keeping us alive. That would also make me revise my estimate downwards for the code size needed for intelligence.
None of those are very strong reasons, but they are reasons nonetheless!
If I know a certain concept X requires 12 seconds of speech to express in English, and I don’t know anything about Swahili beyond the fact that it’s a human language, my first guess will be that concept X requires 12 seconds of speech to express in Swahili.
You’d be very wrong for a lot of technical language, unless they just imported the English words wholesale. For example, “Algorithmic Information Theory” expresses a concept well, but I’m guessing it would be hard to explain in Swahili.
Even given that, you can expect the languages of humans to all have roughly the same length because they are generated by roughly the same hardware and have roughly the same concerns, e.g. things to do with humans.
To give a more realistic translation problem, how long would you expect it to take to express/explain a random English sentence in C code, or vice versa?
Selecting a random English sentence will introduce a bias towards concepts that are easy to express in English.
And I’m pretty confident that Myers is wrong on this, unless there is another information-rich source of inheritance besides DNA, which Myers knows about but Kurzweil and I do not.
The environment is information-rich, especially the social environment.
Myers makes it quite clear that interactions with the environment are an expected input of information in his understanding.
Do you disagree with information input from the environment?
Yes, I disagree.
If he’s not talking about some stable information that is present in all environments that yield intelligent humans, then what’s important is a kind of information that can be mass generated at low complexity cost.
Even language exposure is relatively low complexity, and the key parts might be inferable from brain processes. And we already know how to offer a socially rich environment, so I don’t think it should add to the complexity costs of this problem.
And I think a reverse engineering of a newborn baby brain would be quite sufficient for Kurzweil’s goal.
In short: we know intelligent brains get reliably generated. We know it’s very complex. The source of that complexity must be something information-rich, stable, and universal. I know of exactly one such source.
Right now I’m reading Myers’ argument as “a big part of human heredity is memetic rather than just genetic, and there is complex interplay between genes and memes, so you’ve got to count the memes as part of the total complexity.”
I say that Kurzweil is trying to create something compatible with human memes in the first place, so we can load them the same way we load children (at worst). And even for the classes of memes that do interact tightly with genes (age-appropriate language exposure), their information content is not all that high.
And I think a reverse engineering of a newborn baby brain would be quite sufficient for Kurzweil’s goal.
While doable, this seems like a very time-consuming project, and potentially morally dubious. How do you know when you have succeeded, and not produced a mildly brain-damaged one because you missed an important detail needed for language learning?
We really don’t want to be running multi-year experiments where humans have to interact with infant machines; that would be ruinously expensive. The quicker you can evaluate the capabilities of the machine, the better.
Well, in Kurzweil’s case, you’d look at the source code and debug it to make sure it’s doing everything it’s supposed to, because he’s not dealing with a meat brain.
I guess my real point is that language learning should not be tacked on to the problem of reverse engineering the brain. If he makes something that is just as capable of learning, that’s a win for him. (Hopefully he also reverse engineers all of human morality.)
You are assuming the program found via the reverse-engineering process is human-understandable… What if it is a strange cellular automaton with odd rules, or an algorithm with parameters where you don’t know why they are what they are?
Language is an important part of learning for humans. Imagine trying to learn chess if no one explained the legal moves. Something without the capability for language isn’t such a big win IMHO.
I think we might have different visions of what this reverse engineering would entail. By my concept, if you don’t understand the function of the program you wrote, you’re not done reverse engineering.
I do think that something capable of learning language would be necessary for a win, but the information content of the language does not count towards the complexity estimate of the thing capable of learning language.
It seems to me like PZ Myers really doesn’t understand information theory. He’s attacking Kurzweil and calling him a kook, initially due to a relatively straightforward complexity estimate.
I see it that way too. The DNA can give us an upper bound on the information needed to create a human brain, but PZ Myers reads that as “Kurzweil is saying we will be able to take a strand of DNA and build a brain from that in the next 10 years!”, and then proceeds to attack that straw man.
This, however:
His timeline is absurd. I’m a developmental neuroscientist; I have a very good idea of the immensity of what we don’t understand about how the brain works. No one with any knowledge of the field is claiming that we’ll understand how the brain works within 10 years. And if we don’t understand all but a fraction of the functionality of the brain, that makes reverse engineering extremely difficult.
… I am quite inclined to trust. I would trust it more if it weren’t followed by wrong statements about information theory (statements that seem wrong to me, at least).
Looking at the comments is depressing. I wish there were some “sane” way for the two communities (readers of PZ Myers and “singularitarians”) to engage without it degenerating into name-calling.
Brian: “We should unite against our common enemy!”
Others: “The Judean People’s Front?”
Brian: “No! The Romans!”
Though there are software solutions for that (takeonit and other stuff that’s been discussed here), it wouldn’t hurt either if the “leaders” (PZ Myers, Kurzweil, etc.) were a bit more responsible and made a genuine effort to acknowledge the other’s points when they are strong, so they could converge, or at least agree to disagree on something narrow.
But nooo, it’s much more fun to get angry, and it gets you more traffic too!
The DNA can give us an upper bound on the information needed to create a human brain [...]
Why do you say this? If humans were designed by human engineers, the ‘blueprints’ would actually be complete blueprints, sufficient unto the task of determining the final organism … but they weren’t. There’s no particular reason to doubt that a significant amount of the final data is encoded in the gestational environment.
I’m not sure what you mean about the “complete blueprints”—I agree that the DNA isn’t a complete blueprint, and that an alien civilization with a different chemistry would (probably) find it impossible to rebuild a human if they were just given its DNA. The gestational environment is essential; I just don’t think it encodes much data on the actual working of the brain.
It seems to me that the interaction between the baby and the gestational environment is relatively simple, at least compared to organ development and differentiation. There are a lot of things (hormones, nutrients) that are essential for development to go right, but 1) I don’t see a lot of information transfer in there (“making the brain work a certain way” as opposed to “making the brain work, period”), and 2) a lot of the information on how that works is probably encoded in the DNA too.
I would say that the important bits that may not be in the DNA (or in mitochondrial DNA) are the DNA interpretation system (transcription, translation).
That’s a strong point, but I think it’s still worth bearing in mind that this subject is P. Z. Myers’ actual research focus: developmental biology. It appears to me that Kurzweil should be getting Myers’ help revising his 50 MB estimate*, not dismissing Myers’ arguments as misinformed.
Yes, Myers made a mistake in responding to a summary secondhand account rather than Kurzweil’s actual position, but Kurzweil is making a mistake if he’s ignoring expert opinion on a subject directly relating to his thesis.
* By the way: 50 MB? That’s smaller than the latest version of gcc! If that’s your complexity estimate, the complexity of the brain could be dominated by the complexity of the gestational environment!
I agree that Kurzweil could have acknowledged P. Z. Myers’ expertise a bit more, especially the “nobody in my field expects a brain simulation in the next ten years” bit.
50 MB—that’s still a hefty amount of code, especially if it’s 50 MB of compiled code and not 50 MB of source code (comparing the size of the source code to the size of the compressed DNA looks fishy to me, but I’m not sure Kurzweil has actually been doing that—he’s just been saying “it doesn’t require trillions of lines of code”).
Is the size of gcc the source code or the compiled version? I didn’t see that info on Wikipedia, and don’t have gcc on this machine.
As I see it, Myers delivered a totally misguided rant. When his mistakes were exposed he failed to apologise. Obviously, there is no such thing as bad publicity.
I’m looking at gcc-4.5.0.tar.gz.
That includes the source code, the binaries, the documentation, the unit tests, changelogs … I’m not surprised it’s pretty big!
I consider it pretty likely that it’s possible to program a human-like intelligence with a compressed source code of less than 50 MB.
However, I’m much less confident that the source code of the first actual human-like intelligence coded by humans (if there is one) will be that size.
There’s no particular reason to doubt that a significant amount of the final data is encoded in the gestational environment.
To the contrary, there is every reason to doubt that. We already know that important pieces of the gestational environment (the genetic code itself, core metabolism, etc.) are encoded in the genome. By contrast, the amount of epigenetic information that we know of is minuscule. It is, of course, likely that we will discover more, but it is very unlikely that we will discover much more. The reason for this skepticism is that we don’t know of any reliable epigenetic means of transmitting generic information from generation to generation. And the epigenetic information inheritance mechanisms that we do understand all require hundreds of times as much genetic information to specify the machinery as compared to the amount of epigenetic information that the machinery can transmit.
To my mind, it is very clear that (on this narrow point) Kurzweil was right and PZ wrong: The Shannon information content of the genome places a tight upper bound on the algorithmic (i.e. Kolmogorov) information content of the embryonic brain. Admittedly, when we do finally construct an AI, it may take it 25 years to get through graduate school, and it may have to read through several hundred Wikipedia equivalents to get there, but I am very confident that specifying the process for generating the structure and interconnect of the embryonic AI brain will take well under 7 billion bits.
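(In symbols, the claim being made; the 2 bits per base and ~3.2 billion base pairs are the usual rough figures, not numbers from the thread:

$$K(\text{embryonic brain}) \;\le\; 2 \times 3.2 \times 10^{9} + c \;\approx\; 6.4 \times 10^{9}\ \text{bits} + c,$$

where the constant c covers the fixed interpretive machinery—chemistry, physics, the decoder—i.e. exactly the term the surrounding comments dispute.)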
To my mind, it is very clear that (on this narrow point) Kurzweil was right and PZ wrong: The Shannon information content of the genome places a tight upper bound on the algorithmic (i.e. Kolmogorov) information content of the embryonic brain.
I think you may have missed my devastating analysis of this issue a couple of years back:
“So, who is right? Does the brain’s design fit into the genome? - or not?
The detailed form of proteins arises from a combination of the nucleotide sequence that specifies them, the cytoplasmic environment in which gene expression takes place, and the laws of physics.
We can safely ignore the contribution of cytoplasmic inheritance—however, the contribution of the laws of physics is harder to discount. At first sight, it may seem simply absurd to argue that the laws of physics contain design information relating to the construction of the human brain. However, there is a well-established mechanism by which physical law may do just that—an idea known as the anthropic principle. This argues that the universe we observe must necessarily permit the emergence of intelligent agents. If that involves coding the design of the brains of intelligent agents into the laws of physics, then: so be it. There are plenty of apparently-arbitrary constants in physics where such information could conceivably be encoded: the fine structure constant, the cosmological constant, Planck’s constant—and so on.
At the moment, it is not even possible to bound the quantity of brain-design information so encoded. When we get machine intelligence, we will have an independent estimate of the complexity of the design required to produce an intelligent agent. Alternatively, when we know what the laws of physics are, we may be able to bound the quantity of information encoded by them. However, today neither option is available to us.”
http://alife.co.uk/essays/how_long_before_superintelligence/
You suggest that the human brain might have a high Kolmogorov complexity, the information for which is encoded, not in the human genome (which contains a mere 7 gigabits of information), but rather in the laws of physics, which contain arbitrarily large amounts of information, encoded in the exact values of physical constants. For example, the first 30 billion decimal digits of the fine structure constant contain 100 gigabits of information, putting the genome to shame.
Do I have that right?
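(The digit-to-bit conversion behind that figure checks out: each decimal digit carries $\log_2 10 \approx 3.32$ bits, so

$$30 \times 10^{9}\ \text{digits} \times 3.32\ \text{bits/digit} \;\approx\; 10^{11}\ \text{bits} \;=\; 100\ \text{gigabits}.)$$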
Well, I will give you points for cleverness, but I’m not buying it. I doubt that it much matters what the constants are, out past the first hundred digits or so. Yes, I realize that the details of how the universe proceeds may be chaotic; it may involve sensitive dependence both on initial conditions and on physical constants. But I don’t think that really matters. Physical constants haven’t changed since the Cambrian, but genomes have. And I think that it is the change in genomes which led to the human brain, the dolphin brain, the parrot brain, and the octopus brain. Alter the fine structure constant in the 2 billionth decimal place, and those brain architectures would still work, and those genomes would still specify development pathways leading to them. Or so I believe.
I doubt that it much matters what the constants are, out past the first hundred digits or so
What makes you think that?
I realize that the details of how the universe proceeds may be chaotic; it may involve sensitive dependence both on initial conditions and on physical constants. But I don’t think that really matters.
...and why not?
Physical constants haven’t changed since the Cambrian, but genomes have. And I think that it is the change in genomes which led to the human brain, the dolphin brain, the parrot brain, and the octopus brain.
Under the hypothesis that physics encodes relevant information, a lot of the required information was there from the beginning. The fact that brains only became manifest after the Cambrian doesn’t mean the propensity for making brains was not there from the beginning. So: that observation doesn’t tell you very much.
Alter the fine structure constant in the 2 billionth decimal place, and those brain architectures would still work, and those genomes would still specify development pathways leading to them. Or so I believe.
Right—but what evidence do you have of that? You are aware of chaos theory, no? Small changes can lead to dramatic changes surprisingly quickly.
Organisms inherit the laws of physics (and indeed the initial conditions of the universe they are in) as well as their genomes. Information passes down the generations both ways. If you want to claim the design information is in one inheritance channel more than the other, it seems to me that you need some evidence relating to that issue. The evidence you have presented so far seems pretty worthless—the delayed emergence of brains seems equally compatible with both of the hypotheses under consideration.
So: do you have any other relevant evidence?
No other rational [ETA: I meant physical and I am dumb] process is known to rely on physical constants to the degree you propose. What you propose is not impossible, but it is highly improbable.
What?!? What makes you think that?
Sensitive dependence on initial conditions is an extremely well-known phenomenon. If you change the laws of physics a little bit, the result of a typical game of billiards will be different. This kind of phenomenon is ubiquitous in nature, from the orbit of planets, to the paths rivers take.
If a butterfly’s wing flap can cause a tornado, I figure a small physical constant jog could easily make the difference between intelligent life emerging, and it not doing so billions of years later.
Sensitive dependence on initial conditions is literally everywhere. Check it out:
http://en.wikipedia.org/wiki/Chaos_theory
Did you miss this bit:
Sensitivity to initial conditions is one thing. Sensitivity to 1 billion SF in a couple of decades?
The universe took about 14 billion years to get this far—and if you look into the math of chaos theory, the changes propagate up very rapidly. There is an ever-expanding avalanche of changes—like an atomic explosion.
For the 750 MB or so of data under discussion, you could easily see the changes at a macroscopic scale rapidly. Atoms in stars bang into each other pretty quickly. I haven’t attempted to calculate it—but probably within a few minutes, I figure.
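(A toy illustration of how fast a perturbed “constant” decorrelates a chaotic system; the logistic map here is my stand-in for real physics, not anything from the discussion:)

```python
# Logistic map x -> r*x*(1-x); at r = 4 it is fully chaotic.
# Perturb the "physical constant" r in the 12th decimal place
# and count how quickly the two trajectories part ways.
r_a, r_b = 4.0, 4.0 - 1e-12
x_a = x_b = 0.3
for step in range(1, 201):
    x_a = r_a * x_a * (1 - x_a)
    x_b = r_b * x_b * (1 - x_b)
    if abs(x_a - x_b) > 0.1:
        print(f"trajectories diverged after {step} steps")  # roughly 40
        break
```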
Would you actually go as far as maintaining that, if a change were to happen tomorrow to the 1,000th decimal place of a physical constant, it would be likely to stop brains from working, or are you just saying that a similar change to a physical constant, if it happened in the past, would have been likely to stop the sequence of events which has caused brains to come into existence?
Option 2. Existing brains might be OK—but I think newly-constructed ones would have to not work properly when they matured. So, option 2 would not be enough on its own.
Correction: That last line should be “which has CAUSED brains to come into existence?”
You can edit comments after submitting them—when logged in, you should see an edit button.
By the way, I’m reading your part 15, section 2 now.
Hi Silas!
Thanks for telling me that. I was logged in and didn’t see it, but I will look more carefully next time.
I’m actually proof-reading a document now which improves the “action selection process”. I was never happy with what I described and it was a kind of placeholder. The new stuff will be very short though.
Anyway, what do you do? I have the idea it is something computer related, maybe?
Apologies for the comment I inadvertently placed here. I thought I was answering a PM and did not mean to add personal exchanges. I find computers annoying sometimes, and will happily stop using them when something else that is Turing equivalent becomes available.
I figure a small physical constant jog could easily make the difference between intelligent life emerging, and it not doing so billions of years later.
First, that is VERY different from the design information being in the constant but not in the genome. (You could more validly say that the genome is what it is because the constant is precisely what it is.)
Second, the billiard ball example is invalid. It doesn’t matter exactly where the billiard balls are if you’re getting hustled. Neurons are not typically sensitive to the precise positions of their atoms. Information processing relies on the ability to largely overlook noise.
What physical process would cease to function if you increased c by a billionth of a percent? Or one of the other Planck units? Processes involved in the functioning of both neurons and transistors don’t count, because then there’s no difference to account for.
Nitpick: c is a dimensioned quantity, so changes in it aren’t necessarily meaningful.
*Blink.*
*Reads Wikipedia.*
Would I be correct in thinking that one would need to modify the relationship of c to some other constant (the physics equations that represent some physical law?) for the change to be meaningful? I may be failing to understand the idea of dimension.
Thank you for the excuse to learn more math, by the way.
Yes, you would be correct, at least in terms of our current knowledge.
In fact, it’s not that unusual to choose units so that you can set c = 1 (i.e., to make it unitless). This way units of time and units of distance are of the same kind, velocities are dimensionless geometric quantities, etc.
You might want to think of “c” not so much as a speed as a conversion factor between distance type units and time type units.
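(For instance, in natural units one sets

$$c = 299{,}792{,}458\ \tfrac{\text{m}}{\text{s}} = 1,$$

so one second and 299,792,458 metres are the same quantity, and the velocity of any massive object becomes a pure number with magnitude below 1.)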
That isn’t really the idea. It would have to interfere with the development of a baby enough for its brain not to work out properly as an adult, though—I figure.
Artificial wombs
Don’t currently exist. I’m not sure that’s a strong argument.
Myers has always had a tendency to attack other people’s arguments like enemy soldiers. A good example is his take on evolutionary psychology, which he hates so much it is actually funny.
And then look at the source: Satoshi Kanazawa, the Fenimore Cooper of Sociobiology, the professional fantasist of Psychology Today. He’s like the poster boy for the stupidity and groundlessness of freakishly fact-free evolutionary psychology. Just ignore anything with Kanazawa’s name on it.
He also claims to have desecrated a consecrated host (the sacramental wafers Catholics consider to be the body of Jesus). That will show those evil theists how a good, rational person behaves!
I’m pretty confident that Myers is wrong on this, unless there is another information-rich source of inheritance besides DNA, which Myers knows about but Kurzweil and I do not.
Myers’ thesis is that you are not going to figure out by brute-force physical simulation how the genome gives rise to the organism, knowing just the genomic sequence. On every scale—molecule, cell, tissue, organism—there are very complicated boundary conditions at work. You have to do experimental biology, observe those boundary conditions, and figure out what role they play. I predict he would be a lot more sympathetic if Kurzweil was talking about AIs figuring out the brain by doing experimental biology, rather than just saying genomic sequence + laws of physics will get us there.
Myers’ thesis is that you are not going to figure out by brute-force physical simulation how the genome gives rise to the organism, knowing just the genomic sequence.
And he is quite possibly correct. However, that has nothing at all to do with what Kurzweil said.
I predict he would be a lot more sympathetic if Kurzweil was talking about AIs figuring out the brain by doing experimental biology, rather than just saying genomic sequence + laws of physics will get us there.
I predict he would be more sympathetic if he just made the effort to figure out what Kurzweil said. But, of course, we all know there is no chance of that, so “conjecture” might be a better word than “predict”.
Myers doesn’t have an argument against Kurzweil’s estimate of the brain’s complexity. But his skepticism about Kurzweil’s timescale can be expressed in terms of the difficulty of searching large spaces. Let’s say it does take a million lines of code to simulate the brain. Where is the argument that we can produce the right million lines of code within twenty years? The space of million-line programs is very large.
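(To make “very large” concrete: even under the wildly conservative assumption that each line can only be one of 1,000 possibilities,

$$1000^{10^{6}} = 10^{3{,}000{,}000}$$

candidate million-line programs exist, so blind search is out, and the argument has to be about how well-guided the search is.)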
I agree, both regarding timescale, and regarding reason for timescale difficulties.
As I understand Kurzweil, he is saying that we will build the AI, not by finding the program for development and simulating it, but rather by scanning the result of the development and duplicating it in a different medium. The only relevance of that hypothetical million-line program is that it effectively puts a bound on the scanning and manufacturing tolerances that we need to achieve. Well, while it is probably true in general that we don’t need to get the wiring exactly right on all of the trillions of neurons, there may well be some where the exact right embryonic wiring is crucial to success. And, since we don’t yet have or understand that million-line program that somehow gets the wiring right reliably, we probably won’t get them right ourselves. At least not at first.
It feels a little funny to find myself making here an argument right out of Bill Dembski’s playbook. No free lunch! Needle in a haystack. Only way to search that space is by exhaustion. Well, we shall see what we shall see.
I agree, but at the same time, I wish biologists would learn more information theory, since their focus should be identifying the information flows going on, as this is what will lead us to a comprehensible model of human development and functionality.
(I freely admit I don’t have years in the trenches, so this may be a naive view, but if my experience with any other scientific turf war is any guide, this is important advice.)
How would you address this?
http://scienceblogs.com/pharyngula/2010/08/kurzweil_still_doesnt_understa.php
This was cited to me in a blog discussion as “schoolboy biology EY gets wrong” (he said something similar, apparently).
Personal libraries.