Honest question—is there some specific technical sense in which you are using “complex”? Colloquially, complex just means a thing consisting of many parts. Any neural network is “a thing consisting of many parts”, and I can generally add arbitrarily many “parts” by changing the number of layers or neurons-per-layer or whatever at initialization time.
I don’t think this is what you mean, though. You mean something like architectural complexity, though I think the word “architectural” is a weasel word here that lets you avoid explaining what exactly is missing from, e.g., AlphaGo Zero. I think by “complex” you mean something like “a thing consisting of many distinct sub-things, with the critical functional details spread across many levels of components and their combinations”. Or perhaps “the design instructions for the thing cannot be efficiently compressed”. This is the sense in which the brain is more “complex” than the kidneys.
(Although, the design instructions for the brain can be efficiently compressed, and indeed brains are made from surprisingly simple instructions. A developed, adult brain can’t be efficiently compressed, true, but that’s not a fair comparison. A blank-slate initialized AlphaGo network, and that same network after training on ten million games of Go, are not the same artifact.)
Other words aside from “complex” and “architecture” that I think you could afford to taboo for the sake of clarity are “simple” and “general”. Is the idea of a neural network “simple”? Is a convnet “general”? Is MCTS a “simple, general” algorithm or a “complex, narrow” one? These are bad questions, because all those words must be defined relative to some standard that is not provided. What problem are you trying to solve, and which class of techniques do you regard as relevant for comparison? A convnet is definitely “complex and narrow” compared to linear regression, as a mathematical technique. AlphaGo Zero is highly “complex and narrow” relative to a vanilla convnet.
If your answer to “how complex and how specific a technique do you think we’re missing?” is always “more complex and more specific than whatever Deepmind just implemented”, then we should definitely stop using those words.
Perhaps “size of compiled program” would be one way to make a crude complexity estimate, though I’d like to be able to define this metric more carefully.
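One crude way to operationalize this is compressed size: an incompressible artifact needs a long description, a highly regular one doesn’t. A toy sketch (gzip here is a stand-in, not a real Kolmogorov bound, and the byte counts are illustrative):

```python
import gzip
import os

def crude_complexity(data: bytes) -> int:
    """Compressed size in bytes, as a rough proxy for description length."""
    return len(gzip.compress(data))

# A highly regular "program" compresses to almost nothing...
regular = b"abcd" * 10_000            # 40,000 bytes of repeated structure
# ...while incompressible noise barely compresses at all.
noise = os.urandom(40_000)            # 40,000 bytes of randomness

print(crude_complexity(regular))      # tiny: the structure is compressible
print(crude_complexity(noise))        # near 40,000: nothing to exploit
```

By this metric, the “blank-slate vs. trained AlphaGo network” distinction above is exactly the point: the trained weights are far less compressible than the code that produced them.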
In any case, I don’t think the concept of software complexity is meaningless or especially nebulous. A program with a great many different bespoke modules, which all interact in myriad ways, and are in turn full of details and special-cases and so on, is complex. A program that’s just a basic fundamental core algorithm with a bit of implementation detail is simple.
I do agree that the term “complexity” is often used in unhelpful ways; a common example is the claim that the brain must be astronomically complex purely on the basis of it having so many trillions of connections. Well, a bucket of water has about 6x10^26 hydrogen bonds, but who cares? This is clearly not a remotely useful model of complexity.
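For what it’s worth, that 6x10^26 figure checks out as a back-of-the-envelope, assuming a ~10 litre bucket and roughly two hydrogen bonds donated per molecule (both assumptions mine, for illustration):

```python
AVOGADRO = 6.022e23
bucket_grams = 10_000        # assume a ~10 litre bucket of water
molar_mass_h2o = 18.0        # g/mol
bonds_per_molecule = 2       # each molecule donates ~2 hydrogen bonds

molecules = bucket_grams / molar_mass_h2o * AVOGADRO
hydrogen_bonds = molecules * bonds_per_molecule
print(f"{hydrogen_bonds:.1e}")   # on the order of 6e26
```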
I do think learned complexity makes the problem of defining complexity in general harder, since the training data can’t count for nothing. Otherwise, you could claim the interpreter is the program, and the program you feed into it is really the training data. So clearly, the simpler and more readily available the training data, the less complexity it adds. And the cheapest, simplest training data of all would be that generated from self-play.
>Although, the design instructions for the brain can be efficiently compressed, and indeed brains are made from surprisingly simple instructions.
Can you elaborate on this? If this is based on the size of the functional genome, can you assure me that the prenatal environment, or simply biochemistry in general, offers no significant additional capability here?
I’m reminded of gwern’s hypothetical interpreter that takes a list of integers and returns corresponding frames of Pirates of the Caribbean (for which I can’t now find a cite… I don’t think I imagined this?). Clearly the possibility of such an interpreter does not demonstrate that the numbers 0 through 204480 are generically all that’s needed to encode Pirates of the Caribbean in full.
I think that’s a good way of framing it. Imagine it’s the far future, long after AI is a completely solved problem. Just for fun, somebody writes the smallest possible fully general seed AI in binary code. How big is that program? I’m going to guess it’s not bigger than 1 GB. The human genome is ~770 MB. Yes, it runs on “chemistry”, but the laws of physics underpinning chemistry actually don’t take that many bytes to specify. Certainly not hundreds of megabytes.
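The ~770 MB figure is just the information content of the base-pair sequence, assuming ~3.1 billion base pairs (four nucleotides, so 2 bits each):

```python
base_pairs = 3.1e9        # approximate human genome length
bits_per_base = 2         # 4 possible nucleotides -> 2 bits each
megabytes = base_pairs * bits_per_base / 8 / 1e6
print(round(megabytes))   # ~775 MB, matching the figure above
```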
Maybe a clearer question would be, how many bytes do you need to beam to aliens, in order for them to grow a human? The details of the structure of the embryonic cell, the uterus, the umbilical cord, the mother’s body, etc., are mostly already encoded in the genome, because a genome contains the instructions for copying itself via reproduction. Maybe you end up sending a few hundred more megabytes of instructions as metadata for unpacking and running the genome, but not more than that.
Still, though, genomes are bloated. I’ll bet you can build an intelligence on much less than 770 MB. 98.8% of the genome definitely has nothing to do with the secret sauce of having a powerful general intelligence. We know this because we share that much of our genome with chimps. Yes, you need a body to have a brain, so there’s a boring sense in which you need the whole genome to build a brain, but this argument doesn’t apply to AIs, which don’t need to rely on ancient legacy biology.
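Taking the shared-with-chimps fraction at face value, the human-specific remainder is strikingly small:

```python
genome_mb = 770               # total genome, from the estimate above
shared_with_chimps = 0.988    # fraction shared, per the claim above

human_specific_mb = genome_mb * (1 - shared_with_chimps)
print(round(human_specific_mb, 1))   # roughly 9 MB of human-specific genome
```

That is, whatever distinguishes human-level general intelligence from chimp-level is specified in something on the order of single-digit megabytes, on top of shared machinery.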