Perhaps “size of compiled program” would be one way to make a crude complexity estimate. But I definitely would like to be able to better define this metric.
In any case, I don’t think the concept of software complexity is meaningless or especially nebulous. A program with a great many different bespoke modules, which all interact in myriad ways, and are in turn full of details and special cases and so on, is complex. A program that’s just a basic fundamental core algorithm with a bit of implementation detail is simple.
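As a rough illustration of the “size of compiled program” idea, here is a minimal Python sketch that uses zlib-compressed source length as a stand-in. This is an assumption made for illustration, not a proper metric: compressed size only gives an upper bound on description length, and the two toy "programs" (`core_algorithm` and `bespoke_modules`) are made up.

```python
import zlib


def crude_complexity(source: str) -> int:
    """Return the zlib-compressed size of `source` in bytes."""
    return len(zlib.compress(source.encode("utf-8"), 9))


N_LINES = 200

# Hypothetical "simple" program: the same core rule applied over and over.
core_algorithm = "\n".join("state = update(state)" for _ in range(N_LINES))

# Hypothetical "complex" program: every line is its own special case with
# its own arbitrary-looking constants (generated deterministically here).
bespoke_modules = "\n".join(
    f"if x == {(i * 7919) % 10007}: return {(i * i * 31 + 17) % 9973}"
    for i in range(N_LINES)
)

simple_size = crude_complexity(core_algorithm)
bespoke_size = crude_complexity(bespoke_modules)
print(simple_size, bespoke_size)
```

On this crude measure the special-case-heavy source comes out larger than the repetitive one, even though both have the same number of lines, which matches the intuition that bespoke detail is what makes a program complex.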
I do agree that the term “complexity” is often used in unhelpful ways; a common example is the claim that the brain must be astronomically complex purely on the basis of it having so many trillions of connections. Well, a bucket of water has about 6x10^26 hydrogen bonds, but who cares? This is clearly not a remotely useful model of complexity.
I do think learned complexity makes the problem of defining complexity in general harder, since the training data can’t count for nothing. Otherwise, you could claim the interpreter is the program, and the program you feed into it is really the training data. So clearly, the simpler and more readily available the training data, the less complexity it adds. And the cheapest, simplest training data of all would be that generated from self-play.
>Although, the design instructions for the brain can be efficiently compressed, and indeed brains are made from surprisingly simple instructions.
Can you elaborate on this? If this is based on the size of the functional genome, can you assure me that the prenatal environment, or simply biochemistry in general, offers no significant additional capability here?
I’m reminded of gwern’s hypothetical interpreter that takes a list of integers and returns corresponding frames of Pirates of the Caribbean (for which I can’t now find a cite… I don’t think I imagined this?). Clearly the possibility of such an interpreter does not demonstrate that the numbers 0 through 204480 are generically all that’s needed to encode Pirates of the Caribbean in full.
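The point can be made concrete with a toy version of such an interpreter. This is a sketch under stated assumptions: `FRAMES` is placeholder data standing in for real video frames, and `interpret` is a hypothetical name, not anything from the original example.

```python
# Placeholder "frames"; in the thought experiment these would be the full
# image data for every frame of the film, baked into the interpreter.
FRAMES = [b"frame-0-pixels", b"frame-1-pixels", b"frame-2-pixels"]


def interpret(indices):
    """Hypothetical interpreter: each integer just selects a stored frame."""
    return [FRAMES[i] for i in indices]


# The "program" is nothing but the integers 0..N-1.
program = list(range(len(FRAMES)))
movie = interpret(program)

# Size of the input versus size of what the interpreter already contains.
input_bytes = len(bytes(program))
interpreter_payload_bytes = sum(len(frame) for frame in FRAMES)
print(input_bytes, interpreter_payload_bytes)
```

The input is tiny only because the interpreter carries the whole payload, so the fair measure is input plus interpreter, not input alone.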
I think that’s a good way of framing it. Imagine it’s the far future, long after AI is a completely solved problem. Just for fun, somebody writes the smallest possible fully general seed AI in binary code. How big is that program? I’m going to guess it’s not bigger than 1 GB. The human genome is ~770 MB. Yes, it runs on “chemistry”, but the laws of physics underpinning chemistry actually don’t take that many bytes to specify. Certainly not hundreds of megabytes.
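A figure in that range falls out of simple arithmetic: four possible bases means 2 bits per base pair. A minimal sketch, where the base-pair count of about 3.1 billion is a round number assumed here and not something stated above:

```python
BITS_PER_BASE = 2  # four bases (A, C, G, T) fit in 2 bits
BITS_PER_BYTE = 8


def genome_bytes(base_pairs: int) -> int:
    """Uncompressed size in bytes at 2 bits per base pair."""
    return base_pairs * BITS_PER_BASE // BITS_PER_BYTE


# Assumed round figure for the human genome length.
ASSUMED_BASE_PAIRS = 3_100_000_000

size_mb = genome_bytes(ASSUMED_BASE_PAIRS) / 1_000_000
print(f"{size_mb:.0f} MB")
```

With that assumed length the result is 775 MB, in the same neighborhood as the ~770 MB quoted; the exact number depends on which genome length you plug in.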
Maybe a clearer question would be, how many bytes do you need to beam to aliens, in order for them to grow a human? The details of the structure of the embryonic cell, the uterus, the umbilical cord, the mother’s body, etc., are mostly already encoded in the genome, because a genome contains the instructions for copying itself via reproduction. Maybe you end up sending a few hundred more megabytes of instructions as metadata for unpacking and running the genome, but not more than that.
Still, though, genomes are bloated. I’ll bet you can build an intelligence on much less than 770 MB. 98.8% of the genome definitely has nothing to do with the secret sauce of having a powerful general intelligence. We know this because we share that much of our genome with chimps. Yes, you need a body to have a brain, so there’s a boring sense in which you need the whole genome to build a brain, but this argument doesn’t apply to AIs, which don’t need to rely on ancient legacy biology.
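Taking those figures at face value (770 MB, 98.8% shared with chimps), the non-shared remainder is small. This is a back-of-the-envelope sketch that treats "percent of genome shared" as if it mapped directly onto bytes, which is a simplifying assumption:

```python
GENOME_MB = 770.0       # figure used above
SHARED_PERCENT = 98.8   # share of the genome in common with chimps, as above

# Portion of the genome that is not shared, expressed in megabytes.
non_shared_mb = round(GENOME_MB * (100.0 - SHARED_PERCENT) / 100.0, 2)
print(non_shared_mb)
```

That comes to roughly 9 MB, which is the crude upper bound this argument puts on the human-specific part.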