If you actually look at the genome, we’ve got about 30,000 genes in here. Most of our 750 megabytes of DNA is repetitive and almost certainly junk, as best we understand it.
This is false. Just because we do not know what role a lot of DNA performs does not mean it is ‘almost certainly junk’. There is far more DNA that is critical than just the 30,000 gene coding regions. You also have: genetic switches, regulation of gene expression, transcription factor binding sites, operators, enhancers, splice sites, DNA packaging sites, etc. Even in cases where the DNA isn’t currently ‘in use’ that DNA may be critical to the ongoing stability of our genome over multiple generations or have other unknown functions.
Your objections are correct, but Eliezer’s statement is still true. The elements you list, as far as I know, take up even less space than the coding regions. (If a section of DNA is serving a useful purpose, but would be just as useful if it was replaced with a random sequence of the same length, I think it’s fair to call it junk.)
Comparison with the mouse genome shows at least 5% of the human genome is under selective pressure, whereas only something like 2% has a purpose that we’ve discovered. But at the same time, there’s a lot that we’re pretty sure really is junk.
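To put those percentages in rough perspective, here is a back-of-the-envelope sketch (the ~3.1 billion base pair haploid genome size and 2 bits per base are my assumptions, not figures from this thread):

```python
# Back-of-the-envelope arithmetic behind the ~750 MB figure and the 5% / 2% fractions.
# Assumed figures: haploid human genome ~3.1e9 base pairs, 2 bits per base (A/C/G/T).
genome_bp = 3.1e9
bits_per_base = 2

total_mb = genome_bp * bits_per_base / 8 / 1e6
print(f"raw genome size: ~{total_mb:.0f} MB")        # ~775 MB, roughly the quoted 750 MB

conserved_mb = 0.05 * total_mb   # ~5% under selective pressure (mouse comparison)
annotated_mb = 0.02 * total_mb   # ~2% with a function we've actually identified
print(f"under selection: ~{conserved_mb:.0f} MB")    # ~39 MB
print(f"known function:  ~{annotated_mb:.0f} MB")    # ~16 MB
```

Even the conserved fraction comes out to a few tens of megabytes, which is the scale the argument above is trading on.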
If a section of DNA is serving a useful purpose, but would be just as useful if it was replaced with a random sequence of the same length, I think it’s fair to call it junk.
Unless this is a standard definition for describing DNA, I do not agree that such DNA is ‘junk’. If the DNA serves a purpose it is not junk. There was a time when it was believed (as many still do) that the nucleus was mostly a disorganized package of DNA and associated ‘stuff’. However, it is becoming increasingly clear that it is highly structured and that this structure is critical for proper cell regulation, including epigenetics.
If it can be shown that outright removal of most of our DNA does not have adverse effects, I would agree with the junk description. However, I am not aware that this has been shown in humans (or at least in human cell lines).
I think the term “junk” has fallen out of favour. Fair enough, let’s taboo that word.
If a section of DNA is serving a useful purpose, but would be just as useful if it was replaced with a random sequence of the same length, it contains no useful information—or at least, no more than it takes to say “a megabase of arbitrary DNA goes here”. The context is roughly “how much information does it take to express a brain?” It’s true that we can’t completely ignore those regions unless we’re confident that they could be completely removed, but they only add O(1) complexity instead of O(n).
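To make the O(1) point concrete, here is a toy sketch (purely illustrative, not a real genomics tool; the function names are mine) of a description that stores functionally arbitrary filler as a placeholder rather than as literal sequence:

```python
import random

# Toy illustration: a minimal "description" of a region that treats
# functionally-arbitrary filler as a placeholder. The filler contributes O(1)
# to the description (a length and a seed), while bases whose exact identity
# matters contribute O(n).

def describe(meaningful: str, filler_length: int) -> dict:
    """Record the bases that matter literally, plus 'any N bases will do here'."""
    return {"literal": meaningful, "filler": filler_length}

def realize(description: dict, seed: int = 0) -> str:
    """Expand a description into a concrete sequence; since the filler is
    arbitrary, a seeded random choice is as good as any other."""
    rng = random.Random(seed)
    filler = "".join(rng.choice("ACGT") for _ in range(description["filler"]))
    return description["literal"] + filler

desc = describe("ATGGCCATTGTAATGGGCCGC", filler_length=1_000_000)
sequence = realize(desc)
print(len(sequence))   # 1000021 -- a megabase-plus of concrete sequence...
print(len(str(desc)))  # ...recovered from a description of well under 100 characters
```

The realized sequence is over a megabase long, but the description only needs the bases whose identity matters plus “a megabase of arbitrary DNA goes here”, which is the sense in which such regions add constant rather than linear description length.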
In the context of “what is the minimal amount of information it takes to build a human brain,” I can agree that there is some amount of compressibility in our genome. However, our genome is a lot like spaghetti code where it is very hard to tell what individual bits do and what long range effects a change may have.
Do we know how much of the human genome can definitely be replaced with a random sequence without problem?
In addition, do we know how much information is contained in the structure of a cell? You can’t just put the DNA of our genome in water and expect to get a brain. Our DNA resides in an enormously complex sea of nano machines and structures. You need some combination of both to get a brain.
Honestly, I think the important takeaway is that there are probably a number of deep or high-level insights that we need to figure out. Whether it’s 75 MB, 750 MB, or a petabyte doesn’t really matter if most of that information just describes machine parts or functions (e.g., a screw, a bolt, a wheel, etc.). Simple components often take up a lot of information. Frankly, I think 1 MB containing 1000 deep insights at maximum compression would be far more difficult to comprehend than a petabyte containing loads of parts descriptions and only 10 deep insights.