Actually, we can guess that a piece of DNA is nonfunctional if it seems to have undergone neutral evolution (roughly, accumulation of functionally equivalent mutations) at a rate which implies that it was not subject to any noticeable purifying selection over evolutionary time. Leaving aside transposons, repetition, and so on, that’s a main part of how we know that large amounts of junk DNA really are junk.
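To make that rate argument concrete, here is a minimal sketch in Python of the comparison it implies, with all numbers invented: measure a candidate region’s divergence against a nearby proxy for the neutral rate (for example, ancestral repeats or fourfold-degenerate sites) and ask whether it is evolving detectably slower.

```python
import math

def substitution_fraction(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions that differ (indels ignored)."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def constraint_z(candidate_frac: float, neutral_frac: float, n_sites: int) -> float:
    """Negative z means the candidate diverges more slowly than the neutral
    proxy, which is the signature of purifying selection. This is a crude
    normal approximation to a binomial; real pipelines model rates properly."""
    se = math.sqrt(neutral_frac * (1 - neutral_frac) / n_sites)
    return (candidate_frac - neutral_frac) / se

# Invented numbers: 1000 aligned sites, 8% diverged vs. a 12% neutral baseline.
print(round(constraint_z(0.08, 0.12, 1000), 1))  # -3.9: detectably slower
```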
There are pieces of DNA that preserve function but undergo neutral evolution. A recent Nature article found a non-protein-coding piece of DNA that is necessary for development (by being transcribed into RNA), that had undergone close-to-neutral evolution from zebrafish to human, yet maintained functional conservation. That is, taking the human transcript and inserting it into zebrafish spares it from death, indicating that (almost) completely different DNA performs the same function, and that a simple test for departures from neutral evolution probably can’t detect it.
I’m having trouble working out the experimental conditions here. I take it they replaced a sequence of zebrafish DNA with its human equivalent, which seemed to have been undergoing nearly neutral selection, and didn’t observe developmental defects. But what was the condition where they did observe defects? If they just removed that section of DNA, that could suggest that some sequence is needed there but its contents are irrelevant. If they replaced it with a completely different section of DNA, that would be a lot more surprising.
You are correct—given the information above it is possible (though unlikely) that the DNA was just there as a spacer between two other things and its content was irrelevant. However, the study controlled for this—they also mutated the zebrafish DNA in specific places and were able to induce the same defects as the deletion did.
What’s happening here is that the DNA is transcribed into non-protein-coding RNA. This RNA’s function and behavior will be determined by, but impossible to predict from, its sequence—you’re dealing not only with the physical process of molecular folding, which is intractable, but with its interactions with everything else in the cell, which is intractability squared. So there is content there, but it’s unreadable to us and thus appears unconstrained. If we had a very large quantum computer we could perhaps find the 3D structure “encoded” by it and its interaction partners, and would see the conservation of this 3D structure from fish to human.
That’s interesting. I guess my next question is, how confident are we that this sequence has been undergoing close-to-neutral selection?
I ask because if it has been undergoing close-to-neutral selection, that implies that almost all possible mutations in that region are fitness-neutral. (Which is why my thoughts turned to “something is necessary, but it doesn’t matter what”. When you call that unlikely, is that because there’s no known mechanism for it, or you just don’t think there was sufficient evidence for the hypothesis, or something else?) But… according to this study they’re not, which leaves me very confused. This doesn’t even feel like I just don’t know enough, it feels like something I think I know is wrong.
if it has been undergoing close-to-neutral selection, that implies that almost all possible mutations in that region are fitness-neutral.

There is no truly “neutral” evolution, as all DNA sequences are subject to several constraints, such as maintaining GC content and preventing spurious promoters from appearing. There is also large variability in mutation rate along different DNA regions. Together, this produces high variance in the “neutral” mutation rate, and because the genome is so large, it is (probably) impossible to detect even regions evolving at a quarter of the neutral rate. I think this is the case here.
This extends what zslastsman wrote regarding structure.
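A minimal simulation of that detection problem, with made-up rate parameters: when the local “neutral” rate varies a lot from window to window, a region evolving at a quarter of the average rate does not stand out.

```python
import random

random.seed(0)
MEAN_RATE = 0.10   # expected substitutions per site; invented value
WINDOW = 1000      # sites per window

def window_substitutions(rate_scale: float) -> int:
    # Local neutral rate varies: gamma-distributed multiplier with mean 1.
    local_rate = MEAN_RATE * random.gammavariate(4.0, 0.25) * rate_scale
    return sum(random.random() < local_rate for _ in range(WINDOW))

neutral = [window_substitutions(1.0) for _ in range(5_000)]
constrained = window_substitutions(0.25)   # a quarter of the neutral rate

frac = sum(n <= constrained for n in neutral) / len(neutral)
print(f"constrained window: {constrained} substitutions")
print(f"{frac:.1%} of genuinely neutral windows look at least as conserved")
# Even a percent or two of false positives, across a genome of millions of
# windows, swamps the signal.
```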
We can’t be totally confident. I’d guess that if you did a sensitive test of fitness (you’d need a big fish tank and a lot of time) you’d find the human sequence didn’t rescue the deletion perfectly. They’ve done this recently in C. elegans, looking at long-term survival at the population level, and they find that a huge number of apparently harmless mutations are very detrimental at the population level.
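A toy calculation of why competition experiments see what individual assays miss; the 0.5% fitness deficit below is an invented number:

```python
def mutant_frequency(s: float, generations: int, p0: float = 0.5) -> float:
    """Deterministic selection: mutant odds shrink by (1 - s) per generation."""
    p = p0
    for _ in range(generations):
        p = p * (1 - s) / (p * (1 - s) + (1 - p))
    return p

for gens in (10, 100, 500):
    print(gens, round(mutant_frequency(0.005, gens), 3))
# 10 -> 0.487, 100 -> 0.377, 500 -> 0.076: invisible per individual,
# unmistakable after enough generations of competition.
```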
The reason I’d say it was unlikely is just that spacers of that kind aren’t common (I don’t know of any that aren’t inside genes). If there were two sequences on either side that needed to bend around to each other to make contact, it could be plausible, but since they selected by epigenetic marks, rather than sequence conservation, it would be odd and novel if they’d managed to perfectly delete such a spacer (actually, that would be very interesting in itself).
I think you are being confused by two things:

1) The mutation I said they made was deliberately targeted to a splice site, and splice sites are constrained (though you can’t use them to identify functional sequences, because they are very small and so occur randomly outside functional sequence all the time).

2) You are thinking too simplistically about sequence constraint. RNA folds by wrapping up and forming helices with itself, so the effect of a mutation depends on the rest of the sequence. Each mutation releases constraint on some base pairs and introduces it at others. So as this sequence wanders through sequence space, it does so in a way that preserves relationships, not absolute sequence (see the sketch after this comment). From its current position in sequence space, many mutations would be detrimental, but those residues may get the chance to mutate later on, once other residues have relieved them. This applies to proteins as well, by the way: proteins are far more conserved in 3D shape than in linear sequence.
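Here is a toy sketch of point 2. The structure check is deliberately crude (it verifies a given structure rather than folding anything), and both sequences are invented, but it shows two sequences that differ at every single position while satisfying the same hairpin:

```python
# Legal RNA pairs, including G-U wobble.
PAIRS = {("A","U"), ("U","A"), ("G","C"), ("C","G"), ("G","U"), ("U","G")}

def fits_structure(seq: str, dotbracket: str) -> bool:
    """True if every '('...')' pair in the structure is a legal base pair."""
    stack = []
    for base, sym in zip(seq, dotbracket):
        if sym == "(":
            stack.append(base)
        elif sym == ")" and (stack.pop(), base) not in PAIRS:
            return False
    return not stack

structure = "((((....))))"
seq_a = "GGCAUUCGUGCC"   # invented "fish" version
seq_b = "AUGCGAAAGCAU"   # invented "human" version; every base differs
print(fits_structure(seq_a, structure), fits_structure(seq_b, structure))
# True True: same structure, disjoint sequences.
```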
The DNA in the zebrafish was deleted, and the human version was inserted later, without affecting the main DNA (probably using a “plasmid”). Without the human DNA “insert”, there was a developmental defect. With either the human DNA insert or the original zebrafish DNA (as an insert), there was no developmental defect, leading to the conclusion that the human version is functionally equivalent to the zebrafish version.
How do we know whether, if the insert were replaced with a random sequence of base pairs of the same length, there would be no developmental defect either?
There are several complications addressed in the article, which I did not describe. Anyway, using a “control vector” is standard practice, and I believe they checked this.
That’s true of protein-coding sequence, but things are a little bit more difficult for regulatory DNA because:

1) Regulatory DNA is under MUCH less sequence constraint—the relevant binding proteins are not individually fussy about their binding sites.

2) Regulatory networks have a lot of redundancy.

3) Regulatory mutations can be much more easily compensated for by other mutations—because we’re dealing with analog networks, rather than strings of amino acids.
Regulatory evolution is an immature field, but it seems that an awful lot of change can occur in a short time. The literature is full of sequences that have an experimentally provable activity (put them on a plasmid with a reporter gene and off it goes) and yet show no conservation between species. There’s probably a lot more functional sequence that won’t just work on its own on a plasmid, or show a noticeable effect from knockouts. It may be that regulatory networks are composed of a continuous distribution from a few constrained elements with strong effects down to lots of unconstrained weak ones. The latter will be very, very difficult to distinguish from junk DNA.
Data with lots of redundancy does, in a certain sense, contain a lot of junk. Junk that, although it helps reliably transmit the data, doesn’t change the meaning of the data (or doesn’t change it by much).
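A minimal illustration of that information-theoretic point, using a 3x repetition code (nothing biological is implied): the redundant copies add no meaning, but they let the message survive corruption.

```python
def encode(bits):
    # Each meaningful bit is stored three times.
    return [b for b in bits for _ in range(3)]

def decode(coded):
    # Majority vote within each triple recovers the original bit.
    return [int(sum(coded[i:i + 3]) >= 2) for i in range(0, len(coded), 3)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
coded = encode(message)            # 24 stored bits carry 8 bits of meaning
noisy = list(coded)
for i in (2, 7, 12, 22):           # corrupt four scattered positions
    noisy[i] ^= 1
print(decode(noisy) == message)    # True: the redundancy absorbed the damage
```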
Yeah. What’s relevant to this discussion is complexity, not number of base pairs.
Actually, we can guess that a piece of DNA is nonfunctional if it seems to have undergone neutral evolution (roughly, accumulation of functionally equivalent mutations) at a rate which implies that it was not subject to any noticeable purifying selection over evolutionary time.

This actually isn’t necessarily true. If there is a section of the genome A that needs to act on another section of the genome C, with section B in between, and A needs to act on C with a precise (or relatively so) genomic distance between them, then B can evolve neutrally even though it’s still necessary for the action of A on C, since it provides the spacing.
Thus, serving a purely structural function.
In that case the complexity in bits of B, for length N, becomes log2(N) instead of 2*N. It’s not quite 0, but it’s a lot closer.
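A quick worked version of that count, for a hypothetical 10 kb spacer:

```python
import math

N = 10_000                     # hypothetical spacer length in base pairs
print(2 * N)                   # 20000 bits if every base is constrained
print(round(math.log2(N), 1))  # 13.3 bits if only the length matters
```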
The only definitively nonfunctional DNA is that which has been deleted. “Nonfunctional DNA” is temporarily inactive legacy code which may at any time be restored to an active role.
In the context “how complicated is a human brain?”, DNA which is currently inactive does not count towards the answer.
That said (by which I mean “what follows doesn’t seem relevant now that I’ve realised the above, but I already wrote it”),
Is inactive DNA more likely to be restored to an active role than to get deleted? I’m not sure it makes sense to consider it functional just because it might start doing something again. When you delete a file from your hard disk, it could theoretically be restored until the disk space is actually repurposed; but if you actually wanted the file around, you just wouldn’t have deleted it. That’s not a great analogy, but...
My gut says that any large section of inactive DNA is more likely to become corrupted than to become reactivated. A corruption is pretty much any mutation in that section, whereas I imagine reactivating it would require one of a small number of specific mutations.
Counterpoint: a corruption has only a small probability of becoming fixed in the population; if reactivation is helpful, that still only has a small probability of becoming fixed, but it’s a much higher small probability.
Counter-counterpoint: no particular corruption would need to be fixed in the whole population. If there are several corruptions at independent 10% penetration each, a reactivating mutation will have a hard time becoming fixed.
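For concreteness, a crude Wright-Fisher sketch of the counterpoint’s fixation probabilities, with invented population size and selection coefficient: a new neutral mutant fixes with probability about 1/(2N), a beneficial one with probability about 2s; both small, but not equally small.

```python
import random

def fixes(pop_size: int, s: float) -> bool:
    """Follow one new mutant allele until it is fixed or lost."""
    count = 1                                  # one copy among 2N alleles
    while 0 < count < 2 * pop_size:
        p = count / (2 * pop_size)
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))   # selection step
        count = sum(random.random() < p_sel for _ in range(2 * pop_size))
    return count == 2 * pop_size

random.seed(0)
TRIALS = 5_000
for s in (0.0, 0.05):                          # neutral vs. mildly beneficial
    rate = sum(fixes(50, s) for _ in range(TRIALS)) / TRIALS
    print(f"s = {s}: fixation rate ~ {rate:.3f}")
# Expect about 1/(2N) = 0.01 for the neutral allele and roughly 2s = 0.1
# for the beneficial one: both small, but one is ten times the other.
```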
Here’s the concept I wanted: evolutionary capacitance.