Steven Byrnes comments on LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem

Steven Byrnes 12 Nov 2023 1:34 UTC
LW: 4 AF: 2
The “similar reason as why I personally am not trying to get heroin right now” is “Example 2” here (including the footnote), or a bit more detail in Section 9.5 here. I don’t think that involves an idiosyncratic anti-heroin intrinsic cost function.
The question “What is the intrinsic cost in a human brain” is a topic in which I have a strong personal interest. See Section 2 here and links therein. “Why don’t humans have an alignment problem” is sorta painting the target around the arrow, I think? Anyway, if you radically enhanced human intelligence and let those super-humans invent every possible technology, I’m not too sure what you would get (assuming they don’t blow each other to smithereens). Maybe that’s OK though? Hard to say. Our distant ancestors would think that we have awfully weird lifestyles and might strenuously object to them, if they could have a say.
“Maybe the view of alignment pessimists is that the paradigmatic human brain’s intrinsic cost is intractably complex.”
Speaking for myself, I think the human brain’s intrinsic-cost-like-thing is probably hundreds of lines of pseudocode, or maybe low thousands, certainly not millions. (And the part that’s relevant for AGI is just a fraction of that.) Unfortunately, I also think nobody knows what those lines are. I would feel better if they did. That wouldn’t be enough to make me “optimistic” overall, but it would certainly be a step in the right direction. (Other things can go wrong too.)
- [deactivated] 12 Nov 2023 1:44 UTC
  1 point
  “...I think the human brain’s intrinsic-cost-like-thing is probably hundreds of lines of pseudocode, or maybe low thousands, certainly not millions. (And the part that’s relevant for AGI is just a fraction of that.) Unfortunately, I also think nobody knows what those lines are. I would feel better if they did.”
  So, the human brain’s pseudo-intrinsic cost is not intractably complex, on your view, but difficult to extract.
  - Steven Byrnes 12 Nov 2023 1:57 UTC
    LW: 2 AF: 2
    I would say “the human brain’s intrinsic-cost-like-thing is difficult to figure out”. I’m not sure what you mean by “…difficult to extract”. Extract from what?
    - [deactivated] 12 Nov 2023 2:04 UTC
      1 point
      Extract from the brain into, say, weights in an artificial neural network, lines of code, a natural language “constitution”, or something of that nature.
      - Steven Byrnes 12 Nov 2023 3:30 UTC
        LW: 2 AF: 2
        “Extract from the brain” how? A human brain has like 100 billion neurons and 100 trillion synapses, and they’re generally very difficult to measure, right? (I do think certain neuroscience experiments would be helpful.) Or do you mean something else?
        - [deactivated] 12 Nov 2023 4:02 UTC
          1 point
          I meant “extract” more figuratively than literally. For example, GPT-4 seems to have acquired some ability to do moral reasoning in accordance with human values. This is one way to (very indirectly) “extract” information from the human brain.
          - Steven Byrnes 12 Nov 2023 12:13 UTC
            LW: 3 AF: 2
            GPT-4 is different from APTAMI. I’m not aware of any method that starts with movies of humans, or human-created internet text, or whatever, and then does some kind of ML, and winds up with a plausible human brain intrinsic cost function. If you have an idea for how that could work, then I’m skeptical, but you should tell me anyway. :)