Charlie Steiner comments on Framing AI Childhoods

Charlie Steiner 6 Sep 2022 23:54 UTC
LW: 10 AF: 5
4
“We successfully chisel out aligned kids because we understand their inductive biases well”
>.>
Interpretation of this emoji: “Press X to doubt.”
- TurnTrout 12 Sep 2022 23:51 UTC
  LW: 2 AF: 2
  0
  I’m interested in why you doubt this. I can imagine various interpretations of the quote which I doubt, and some which are less doubtful-to-me.
  - Charlie Steiner 13 Sep 2022 0:13 UTC
    LW: 7 AF: 2
    3
    The reason babies grow up into people that share our values has very little to do with our understanding of their inductive biases (i.e. most of the work is done by gene-environment interactions with parts of the environment that aren’t good at predicting the child’s moral development). The primary meaning of this comment is pointing out that a particular statement about children is wrong in a kind-of-funny way.
    I have this sort of humorous image of someone raising a child, saying “Phew, thank goodness I had a good understanding of my child’s inductive biases, I never could have gotten them to have similar values to me just by passing on half of my genes to them and raising them in an environment similar to the one I grew up in.”
    - TurnTrout 20 Sep 2022 3:32 UTC
      LW: 10 AF: 5
      6
      I agree that similar environments are important, but I don’t see why you think they explain most of the outcomes. What’s an example of a “gene-environment interaction with parts of the environment that aren’t good at predicting the child’s moral development”?
      Like, what it feels like to understand human inductive biases isn’t to think “Gee, I understand inductive biases!”. It’s more like: “I see that my son just scowled after agreeing to clean his room. This provides evidence about his internal motivational composition, even though I can’t do interpretability and read off his brain state. Given what I know about human physiology and people like him, I predict that he might comply now (‘in training’), at which point I will reward him with snacks. However, in the future he will probably not generalize to actually cleaning his room when asked, once I’m less able to impose external punishments and rewards.”
      I also claim that a human would be relatively easy to reshape into caring about baby-eaters if you used a competent and unethical (psychological) shaping scheme and hidden brain stimulation reward devices, such that you super strongly reinforced the human whenever they indicated they had just thought positive or charitable thoughts about baby-eaters, or agreed that baby-eaters deserve moral consideration. I think you could probably pull this off within a week.
      Now, we haven’t actually observed this. But insofar as you agree with my prediction, it’s not due to an environment-gene interaction. This is what it feels like to understand inductive biases: Being able to correctly predict how to inculcate target values in another agent, without already having done it experimentally.
      - Charlie Steiner 20 Sep 2022 5:08 UTC
        LW: 16 AF: 9
        4
        
        What’s an example of a “gene-environment interaction with parts of the environment that aren’t good at predicting the child’s moral development”?
        
        Mimicking adult behavior even when the adult isn’t paying any attention to the child (and children with different genes having slightly different sorts of mimicry). Automatically changing purity norms in response to disease and perceived disease risk. Having a different outlook on the world if you always had plenty of food growing up. Children of athletic parents often being athletic too, which changes how they relate to their environment and changes their eventual lifestyle. Being genetically predisposed to alcoholism and then becoming an alcoholic. Learning a tonal language and then having a different relationship with music.
        
        I’m not saying parents have no power. If you paid a bunch of parents to raise their children to love playing the piano, you’d probably get a significantly elevated rate of that value among the kids (guess: ~35% compared to a base rate of ~3%). Effortful and deliberate parenting works at a rate distinguishable from chance. My claim is more that almost all of value transmission isn’t like that, that you can raise kids without deliberately imparting your (or any) values and they’ll still end up pretty similar to you.
        - TurnTrout 26 Sep 2022 22:46 UTC
          LW: 8 AF: 5
          2
          I think these are great counterpoints. Thanks for making them.
          I still buy “the helicopter parent ‘outer alignment’ training regime is unwise for ‘aligning’ kids” and that deliberate parenting is better than chance. But deliberate parenting is possibly, even probably, not the primary factor. I haven’t yet read much data here, so my views feel relatively unconstrained beyond my “common sense.”
          I think there’s an additional consideration with AI, though: we control the reward circuitry. If lots of variance in kid-alignment is due to genetic variation in reward circuitry or learning hyperparameters or whatever, then with AI we control that too, and controlling it is also part of understanding AI inductive biases.