If I’m not mistaken, the things you brought up are at too low a level to be highly relevant for safety. I guess this series will mostly be at Marr’s “computational level”, whereas you’re asking “algorithmic-level” questions, more or less. I’ll be talking a lot about things vaguely like “what is the loss function?” and much less about things vaguely like “how is the loss function minimized?”
For example, I think you can train a neural Turing machine with supervised learning, or you can train a neural Turing machine with RL. That distinction would be very relevant for safety, along with the question of exactly how these supervisory signals (error signals in SL / rewards in RL) are being calculated. By contrast, the question of whether it’s a neural Turing machine versus a predictive coding network or whatever is less relevant for safety, in my opinion. It’s not totally irrelevant, and those kinds of things will come up from time to time in future posts, but it’s mostly irrelevant.
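(To make that distinction concrete, here’s a minimal toy sketch of my own, not anything from Steve’s series: the same small network trained once with SL and once with a REINFORCE-style RL update. The model, the function names, and the reward_fn hook are all placeholder assumptions; the point is only that the architecture stays fixed while the source of the supervisory signal, a supplied label versus a computed reward, is what changes.)

```python
# Toy sketch (placeholder model, not a neural Turing machine): the same network,
# two different training signals. What sits inside `model` is the
# "how is the loss minimized" question; where y_target / reward_fn come from is
# closer to the "what is the loss function" question that matters more for safety.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def supervised_step(x, y_target):
    """SL: the signal is an error between the output and an externally supplied label."""
    loss = nn.functional.cross_entropy(model(x), y_target)
    opt.zero_grad()
    loss.backward()
    opt.step()

def rl_step(x, reward_fn):
    """RL (REINFORCE-style): the signal is a scalar reward computed from the chosen action."""
    dist = torch.distributions.Categorical(logits=model(x))
    action = dist.sample()
    reward = reward_fn(x, action)  # how *this* is calculated is the safety-relevant part
    loss = -(dist.log_prob(action) * reward).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```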
Y’know, there’s a thing that I call “The gory details of the neocortical algorithm”. It’s the very last thing I hope we figure out about the brain, as I think those details are extremely helpful for building AGI but only slightly helpful for AGI safety. Needless to say, “the gory details of the neocortical algorithm” are the exact thing that practically everyone in computational neuroscience and ML spends all their time trying to figure out! :-P I’m in no position to slow that tidal wave, but at least I’ll set a good example by talking about it approximately as little as I can get away with, while enthusiastically promoting work on more safety-helping topics within comp neuro. I haven’t always done that in the past, but better late than never. :-)
Hey Steve, I might be wrong here, but I don’t think Jon’s question was specifically about what architectures you’d be talking about. I think he was asking more about how to classify something as Brain-like-AGI for the purposes of your upcoming series.
The way I read your answer, it sounds like the safety considerations you’ll be discussing depend more on whether the NTM is trained via SL or RL rather than whether it neatly contains all your (soon to be elucidated) Brain-like-AGI properties.
Though that might actually have been what you meant, so I probably should have asked for clarification before presumptively answering Jon for you.
the safety considerations you’ll be discussing depend more on whether the NTM is trained via SL or RL rather than whether it neatly contains all your (soon to be elucidated) Brain-like-AGI properties.
I’m confused; this statement makes it sound like “whether it’s trained via SL or RL” is NOT a possible candidate for a “brain-like-AGI property”. Why can’t it be? Or maybe I’m reading too much into your wording.
Oops, strangely enough I just wasn’t thinking about that possibility. It’s obvious now, but I assumed that SL vs RL would be a minor consideration, despite the many words you’ve already written on reward.
Thanks!