Intro to Brain-Like-AGI Safety

Suppose we someday build an Artificial General Intelligence algorithm using principles of learning and cognition similar to those of the human brain. How would we use such an algorithm safely?

I will argue that this is an open technical problem, and my goal in this post series is to bring readers with no prior knowledge all the way up to the front line of unsolved problems as I see them.

If this whole thing seems weird or stupid, you should start right in on Post #1, which contains definitions, background, and motivation. Then Posts #2–#7 are mainly neuroscience, and Posts #8–#15 are more directly about AGI safety, ending with a list of open questions and advice for getting involved in the field.

(Thanks to Beth Barnes & the Centre For Effective Altruism Donor Lottery Program for financial support. Thanks to the following people for critical comments on drafts: Adam Marblestone, Linda Linsefors, Justis Mills, Charlie Steiner, Maksym Taran, Adam Scholl, Aysja Johnson, Adam Shimi, Cameron Berg, Jacob Cannell, Oliver Daniels-Koch.)

(Series was revised July 2024—see changelog at the bottom of each post.)

[Intro to brain-like-AGI safety] 1. What’s the problem & Why work on it now?

[Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain

[Intro to brain-like-AGI safety] 3. Two subsystems: Learning & Steering

[Intro to brain-like-AGI safety] 4. The “short-term predictor”

[Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning

[Intro to brain-like-AGI safety] 6. Big picture of motivation, decision-making, and RL

[Intro to brain-like-AGI safety] 7. From hardcoded drives to foresighted plans: A worked example

[Intro to brain-like-AGI safety] 8. Takeaways from neuro 1/2: On AGI development

[Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation

[Intro to brain-like-AGI safety] 10. The alignment problem

[Intro to brain-like-AGI safety] 11. Safety ≠ alignment (but they’re close!)

[Intro to brain-like-AGI safety] 12. Two paths forward: “Controlled AGI” and “Social-instinct AGI”

[Intro to brain-like-AGI safety] 13. Symbol grounding & human social instincts

[Intro to brain-like-AGI safety] 14. Controlled AGI

[Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA