«Boundaries»/membranes.
E.g.: «Boundaries» for formalizing an MVP morality
Also: see the recap in Formalizing «Boundaries» with Markov blankets + Criticism of this approach
Note also that there are (at least) two ways to do this, which I need to write a post about (or let me know if you want to review my draft). One way is “be a Nanny AI and protect the «boundaries» of humans”; the other is “mind your own business and you will automatically not cause problems for anyone else”. The former is closer to Davidad’s approach (at least as of earlier this year); the latter is closer to Mark Miller’s thoughts on AI safety and security.