Curated. I appreciated this post for a combination of:
laying out several concrete stories about how AI could lead to human extinction
laying out a frame for how to think about those stories (while acknowledging other frames one could apply to them)
linking to a variety of research, with more thoughts on what sort of further research might be helpful.
I also wanted to highlight this section:
Finally, I should also mention that I agree with Tom Dietterich’s view (dietterich2019robust) that we should make AI safer to society by learning from high-reliability organizations (HROs), such as those studied by social scientists Karlene Roberts, Gene Rochlin, and Todd LaPorte (roberts1989research, roberts1989new, roberts1994decision, roberts2001systems, rochlin1987self, laporte1991working, laporte1996high). HROs have a lot of beneficial agent-agnostic human-implemented processes and control loops that keep them operating. Again, Dietterich himself is not as yet a proponent of existential safety concerns; however, to me this does not detract from the correctness of his perspective on learning from the HRO framework to make AI safer.
This is something I think I once heard Critch talk about, but which I don’t think has been discussed much on LessWrong, and which I’d be interested in seeing more thoughts on and distillation of.