This post is the best overview of the field that I know of. I appreciate how it frames proposals in terms of outer/inner alignment and training/performance competitiveness—having a framework with which to evaluate proposals is very useful, and this one strikes me as quite good.

Since it was written, this post has been my go-to reference both for getting other people up to speed on what current AI alignment strategies look like (even though it isn't exhaustive) and for my own use—I've referred back to it several times and learned a lot from it.

I hope this post grows into something more extensive and official—maybe an Official Curated List of Alignment Proposals, Summarized and Evaluated with Commentary and Links. Such a list could be regularly updated and would be valuable for several reasons, some of which I mentioned in this comment.