If you’re new to the AI Alignment research field, we recommend four great introductory sequences that cover several different paradigms of thought within the field. Get started reading them and feel free to leave comments with any questions you have.
The introductory sequences are:
Embedded Agency by Scott Garrabrant and Abram Demski of MIRI
Iterated Amplification by Paul Christiano of OpenAI
Following that, you might want to begin writing up some of your thoughts and sharing them on LessWrong to get feedback.
I think it would be great to update this section. For example, it could link to the AGI Safety Fundamentals curriculum, which has a wealth of valuable readings not on this list. And there are other courses that would be good for newcomers to know about, such as MLAB.
Why am I suggesting this? This FAQ was the first place I found with clear advice when I was first getting interested in AI alignment in late 2021, and I took it quite seriously/literally. The very first alignment research I tried to read was the illustrated Embedded Agency sequence, because it was at the top of the above list. While I came to appreciate Embedded Agency later, I found the sequence (particularly the illustrated version, which features prominently in the link above, as opposed to the text version) to be a confusing introduction to alignment. I also wasn't immediately aware that there was anything important to read beyond the four texts linked above, whereas now I feel like there's a lot!
It’s just one data point of user testing on this FAQ, but something to consider.
That section is even more outdated now: there's nothing on interpretability, Paul's work now extends far beyond IDA, and so on. In my opinion it should link to some other guide.
Yeah, it sure does seem like we should update something here. I'm planning to spend more time on AIAF stuff soon, but until then, if someone has a drop-in paragraph, I would probably lightly edit it and then just use whatever you send me or post here.
Yeah, definitely agree. I've been meaning to update this for a while, but haven't gotten around to it. Lots of good stuff has been published in the last 1.5 years!
I’m new here and this comment is an ideal place to begin. Thanks!