After talking with a lot of people in Alignment, I think there is still a lot of good to be done in diffusing ideas at the object/technical level. We seem to have done a lot of outreach presenting the philosophical arguments, but much less on the technical side.
Good idea, but the speaker schedule doesn’t seem to reflect this stated goal. Going down the list:
Holden’s “Most Important Century” is not object-level technical alignment work, it’s meta-level content about why AI safety is important
Carlsmith’s is also not object-level alignment work, it’s meta-level content about why AI safety is (or isn’t?) important
Kaplan’s “Scaling Laws in Neural Networks” is also presumably meta-level content about why AI safety is important, not object-level alignment work
Hadfield-Menell’s “Normative Information for AI systems” I have not heard of, but it does sound like object-level alignment work based on the title
Christiano’s “Eliciting Latent Knowledge” is definitely object-level alignment work
Cotra’s “AI Timeline and Alignment Risk” is meta-level content about why AI safety is important, not object-level alignment work
Hendrycks’ talk is TBA, so I don’t know about that one
So, out of the 7 talks listed, 2 are plausibly about object-level technical alignment work, and 4 are clearly not.
Also, I note that almost half the speakers are from OpenPhil, an organization which (to my understanding) directly employed zero object-level technical alignment researchers as of a couple months ago. I do hear some of them have decided to try object-level work recently, in order to better understand it, but that’s a pretty recent development and the object-level work isn’t really the point of that exercise.