Thanks for this post! This seems like a really great way of visually representing how these different hypotheses, arguments, approaches, and scenarios interconnect. (I also think it’d be cool to see posts on other topics which use a similar approach!)
It seems that AGI timelines aren’t explicitly discussed here. (“Discontinuity to AGI” is mentioned, but I believe that’s a somewhat distinct matter.) Was that a deliberate choice?
It does seem like several of the hypotheses/arguments mentioned here would feed into or relate to beliefs about timelines—in particular, Discontinuity to AGI, Discontinuity from AGI, Recursive self-improvement, ML scales to AGI, and Deep insights needed (or maybe not that last one, as it means “needed” for alignment in particular). But I don’t think beliefs about timelines would be fully accounted for by those hypotheses/arguments—other cruxes, like whether “Intelligence is a huge collection of specific things” or whether “There’ll be another AI winter before AGI”, could also play a role.
I’m not sure to what extent beliefs about timelines (aside from beliefs about discontinuity) would influence which of the approaches you list people should or would take. But I imagine that a belief that timelines are quite short might motivate work on ML or prosaic alignment rather than on (Near) proof-level assurance of alignment or Foundational or “deconfusion” research. This would be because people might then think the latter approaches would take too long, such that our only shot (given these people’s beliefs) is doing ML or prosaic alignment and hoping that’s enough. (See also.)
And it seems like beliefs about timelines would feed into decisions about other approaches you don’t mention, like opting for investment or movement-building rather than direct, technical work. (That said, it seems reasonable for this post’s scope to just be what a person should do once they have decided to work on AI alignment now.)
It’s great to hear your thoughts on the post!
I’d also like to see more posts that do this sort of “mapping”. I think that mapping AI risk arguments is too neglected; you can find more discussion and examples in this post by Gyrodiot. I’m continuing to work collaboratively in this area in my spare time, and I’m excited that more people are getting involved.
We weren’t trying to fully account for AGI timelines—our choice of scope was based on a mix of personal interest and importance. I know of people currently working on posts similar to this one that will go in-depth on timelines, discontinuity, paths to AGI, the nature of intelligence, etc., which I’m excited about!
I agree with all your points. You’re right that this post’s scope does not include broader alternatives for reducing AI risk. It was not even designed to guide what people should work on, though it can serve that purpose. We were really just trying to clearly map out some of the discourse, as a starting point and example for future work.