Richard_Ngo comments on Richard Ngo’s Shortform

Richard_Ngo 9 Aug 2023 20:39 UTC
LW: 41 AF: 18
2
AF
Five clusters of alignment researchers
Very broadly speaking, alignment researchers seem to fall into five different clusters when it comes to thinking about AI risk:
1. MIRI cluster. Think that P(doom) is very high, based on intuitions about instrumental convergence, deceptive alignment, etc. Does work that’s very different from mainstream ML. Central members: Eliezer Yudkowsky, Nate Soares.
2. Structural risk cluster. Think that doom is more likely than not, but not for the same reasons as the MIRI cluster. Instead, this cluster focuses on systemic risks, multi-agent alignment, selective forces outside gradient descent, etc. Often work that’s fairly continuous with mainstream ML, but willing to be unusually speculative by the standards of the field. Central members: Dan Hendrycks, David Krueger, Andrew Critch.
3. Constellation cluster. More optimistic than either of the previous two clusters. Focuses more on risk from power-seeking AI than the structural risk cluster, but does work that is more speculative or conceptually-oriented than mainstream ML. Central members: Paul Christiano, Buck Shlegeris, Holden Karnofsky. (Named after Constellation coworking space.)
4. Prosaic cluster. Focuses on empirical ML work and the scaling hypothesis, is typically skeptical of theoretical or conceptual arguments. Short timelines in general. Central members: Dario Amodei, Jan Leike, Ilya Sutskever.
5. Mainstream cluster. Alignment researchers who are closest to mainstream ML. Focuses much less on backchaining from specific threat models and more on promoting robustly valuable research. Typically more concerned about misuse than misalignment, although worried about both. Central members: Scott Aaronson, David Bau.
Remember that any such division will be inherently very lossy, and please try not to overemphasize the differences between the groups, compared with the many things they agree on.
Depending on how you count alignment researchers, the relative size of each of these clusters might fluctuate, but on a gut level I think I treat all of them as roughly the same size.
What links here?