Thanks, Thomas, for the helpful overview post! Great to hear that you found the AGI ruin opinions survey useful.
I agree with Rohin’s summary of what we’re working on. I would add “understanding / distilling threat models” to the list, e.g. “refining the sharp left turn” and “will capabilities generalize more”.
Some corrections for your overall description of the DM alignment team:
- I would count ~20-25 FTE on the alignment + scalable alignment teams (this does not include the AGI strategy & governance team).
- I would put DM alignment in the "fairly hard" bucket (p(doom) = 10-50%) for alignment difficulty, and the "mixed" bucket for "conceptual vs applied".
Sorry for the late response, and thanks for your comment; I've edited the post to reflect these corrections.
No worries! Thanks a lot for updating the post.