Robustness to scale is still one of my primary explanations for why MIRI-style alignment research is useful, and why alignment work in general should be front-loaded. I am less sure about this specific post as an introduction to the concept (since I had the concept before the post, and don't know if anyone got it from this post), but I think that distilling concepts floating around meatspace into clear reference works is one of the important functions of LW.
(Five upvotes from a few AF users suggest this post should probably be nominated by an additional AF person, but I'm unsure. I do apologize again for not having better nomination-endorsement UI.
I think this post may have been relevant to my own thinking, but I'm particularly interested in how relevant the concept has been to other people who think professionally about alignment.)
I think the terms introduced by this post are great, and I use them all the time.