fyi your phrasing here is different from how I initially interpreted “make AI safety seem legit”.
like there’s maybe a few things someone might mean if they say “they’re working on AI Alignment research”:
1. they are pushing forward the state of the art of deep alignment understanding
2. they are orienting to the existing field of alignment research / upskilling
3. they are conveying to other AI researchers “here is what the field of alignment is, and why it’s important”
4. they are trying to make AI alignment feel high status, so that they feel safe in their career and social network, while also getting to feel important
(and of course people can be doing a mixture of the above, or a 5th option I didn’t list)
I interpreted you initially as saying #4, but it sounds like you/Rohin here are talking about #3. There are versions of #3 that are secretly just #4 without much theory-of-change, but, idk, I think Rohin’s stated goal here is just pretty reasonable and definitely something I want in my overall AI Alignment Field portfolio. I agree you should avoid accidentally conflating it with #1.
(i.e. this seems related to a form of research-debt, albeit focused on bridging the gap between one field and another, rather than improving intra-field research debt)
Yep, I am including #3 in this. I also think this is something pretty reasonable for someone in the field to do, but when most of your field is doing that, I think quite crazy and bad things happen, and it’s also very easy to slip into doing #4 instead.