It depends on the overall probability distribution. Previously Eliezer thought something like p(doom | trying to solve alignment) = 50% and p(doom | trying to get an AI ban without alignment) = 99%, and then updated to p(doom | trying to solve alignment) = 99% and p(doom | trying to get an AI ban without alignment) = 95%, which makes pursuing an AI ban worthwhile even though it is still pretty much doomed. But if you are, say, Alex Turner, you could start with the same probabilities but update towards p(doom | trying to solve alignment) = 10%, which makes publishing papers on steering vectors very reasonable.
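To make the implied decision rule explicit, here is a minimal sketch in Python, assuming the rule is simply "pick the strategy with the lowest p(doom)"; the numbers are the illustrative figures above, not anyone's actual estimates.

```python
def best_strategy(p_doom_by_strategy: dict[str, float]) -> str:
    """Return the strategy that minimizes p(doom)."""
    return min(p_doom_by_strategy, key=p_doom_by_strategy.get)

# Earlier view: alignment looks far more tractable than a ban.
print(best_strategy({"solve alignment": 0.50, "AI ban": 0.99}))  # solve alignment

# Updated view: both look doomed, but the ban is marginally less so.
print(best_strategy({"solve alignment": 0.99, "AI ban": 0.95}))  # AI ban

# Alex Turner-style update: alignment research dominates again.
print(best_strategy({"solve alignment": 0.10, "AI ban": 0.95}))  # solve alignment
```

The point is that the ranking flips on small changes to the conditional estimates, so two people with the same starting distribution can rationally end up on very different strategies.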
The other reasons:
I expect the majority of policy people to be on the EA Forum, though maybe I am wrong;
Kat Woods has a large Twitter thread about how posting on Twitter is much more useful for public outreach than posting on LW/AF/EAF.
Seems reasonable, except that Eliezer's p(doom | trying to solve alignment) in early 2023 was much higher than 50%, probably more like 98%. AGI Ruin was published in June 2022, and drafts existed since early 2022. MIRI leadership had been pretty pessimistic ever since AlphaGo in 2016, and especially since their research agenda collapsed in 2019.
I am talking about the belief state in ~2015, because everyone was already skeptical about the policy approach at that time.