In general, AI safety researchers focus way too much on scenarios where there's enough political will to adopt safety techniques that are seriously costly and inconvenient. There are a couple of reasons for this.
Firstly, AI company staff are disincentivized from making their companies look reckless: if they give an accurate description of how much delay their companies will actually tolerate, it will sound like they're saying the company is reckless.
Secondly, safety-concerned people outside AI companies feel weird about openly discussing the possibility that AI companies will only adopt cheap risk mitigations, because they're scared of shifting the Overton window and they're hoping for a door-in-the-face dynamic (ask for expensive safety measures so that the eventual compromise is at least moderately protective).
So people focus way more on safety cases and other high-assurance safety strategies than is warranted by how likely those scenarios actually seem. I think these dynamics have skewed the discourse enough that a lot of "AI safety people" (broadly interpreted) have pretty bad models here.
Some tweets I wrote that are relevant to this post: