I really like this idea, since in an important sense these are accident risks: we don’t intend for AI to cause existential catastrophe, but it might if we make mistakes (and we make mistakes by default). I get why some folks in the safety space might not like this framing. Accidents imply there’s some safe default path and accidents are deviations from it, when in fact “accidents” are the default thing AIs do, and we have to thread a narrow path to get good outcomes. Still, it seems like a reasonable way to move the conversation forward with the general public, even if the technical details are wrong. If the goal is to get people to care about AI doing bad things despite our best efforts to the contrary, framing that as an accident seems like the best conceptual handle most folks have readily available.