So what terminology do you want to use to make this distinction then?
Some auto insurance companies use “collision” instead of “accident” for what transpired, to avoid unintended connotation, separately from assigning responsibility/fault. That part is based on following the letter of the law. In the AI case a better term might be “disaster”, which does not have the connotation of the term “accident”.
If someone deliberately misuses AI to kill lots of people, that’s a “disaster” too.
Sure is; it separates the description of what happened from assigning responsibility, which I assume is what the OP wanted.
Instead of “accident”, we could say “gross negligence” or “recklessness” for catastrophic risk from AI misalignment.
Seems to me that this is building in too much content / will have the wrong connotations. If an ML researcher hears about “recklessness risk”, they’re not unlikely to go “oh, well I don’t feel ‘reckless’ at my day job, so I’m off the hook”.
Locating the issue in the cognition of the developer is probably helpful in some contexts, but it has the disadvantage that (a) people will reflect on their cognition, fail to notice any “negligent-feeling thoughts”, and conclude that accident risk is low; and (b) it encourages people to take their eye off the ball, focusing on psychology (and arguments about whose psychology is X versus Y) rather than on properties of the AI itself.
“Accident risk” is maybe better just because it’s vaguer. The main problem I see with it isn’t “this sounds like it’s letting the developers off the hook” (since when do we assume that all accidents are faultless?). Rather, I think the problem with “accident” is that it sounds minor.
Accidentally breaking a plate is an “accident”. Accidentally destroying a universe is… something a bit worse than that.
Fair point.
If the issue with “accident” is that it sounds minor*, then one could say “catastrophic accident risk” or similar.
*I’m not fully bought into this as the main issue, but supposing that it is...
I really don’t think the distinction is meaningful or useful in almost any situation. I think if people want to make something like this distinction they should just be more clear about exactly what they are talking about.
How about the distinction between (A) “An AGI kills every human, and the people who turned on the AGI didn’t want that to happen” versus (B) “An AGI kills every human, and the people who turned on the AGI did want that to happen”?
I’m guessing that you’re going to say “That’s not a useful distinction because (B) is stupid. Obviously nobody is talking about (B)”. In which case, my response is “The things that are obvious to you and me are not necessarily obvious to people who are new to thinking carefully about AGI x-risk.”
…And in particular, normal people sometimes seem to have an extraordinarily strong prior that “when people are talking about x-risk, it must be (B) and not (A), because (A) is weird sci-fi stuff and (B) is a real thing that could happen”, even after the first 25 times that I insist that I’m really truly talking about (A).
So I do think drawing a distinction between (A) and (B) is a very useful thing to be able to do. What terminology would you suggest for that?
I think the misuse vs. accident dichotomy is clearer when you don’t focus exclusively on “AGI kills every human” risks. (E.g., global totalitarianism risks strike me as small but non-negligible if we solve the alignment problem. Larger are risks that fall short of totalitarianism but still involve non-morally-humble developers damaging humanity’s long-term potential.)
The dichotomy is really just “AGI does sufficiently bad stuff, and the developers intended this” versus “AGI does sufficiently bad stuff, and the developers didn’t intend this”. The terminology might be non-ideal, but the concepts themselves are very natural.
It’s basically the same concept as “conflict disaster” versus “mistake disaster”. If something falls into both categories to a significant extent (e.g., someone tries to become dictator but fails to solve alignment), then it goes in the “accident risk” bucket, because it doesn’t actually matter what you wanted to do with the AI if you’re completely unable to achieve that goal. The dynamics and outcome will end up looking basically the same as in other accidents.
By “intend” do you mean that they sought that outcome / selected for it?
Or merely that it was a known or predictable outcome of their behavior?
I think “unintentional” would already probably be a better term in most cases.
“Concrete Problems in AI Safety” used this distinction to make this point, and I think it was likely a useful simplification in that context. I generally think spelling it out is better, and I think people will pattern-match your concerns onto either “the sci-fi scenario where AI spontaneously becomes conscious, goes rogue, and pursues its own goals” or “boring old robustness problems” if you don’t invoke structural risk. I think structural risk plays a crucial role in the arguments, and even if you think things that look more like pure accidents are more likely, I think the structural risk story is more plausible to more people and a sufficient cause for concern.
RE (A): A known side-effect is not an accident.
A natural misconception lots of normies have is that the primary risks from AI come from bad actors using it explicitly to do evil things, rather than from bad actors being unable to align AIs at all, which causes Clippy to run wild. I would like to distinguish between these two scenarios, and accident vs. misuse risk is an obvious way to do that.