Sometimes people say “look at these past accidents; in these cases there were giant bureaucracies that didn’t care about safety at all, therefore we should be pessimistic about AI safety”. I think this is backwards, and that you should actually conclude the reverse: this is evidence that problems tend to be easy, and therefore we should be optimistic about AI safety.
This is not just one man’s modus ponens—the key issue is the selection effect.
It’s easiest to see with a Bayesian treatment. Let’s say we start completely uncertain about what fraction of people will care about problems, i.e. a uniform distribution over [0%, 100%]. In what worlds do I expect to see accidents where giant bureaucracies didn’t care about safety? Almost all of them: even if 90% of people care about safety, there will still be some cases where people didn’t care and accidents happened; and of course we’d hear about those cases (and not hear about the cases where accidents didn’t happen). You can get a strong update against 99.9999% and higher, but by the time you’re down to 90% the update is pretty weak. Given how much selection there is, I think even the update against 99% is relatively weak. So you really don’t learn much about how careful people will be by looking at our accident track record (unless you can also quantify the denominator of how many “potential accidents” there could have been).
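To make the selection effect concrete, here is a minimal sketch of that update, under some made-up modeling assumptions (a hypothetical number N of independent “potential accidents”, each averted exactly when the relevant decision-maker cares, and conditioning only on the selected observation “we heard about at least one accident”):

```python
import numpy as np

# Toy model (all numbers are illustrative assumptions, not from the post):
# p = fraction of decision-makers who care about safety; uniform prior on [0, 1].
# Suppose there are N independent "potential accidents", each averted iff the
# relevant decision-maker cares, so P(at least one accident | p) = 1 - p**N.
# We condition on the selected observation "we heard about at least one accident".

N = 10_000                            # assumed number of potential accidents
p = np.linspace(0.0, 1.0, 1_000_001)  # grid over possible values of p
dp = p[1] - p[0]

prior = np.ones_like(p)               # uniform prior density on [0, 1]
likelihood = 1.0 - p**N               # P(observe >= 1 accident | p)
posterior = prior * likelihood
posterior /= posterior.sum() * dp     # normalize to a density

def mass(density, lo, hi):
    """Probability mass assigned to p in [lo, hi]."""
    sel = (p >= lo) & (p <= hi)
    return density[sel].sum() * dp

print(mass(posterior, 0.90, 1.0))      # ~0.0999 vs 0.10 prior mass: barely any update
print(mass(posterior, 0.999999, 1.0))  # ~1e-8 vs 1e-6 prior mass: a strong update down
```

The toy numbers just illustrate the qualitative point: when there are many potential accidents, even a world where 99% of decision-makers care still produces some accident almost surely, so hearing about accidents barely distinguishes that world from much worse ones.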
However, it feels pretty notable to me that the vast majority of accidents I hear about in detail are ones where it seems like there were a bunch of obvious mistakes, and the accident would have been prevented had there been a decision-maker who cared (enough) about safety. And unlike in the previous paragraph, I do expect to hear about accidents that couldn’t have been prevented, so I don’t have to worry about selection bias here. So it seems like I should conclude that usually problems are pretty easy, and “all we have to do” is make sure people care. (One counterargument is that the mistakes look obvious only in hindsight; at the time they may not have been obvious.)
Examples of accidents that fit this pattern: the Challenger disaster, the Boeing 737 MAX crashes, and everything in Engineering a Safer World, though admittedly that last category suffers from some selection bias.