Here’s how I understand your argument:
1. Some people are advocating for safety cases: the idea that companies should be required to show that risks remain below acceptable levels.
2. This approach is used in safety engineering fields.
3. But AI is different from those fields; for example, AI involves adversarial risks.
4. Therefore we shouldn't support safety cases.
I think this misunderstands the case for safety cases, or at least argues against only one particular justification for them.
Here’s how I think about safety cases (or really any approach in which companies must present evidence that their practices keep risks below acceptable levels):
1. AI systems pose major risks, many of which stem from race dynamics and competitive pressures.
2. If companies were required to demonstrate that they kept risks below acceptable levels, this would incentivize much more safety research and curb some of the dangerous effects of race dynamics.
3. Other fields have similar setups, and we should learn from them where relevant. Of course, AI development also has some unique properties, so we’ll need to adapt the methods accordingly.
I’d be curious to hear more about why you think safety cases fail when risks are adversarial (at first glance, it doesn’t seem too difficult to adapt the high-level safety case approach).
I’m also curious whether you have any alternatives you prefer. I currently endorse the claim “safety cases are better than the status quo,” but I’m open to the possibility that some alternative approach X is better than both safety cases and the status quo.
Yeah, in your linked paper you write, “In high-stakes industries, risk management practices often require affirmative evidence that risks are kept below acceptable thresholds.” This is right, but my understanding is that it is not true of industries that deal with adversarial high-stakes situations. So I don’t think you should claim that your proposal is backed by standard practice. See here for a review of the possibility of using safety cases in adversarial situations.