To me, the introduction made it sound a little bit like the specifics of applying safety cases to AI systems have not been studied.
This is a good point. In retrospect, I should have written a related work section to cover these. My focus was mostly on AI systems that have only existed for about a year, and on future AI systems, so I didn’t spend much time reading the safety-case literature specifically related to AI systems (though perhaps there are useful insights that transfer over).
The reason the “nebulous requirements” aren’t explicitly stated is that when you make a safety case, you assure the safety of a specific system against the hazards relevant to that system. These are usually identified by performing a HAZOP analysis or something similar. Not all AI systems have the same list of hazards, so it’s obviously dubious to expect that you can list requirements a priori.
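To make that concrete, here is a rough, purely illustrative sketch (mine, not drawn from any standard or from the post) of the kind of system-specific hazard list a HAZOP-style review might produce for one particular deployment; the schema and every entry in it are assumptions made up for the example.

```python
# Purely illustrative sketch: one possible way to record the output of a
# HAZOP-style review for a single, concrete deployment. The schema and all
# entries below are assumptions made up for this example, not an established
# format from any standard.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Hazard:
    description: str              # what could go wrong in this deployment
    cause: str                    # the deviation the review identified
    mitigation: str               # the control the safety case argues is adequate
    evidence: List[str] = field(default_factory=list)  # evidence cited in the case

# Hazards for one specific system (an LLM doing customer service), not for
# "LLMs" in the abstract. A different deployment would have a different list.
customer_service_llm_hazards = [
    Hazard(
        description="Agent issues an unauthorized refund",
        cause="Tool call made outside the approved workflow",
        mitigation="Refund tool requires human approval above a threshold",
        evidence=["red-team report", "approval-gate test logs"],
    ),
    Hazard(
        description="Agent reveals another customer's data",
        cause="Retrieval over a shared index without per-customer scoping",
        mitigation="Access-controlled retrieval, scoped to the current customer",
        evidence=["access-control audit"],
    ),
]

for h in customer_service_llm_hazards:
    print(f"- {h.description} (mitigation: {h.mitigation})")
```

The point of the sketch is only that the hazard list is tied to one concrete deployment; a different AI system would yield a different list, which is why the requirements are hard to state a priori.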
My impression is that there is still precedent for fairly detailed guidelines that describe how safety cases are assessed in particular industries and how hazards should be analyzed. For example, see the UK’s Safety Assessment Principles for Nuclear Facilities. I don’t think anything like this exists for evaluating risks from advanced AI agents.
I agree, however, that not everyone who says developers should provide ‘safety evidence’ needs to specify in detail what that evidence could look like.
I hear what you’re saying. I probably should have made the following distinction:
1. A technology in the abstract (e.g. nuclear fission, LLMs)
2. A technology deployed to do a specific thing (e.g. nuclear fission in a power plant, an LLM used for customer service)
The question I understand you to be asking is essentially: how do we make safety cases for AI agents in general? I would argue that’s closer to situation 1 than situation 2, and as I understand it, safety cases are basically only ever applied in situation 2. The nuclear facilities document you linked is definitely a case of 2.
So yeah, admittedly the document you were looking for doesn’t exist, but that doesn’t really surprise me. Once you start looking for narrowly scoped safety principles for AI systems, you find them everywhere. For example, a search for “artificial intelligence” on the ISO website returns 73 standards.
Here are just a few relevant standards, though I admit standards are exceptionally boring (also, many aren’t public, which is dumb):
UL 4600: a standard for autonomous vehicles
ISO/IEC TR 5469: a standard for AI safety generally (this one is decently interesting)
ISO/IEC 42001: covers what to do if you set up a system that uses AI
You might also find this paper a good read: https://ieeexplore.ieee.org/document/9269875
This makes sense. Thanks for the resources!