Great addition to thinking on safety cases.
I’m curious about the decision to include the time constraint ‘Deployment does not pose unacceptable risk within its 1 year lifetime’, as I’ve been considering how safety case arguments remain relevant post-deployment.
Is this intended to convey that the system will be withdrawn or updated within a year, or to narrow the scope of the inability and control feasibility arguments by limiting the risk evaluation to a shorter timeline?
On the topic of the scope of the Harm Inability arguments in the paper:
‘A Harm inability argument could be absolute or limited to a given deployment setting’
Given the vast number of possible deployment settings, the developers of an AI model could provide absolute arguments with accompanying warnings and limitations based on the evaluation outputs (along the lines of usage guidelines for licensed medicines). Organisations deploying the models in novel or higher-risk settings (e.g. medical, finance) could supplement these with arguments specific to their deployment setting. Any thoughts on this?