I still agree with a lot of that post, and I'm still essentially operating on it.
It's also interesting to read the comments: at the time, those who thought my post was wrong promised that Anthropic's RSP would get better and that this was only the beginning. With RSP V2 being worse and less specific than RSP V1, it's clear that this was overoptimistic.
Risk management in AI has also gone a lot more mainstream than it was a year ago, in large part thanks to the UK AISI, which started operating on it. People have also started using more probabilities, for instance in the safety cases paper, which this post advocated for.
With SaferAI, my organization, we're continuing to work on moving the field closer to traditional risk management and on ensuring that we don't reinvent the wheel when there's no need to. There should be releases going in that direction over the coming months.
Overall, looking back on my recommendations, I think they're still quite strong. "Make the name less misleading" hasn't been executed on, but names other than RSPs have started being used, such as Frontier AI Safety Commitments, which is a strong improvement over my "Voluntary safety commitments" suggestion.
My recommendations about what RSPs are and aren't are also solid. My worry that the current commitments in RSPs would be pushed into policy was basically right: they've been used in many policy conversations as an anchor for what to do and what not to do.
Finally, the push for risk management in policy that I wanted to see happen has mostly happened. This is great news.
The main thing missing from this post is a prediction that RSPs would launch the debate about what should be done and at what levels. This is overall a good effect, and one that has happened; it would probably have happened several months later if not for the publication of RSPs. The fact that it happened in a voluntary-commitment context is unfortunate, because it levels everything down, but I still think this effect was significant.