I feel like there are two things going on here:
Anthropic considers itself the expert on AI safety and security and believes that it can develop better SSPs than the California government.
Anthropic thinks that the California government is too political and does not have the expertise to effectively regulate frontier labs.
But what they propose instead seems to be at odds with their stated purpose and view of the future. If AGI is 2-3 years away, then various governmental bodies need to be building administrative capacity around AI safety now rather than in 2-3 years' time, when it will take another 2-3 years to stand up the administrative organizations.
The idea that Anthropic or OpenAI or DeepMind should get to decide, on their own, the appropriate safety and security measures for frontier models seems unrealistic. It's going to end up being a set of regulations created by a government body, and Anthropic is probably better off participating in that process than trying to block it at the outset.
I feel like some of this just comes from an unrealistic view of the future: they don't seem to understand that as AGI approaches, in certain respects they become less influential and important, not more. As AI ceases to be a niche thing, other power structures in society will exert more influence over its operation and distribution.
Just a collection of other thoughts:
Why did Anthropic make the decision not to classify a new model as ASL-3 a CEO / RSO decision rather than a board of directors or LTBT decision? Either of those would be more independent.
My guess is that the feeling was that the LTBT would either have insufficient knowledge or be too slow; it would be interesting to get confirmation, though.
I haven't gotten to how the RSO is chosen, but if the RSO is appointed by the CEO / Board then I think there are insufficient checks and balances; the RSO should be on a 3-year non-renewable, non-terminable contract, or something similar.
The document doesn't feel portable: it is very centered on Anthropic and the transition from ASL-2 to ASL-3, and reads more like a high-level commentary on that transition than like something meant to generalize. The original RSP felt more like something that could have been cleaned up into an industry standard (OAI's original preparedness framework honestly does a better job with this).
The reference to existing security frameworks is helpful, but it seems like a grab bag (the reference to SOC2 seems somewhat out of place, for instance; shouldn't NIST 800-53 be a much higher standard? Also, if SOC2, why not ISO 27001?)
I think they removed the requirement to define ASL-4 before training an ASL-3 model?
Also:
I feel like the introduction is written to position the document positively with regulators.
I'm quite interested in what led to this approach and which parts of the company were involved in writing the document this way. The original version had some of this, but it wasn't as front-and-center and didn't feel as polished in this regard.
Open with Positive Framing
Emphasize Anthropic’s Leadership
Emphasize Importance of Not Overregulating
Emphasize Innovation (Again, Don’t Overregulate)
Emphasize Anthropic’s Leadership (Again) / Industry Self-Regulation
Don’t Regulate Now (Again)
We Care About Other Things You Care About (like Misinformation)