Just a collection of other thoughts:

Why did Anthropic decide that the determination not to classify a new model as ASL-3 should be a CEO / RSO decision rather than a board of directors or LTBT decision? Both of those would be more independent.
My guess is that the feeling was that the LTBT would either have insufficient knowledge or would be too slow; it would be interesting to get confirmation, though.
I haven't gotten to how the RSO is chosen, but if the RSO is appointed by the CEO / Board, then I think there are insufficient checks and balances; the RSO should be on something like a 3-year, non-renewable, non-terminable contract.
The document doesn't feel portable, because it is very centered on Anthropic and on the transition from ASL-2 to ASL-3; it doesn't read like something meant to be portable. In fact, it reads more like high-level commentary on the ASL-2 to ASL-3 transition at Anthropic. The original RSP felt more like something that could have been cleaned up into an industry standard (honestly, OpenAI's original Preparedness Framework does a better job of this).
The reference to existing security frameworks is helpful, but it reads like a grab bag (the reference to SOC 2 seems out of place, for instance; isn't NIST 800-53 a much higher standard? And if SOC 2, why not ISO 27001?).
I think they removed the requirement to define ASL-4 before training an ASL-3 model?
Also:
I feel like the introduction is written to position the document favorably with regulators.
I'm quite interested in what led to this approach and which parts of the company were involved in writing the document this way. The original version had some of this, but it wasn't as forward, and it didn't feel as polished in this regard.
Open with Positive Framing
As frontier AI models advance, we believe they will bring about transformative benefits for our society and economy. AI could accelerate scientific discoveries, revolutionize healthcare, enhance our education system, and create entirely new domains for human creativity and innovation.
Emphasize Anthropic’s Leadership
In September 2023, we released our Responsible Scaling Policy (RSP), a first-of-its-kind public commitment
Emphasize Importance of Not Overregulating
This policy reflects our view that risk governance in this rapidly evolving domain should be proportional, iterative, and exportable.
Emphasize Innovation (Again, Don’t Overregulate)
By implementing safeguards that are proportional to the nature and extent of an AI model’s risks, we can balance innovation with safety, maintaining rigorous protections without unnecessarily hindering progress.
Emphasize Anthropic’s Leadership (Again) / Industry Self-Regulation
To demonstrate that it is possible to balance innovation with safety, we must put forward our proof of concept: a pragmatic, flexible, and scalable approach to risk governance. By sharing our approach externally, we aim to set a new industry standard that encourages widespread adoption of similar frameworks.
Don’t Regulate Now (Again)
In the long term, we hope that our policy may offer relevant insights for regulation. In the meantime, we will continue to share our findings with policymakers.
We Care About Other Things You Care About (like Misinformation)
Our Usage Policy sets forth our standards for the use of our products, including prohibitions on using our models to spread misinformation, incite violence or hateful behavior, or engage in fraudulent or abusive practices