People representing Anthropic argued against government-required RSPs. I don’t think I can share the details of the specific room where that happened, because it would be clear who I know this from. Ask Jack Clark whether that happened or not.
Anthropic people had also said approximately this publicly, saying that it’s too soon to make the rules, since we’d end up misspecifying them due to ignorance of tomorrow’s models.
There’s a big difference between regulation which says roughly “you must have something like an RSP”, and regulation which says “you must follow these specific RSP-like requirements”, and I think Mikhail is talking about the latter.
I personally think the former is a good idea, and thus supported SB-1047 along with many other lab employees. It’s also pretty clear to me that locking in circa-2023 thinking about RSPs would have been a serious mistake, and so I (along with many others) am generally against very specific regulations because we expect they would on net increase catastrophic risk.
When do you think would be a good time to lock in regulation? I personally doubt RSP-style regulation would even help, but the notion that now is too soon, or that regulation risks locking in early sketches, strikes me as in some tension with, e.g., Anthropic trying to automate AI research ASAP, Dario expecting ASL-4 systems between 2025 (the current year!) and 2028, etc.
Here I am on record supporting SB-1047, along with many of my colleagues. I will continue to support specific proposed regulations if I think they would help, and oppose them if I think they would be harmful; asking “when” independent of “what” doesn’t make much sense to me and doesn’t seem to follow from anything I’ve said.
My claim is not “this is a bad time”, but rather “given the current state of the art, I tend to support framework/liability/etc regulations, and tend to oppose more-specific/exact-evals/etc regulations”. Obviously if the state of the art advanced enough that I thought the latter would be better for overall safety, I’d support them, and I’m glad that people are working on that.
AFAIK Anthropic has not unequivocally supported the idea of “you must have something like an RSP” or even SB-1047, despite many employees, indeed, doing so. To quote from Anthropic’s letter to Governor Newsom:
As you may be aware, several weeks ago Anthropic submitted a Support if Amended letter regarding SB 1047, in which we suggested a series of amendments to the bill. … In our assessment the new SB 1047 is substantially improved, to the point where we believe its benefits likely outweigh its costs.
...
We see the primary benefits of the bill as follows:
Developing SSPs and being honest with the public about them. The bill mandates the adoption of safety and security protocols (SSPs), flexible policies for managing catastrophic risk that are similar to frameworks adopted by several of the most advanced developers of AI systems, including Anthropic, Google, and OpenAI. However, some companies have still not adopted these policies, and others have been vague about them. Furthermore, nothing prevents companies from making misleading statements about their SSPs or about the results of the tests they have conducted as part of their SSPs. It is a major improvement, with very little downside, that SB 1047 requires companies to adopt some SSP (whose details are up to them) and to be honest with the public about their SSP-related practices and findings.
...
We believe it is critical to have some framework for managing frontier AI systems that roughly meets [requirements discussed in the letter]. As AI systems become more powerful, it’s crucial for us to ensure we have appropriate regulations in place to ensure their safety.
“we believe its benefits likely outweigh its costs” amounts to “it was a bad bill and now it’s likely net-positive”, which is not exactly unequivocal support. Compare that even to the language in calltolead.org.
Edit: AFAIK Anthropic lobbied against SSP-like requirements in private.
What is this referring to?
My guess is it’s referring to Anthropic’s position on SB 1047, or Dario’s and Jack Clark’s statements that it’s too early for strong regulation, or how Anthropic’s policy recommendations often exclude RSP-y stuff (and when they do suggest requiring RSPs, they would leave the details up to the company).
SB 1047 was mentioned separately, so I assumed it was something else. Might be the other ones, thanks for the links.