If one of the effects of instituting a responsible scaling policy was that Anthropic moved from the stance of not meaningfully pushing the frontier to “it’s okay to push the frontier so long as we deem it safe,” this seems like a pretty important shift that was not well communicated. I, for one, did not interpret Anthropic’s RSP as a statement that they were now okay with advancing state of the art, nor did many others; I think that’s because the RSP did not make it clear that they were updating this position. Like, with hindsight I can see how the language in the RSP is consistent with pushing the frontier. But I think the language is also consistent with not pushing it. E.g., when I was operating under the assumption that Anthropic had committed to this, I interpreted the RSP as saying “we’re aiming to scale responsibly to the extent that we scale at all, which will remain at or behind the frontier.”
Attempting to be forthright about this would, imo, look like a clear explanation of Anthropic’s previous stance relative to the new one they were adopting, and their reasons for doing so. To the extent that they didn’t feel the need to do this, I worry that it’s because their previous stance was more of a vibe, and therefore non-binding. But if they were using that vibe to gain resources (funding, talent, etc.), then this seems quite bad to me. It shouldn’t both be the case that they benefit from ambiguity but then aren’t held to any of the consequences of breaking it. And indeed, this makes me pretty wary of other trust/deferral based support that people currently give to Anthropic. I think that if the RSP in fact indicates a departure from their previous stance of not meaningfully pushing the frontier, then this is a negative update about Anthropic holding to the spirit of their commitments.
If one of the effects of instituting a responsible scaling policy was that Anthropic moved from the stance of not meaningfully pushing the frontier to “it’s okay to push the frontier so long as we deem it safe,” this seems like a pretty important shift that was not well communicated. I, for one, did not interpret Anthropic’s RSP as a statement that they were now okay with advancing state of the art, nor did many others; I think that’s because the RSP did not make it clear that they were updating this position. Like, with hindsight I can see how the language in the RSP is consistent with pushing the frontier. But I think the language is also consistent with not pushing it. E.g., when I was operating under the assumption that Anthropic had committed to this, I interpreted the RSP as saying “we’re aiming to scale responsibly to the extent that we scale at all, which will remain at or behind the frontier.”
Attempting to be forthright about this would, imo, look like a clear explanation of Anthropic’s previous stance relative to the new one they were adopting, and their reasons for doing so. To the extent that they didn’t feel the need to do this, I worry that it’s because their previous stance was more of a vibe, and therefore non-binding. But if they were using that vibe to gain resources (funding, talent, etc.), then this seems quite bad to me. It shouldn’t both be the case that they benefit from ambiguity but then aren’t held to any of the consequences of breaking it. And indeed, this makes me pretty wary of other trust/deferral based support that people currently give to Anthropic. I think that if the RSP in fact indicates a departure from their previous stance of not meaningfully pushing the frontier, then this is a negative update about Anthropic holding to the spirit of their commitments.