And to elaborate a little bit (based on my own understanding, not what they told me): their RSP sort of says the opposite. To avoid a “race to the bottom”, they base the decision to deploy a model on what harm it could cause, regardless of what models other companies have released. So if someone else releases a model with potentially dangerous capabilities, Anthropic can’t/won’t use that as cover to release something similar that they wouldn’t have released otherwise. I’m not certain whether this is the best approach, but I do think it’s coherent.
Yep:

Note that ASLs are defined by risk relative to baseline, excluding other advanced AI systems. This means that a model that initially merits ASL-3 containment and deployment measures for national security reasons might later be reduced to ASL-2 if defenses against national security risks (such as biological or cyber defenses) advance, or if dangerous information becomes more widely available. However, to avoid a “race to the bottom”, the latter should not include the effects of other companies’ language models; just because other language models pose a catastrophic risk does not mean it is acceptable for ours to.

Source