I mean, I agree that humanity theoretically knows how to implement these sorts of security commitments, so with enough time and effort it should always be possible for Anthropic to satisfy the current conditions. But the sequencing commitment, that the security measures must be in place before Anthropic has an ASL-3 model, means there are situations where Anthropic is committed to pausing scaling until the security commitments are met.

I agree with you that this is a relatively weak commitment as scaling pauses go, though to be fair I don't actually think simply having (but not deploying) a just-barely-ASL-3 model poses much of a risk, so it makes sense from a risk-based perspective that most of the commitments are around deployment and security. That being said, even if a just-barely-ASL-3 model doesn't pose an existential risk, as long as ASL-3 is defined only by a lower bound rather than also an upper bound, the category will eventually contain models that do pose a potential existential risk, so I agree that a lot is tied up in the upcoming definition of ASL-4. Regardless, it is still the case that Anthropic has already committed to a scaling pause under certain circumstances.
I disagree that this is an accurate summary; it's only barely true denotatively, and misleading connotatively.
I do think it's probably best to let this discussion rest, not because it isn't important, but because actually resolving this kind of semantic dispute in public comments like this is really hard, and I think it's unlikely either of us will change our minds here; we've both made our points. I appreciate you responding to my comments.
I think there's a reasonable chance that the current security commitments will lead Anthropic to pause scaling (though I don't know whether Anthropic would announce it publicly if they paused internally). Maybe a Manifold market on this would be a good idea.
That seems cool! I made a market here:
Feel free to suggest edits about the operationalization or other things before people start trading.
Looks good. The only thing I would change is that I think this should probably resolve in the negative only once Anthropic has reached ASL-4, since only then will it be clear whether there was a security-related pause at any point during ASL-3.
That seems reasonable. Edited the description (I can’t change when trading on the market closes, but I think that should be fine).