Does anyone know why Anthropic doesn’t want models with powerful cyber capabilities to be classified as “dual-use foundation models?”
In its BIS comment, Anthropic proposes a new definition of dual-use foundation model that excludes cyberoffensive capabilities. This also comes up in TechNet’s response (TechNet is a trade association that Anthropic is a part of).
Does anyone know why Anthropic doesn’t want the cyber component of the definition to remain? (I don’t think they cover this in the comment).
---
More details: the original criteria for “dual-use foundation model” proposed by BIS are:
(1) Substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;
(2) Enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyberattacks; or
(3) Permitting the evasion of human control or oversight through means of deception or obfuscation.
Anthropic’s proposed definition retains criteria #1 and #3 but excludes criterion #2.
(Separately, Anthropic argues that dual-use foundation models should be defined as those that pose catastrophic risks as opposed to serious risks to national security. This is important too, but I’m less confused about why Anthropic wants this.)
Wild speculation: they also have a sort of we’re-watching-but-unsure provision about cyber operations capability in their most recent RSP update. In it, they say in part that “it is also possible that by the time these capabilities are reached, there will be evidence that such a standard is not necessary (for example, because of the potential use of similar capabilities for defensive purposes).” Perhaps they’re thinking that automated vulnerability discovery is at least plausibly on-net-defensive-balance-favorable*, and so they aren’t sure it should be regulated as closely, even if it is still, in some informal sense, “dual use”?
Again, WILD speculation here.
*A claim that is clearly seen as plausible by, e.g., the DARPA AI Cyber Challenge (AIxCC) effort.