We also recently implemented a non-compliance reporting policy that allows employees to anonymously report concerns to our Responsible Scaling Officer about our implementation of our RSP.
This seems great, and I think it would be valuable for other labs to adopt a similar system.
What about whistleblowing or anonymous reporting to governments? If an Anthropic employee was so concerned about RSP implementation (or more broadly about models that had the potential to cause major national or global security threats), where would they go in the status quo?
If Anthropic is supportive of this kind of mechanism, it might be good to explicitly include this (e.g., “We also recently implemented a non-compliance reporting policy that allows employees to anonymously report concerns to specific officials at the [Department of Homeland Security, Department of Commerce].”)
What about whistleblowing or anonymous reporting to governments? If an Anthropic employee was so concerned about RSP implementation (or more broadly about models that had the potential to cause major national or global security threats), where would they go in the status quo?
That really seems more like a question for governments than for Anthropic! For example, the SEC or IRS whistleblower programs operate regardless of what companies purport to “allow”, and I think it’d be cool if the AISI had something similar.
If I were currently concerned about RSP implementation per se (I’m not), it’s not clear why the government would get involved in a matter of voluntary commitments by a private organization. If there were some concern touching on the White House commitments, Bletchley declaration, Seoul declaration, etc., then I’d look up the appropriate monitoring body; if in doubt, the Commerce whistleblower office or AISI seem like reasonable starting points.
That really seems more like a question for governments than for Anthropic
+1. I do want governments to take this question seriously. It seems plausible to me that Anthropic (and other labs) could play an important role in helping governments improve their ability to detect/process information about AI risks, though.
it’s not clear why the government would get involved in a matter of voluntary commitments by a private organization
Makes sense. I’m less interested in a reporting system along the lines of “tell the government that someone is breaking an RSP” and more interested in one along the lines of “tell the government if you are worried about an AI-related national security risk, regardless of whether that risk stems from a company breaking its voluntary commitments.”
My guess is that existing whistleblowing programs are the best bet right now, but it’s unclear to me whether they are staffed by people who understand AI risks well enough to know how to interpret/process/escalate such information (assuming the information ought to be escalated).