It seems to me like the strongest case for SB1047 is that it’s a transparency bill. As Zvi noted, it’s probably good for governments and for the world to be able to examine the Safety and Security Protocols (SSPs) of frontier AI companies.
But there are also some pretty important limitations. I think a lot of the bill’s value (assuming it passes) will be determined by how it’s implemented and whether or not there are folks in government who are able to put pressure on labs to be specific/concrete in their SSPs.
More thoughts below:
Transparency as an emergency preparedness technique
I often think in an emergency preparedness frame– if there was a time-sensitive threat, how would governments be able to detect the threat & make sure information about the threat was triaged/handled appropriately? It seems like governments are more likely to notice time-sensitive threats in a world where there’s more transparency, and forcing frontier AI companies to write/publish SSPs seems good from that angle.
In my model, a lot of risk comes from the government taking too long to react– either so long that an existential catastrophe actually occurs, or so long that by the time major intervention occurs, ASL-4+ models have been developed with poor security, and now it’s ~impossible to do anything except continue to race (“otherwise the other people with ASL-4+ models will cause a catastrophe”). Efforts to get the government to understand the state of risks and intervene before ASL-4+ models exist seem very important from that perspective. It seems to me like SSPs could accomplish this by (a) giving the government useful information and (b) making it “someone’s job” to evaluate the state of SSPs + frontier AI risks.
Limitation: Companies can write long and nice-sounding documents that avoid specificity and concreteness
The most notable limitation, IMO, is that it’s generally pretty easy for powerful companies to evade being fully transparent. Sometimes, people champion things like RSPs or the Seoul Commitments as these major breakthroughs in transparency. Although I do see these as steps in the right direction, their value should not be overstated. For example, even the “best” RSPs (OpenAI’s and Anthropic’s) are rather vague about how decisions will actually be made. Anthropic’s RSP essentially says “Company leadership will ultimately determine whether something is too risky and whether the safeguards are adequate” (with the exception of some specifics around security). OpenAI’s does a bit better IMO (from a transparency perspective) by spelling out the kinds of capabilities that they would consider risky, but they still provide company leadership ~infinite freedom RE determining whether or not safeguards are adequate.
Incentives for transparency are relatively weak, and the costs of transparency can be high. In Sam Bowman’s recent post, he mentions that detailed commitments (and we can extend this to detailed SSPs) can commit companies to “needlessly costly busy work.” A separate but related frame is that race dynamics mean that companies can’t afford to make detailed commitments. If I’m in charge of an AI company, I’d generally like to have some freedom/flexibility/wiggle room in how I make decisions, interpret evidence, conduct evaluations, decide whether or not to keep scaling, and make judgments around safety and security.
In other words, we should expect that at least some (maybe all) of the frontier AI companies will try to write SSPs that sound really nice but provide minimal concrete details. The incentives to be concrete/specific are not strong, and we already have some evidence of this from existing RSPs/Preparedness Frameworks (and note that I think companies other than OpenAI and Anthropic were even less detailed and concrete in their documents).
Potential solutions: Government capacity & whistleblower mechanisms
So what do we do about this? Are there ways to make SSPs actually promote transparency? If the government is able to tell that some companies are being vague/misleading in their SSPs, this could inspire further investigations/inquiries. We’ve already seen several Congresspeople send letters to frontier AI companies requesting more details about security procedures, whistleblower protections, and other safety/security topics.
So I think there are two things that can help: government capacity and whistleblower mechanisms.
Government capacity. The Frontier Model Division (FMD) was cut from the bill, but perhaps the Board of Frontier Models could provide this oversight. At the very least, the Board could provide an audience for the work of people like @Zach Stein-Perlman and @Zvi– people who might actually read through a complicated 50+ page SSP full of corporate niceties and be able to distill what’s really going on, what’s missing, what’s misleading, etc.
Whistleblower mechanisms. SB1047 provides a whistleblower mechanism & whistleblower protections (note: I see these as separate things and I personally think mechanisms are more important). Every frontier AI company has to have a platform through which employees (and contractors, I think?) are able to report if they believe the company is being misleading in its SSPs. This seems like a great accountability tool (though of course it relies on the whistleblower mechanism being implemented properly & relies on some degree of government capacity RE knowing how to interpret whistleblower reports.)
The final thing I’ll note is that I think the idea of full shutdown protocols is quite valuable. From an emergency preparedness standpoint, it seems quite good for governments to be asking “under what circumstances do you think a full shutdown is required” and “how would we actually execute/verify a full shutdown.”