@Zach Stein-Perlman @ryan_greenblatt Feel free to ignore, but I'd be curious for one of you to explain your disagree react. In particular, I'd appreciate hearing some of the ways in which you think I might be underestimating the costliness of transparency requirements.
(My current estimate is that whistleblower mechanisms seem very easy to maintain, reporting requirements for natsec capabilities seem relatively easy insofar as most of the information is stuff you already planned to collect, and even many of the more involved transparency ideas (e.g., interview programs) seem like they could be implemented with pretty minimal time-cost.)
It depends on the type of transparency.
Requirements which just involve informing one part of the government (say, US AISI) in ways that don't cost much personnel time mostly have the effect of potentially making the government much more aware at some point. I think this is probably good, but presumably labs would prefer to retain flexibility, and making the government more aware can go wrong (from the lab's perspective) in ways other than safety-focused regulation (e.g., causing labs to be merged into a national program to advance capabilities more quickly).
Whistleblower requirements with teeth can raise information leakage concerns. (Internal-only whistleblowing policies like Anthropic's don't have this issue, but they also have basically no teeth for the company overall.)
Any sort of public discussion (about e.g. threat models, government involvement, risks) can have various PR and reputational costs. (Idk if you were counting this under transparency.)
To the extent you expect to disagree with third-party inspectors about safety (and they aren't totally beholden to you), this might end up causing issues for you later.
I'm not claiming that "reasonable" labs shouldn't do various types of transparency unilaterally, but I don't think the main cost is in making safety-focused regulation more likely.
TY. Some quick reactions below:
Agree that the public stuff has immediate effects that could be costly. (Hiding stuff from the public, refraining from discussing important concerns publicly, or developing a reputation for being kinda secretive/sus can also be costly; seems like an overall complex thing to model IMO.)
Sharing info with the government could increase the chance of a leak, especially if security isn't great. But I expect the most relevant info is info that wouldn't be all that costly if leaked (e.g., the government doesn't need OpenAI to share its secret sauce/algorithmic secrets; dangerous capability eval results or capability forecasts leaking seems less costly, except from a "maybe people will respond by demanding more govt oversight" POV).
I think, all in all, I still see the main cost as making safety regulation more likely, but I'm more uncertain now, and this doesn't seem like a particularly important/decision-relevant point. I'll edit the OG comment to language that I endorse with more confidence.