I think I basically agree with “current models don’t seem very helpful for bioterror” and as far as I can tell, “current papers don’t seem to do the controlled experiments needed to legibly learn that much either way about the usefulness of current models” (though generically evaluating bio capabilities prior to actual usefulness still seems good to me).
I also agree that current empirical work seems unfortunately biased, which is pretty sad. (It also seems to me like the claims made are somewhat sloppy in various cases, and that the work doesn’t really do the science needed to test claims about the usefulness of current models.)
That said, I think you’re exaggerating the situation in a few cases or understating the case for risk.
You need to also compare the good that open source AI would do against the likelihood and scale of the increased biorisk. The 2001 anthrax attacks killed 5 people; if open source AI accelerated the cure for several forms of cancer, then even a hundred such attacks could easily be worth it. Serious deliberation about the actual costs of criminalizing open source AI—deliberations that do not rhetorically minimize such costs, shrink from looking at them, or emphasize “other means” of establishing the same goal that in fact would only do 1% of the good—would be necessary for a policy paper to be a policy paper and not a puff piece.
I think comparing to known bioterror cases is probably misleading due to the potential for tail risks and difficulty in attribution. In particular, consider covid. It seems reasonably likely that covid was an accidental lab leak (though attribution is hard) and it also seems like it wouldn’t have been that hard to engineer covid in a lab. And the damage from covid is clearly extremely high. Much higher than the anthrax attacks you mention. People in biosecurity think that the tails are more like billions dead or the end of civilization. (I’m not sure if I believe them, the public object level cases for this don’t seem that amazing due to info-hazard concerns.)
Further, suppose that open-sourced AI models could assist substantially with curing cancer. In that world, what probability would you assign to these AIs also assisting substantially with bioterror? It seems pretty likely to me that things are dual use in this way. I don’t know if policy papers are directly making this argument: “models will probably be about as useful for bioterror as for curing cancer, so if you want open source models to be very useful for biology, we might be in trouble”.
This citation and funding pattern leads me to consider two potential hypotheses:
What about the following hypothesis (I wrote this based on the first hypothesis you wrote):
Open Philanthropy thought, based on priors and basic reasoning, that it’s pretty likely that LLMs (if they’re a big deal at all) would be very useful for bioterrorists. Motivated by preparing for the possibility that LLMs could help with bioterror, they fund preliminary experiments designed to roughly test LLM capabilities for bio (in particular, creating pathogens), so that we’d be ready to test future LLMs for bio capabilities. Alas, the resulting papers misleadingly suggest that their results imply that current models are actually useful for bioterror, rather than just that “it seems likely on priors” and “models can do some biology, and probably this will transfer to bioterror as they get better”. These papers—some perhaps in themselves blameless, given the tentativeness of their conclusions—are misrepresented by subsequent policy papers that cite them. Thus, due to no one’s intent, insufficiently justified concerns about current open-source AI are propagated to governance orgs, which recommend banning open source based on this research. Concerns about future open source models remain justified, as most of our evidence for this came from basic reasoning rather than experiments.
Edit: see comment from elifland below. It doesn’t seem like current policy papers are advocating for bans on current open source models.
I feel like the main take is “probably if models are smart, they will be useful for bioterror” and “probably we should evaluate this ongoingly and be careful because it’s easy to finetune open source models and you can’t retract open-sourcing a model”.
Thus, due to no one’s intent, insufficiently justified concerns about current open-source AI are propagated to governance orgs, which recommend banning open source based on this research.
The recommendation that current open-source models should be banned is not present in the policy paper being discussed, AFAICT. The paper’s recommendations are pictured below:
Edited to add: there is a specific footnote on page 31 that says “Note that we do not claim that existing models are already too risky. We also do not make any predictions about how risky the next generation of models will be. Our claim is that developers need to assess the risks and be willing to not open-source a model if the risks outweigh the benefits.”
Kevin Esvelt explicitly calls for not releasing future model weights.
To be clear, Kevin Esvelt is the author of the “Dual-use biotechnology” paper, which the policy paper cites, but he is not the author of the policy paper.
Exactly. I’m getting frustrated when we talk about risks from AI systems with the open source or e/acc communities. The open source community seems to consistently assume that the concerns are about current AI systems, and that current systems are enough to lead to significant biorisk. Nobody serious is claiming this, and it’s not what I’m seeing in any policy document or paper. And this difference in starting points between the AI Safety community and the open source community pretty much makes all the difference.
Sometimes I wonder if the open source community is making this assumption on purpose because it is rhetorically useful to say “oh, you think a little chatbot that has the same information as a library would cause a global disaster?” It’s a common tactic these days: downplay the capabilities of the AI system we’re talking about, and then make it seem ridiculous to regulate it. If I’m being charitable, my guess is that they assume the bar for “enforce safety measures when stakes are sufficiently high” will be considerably lower than what makes sense/they’d prefer, OR they want to wait until the risk is here, demonstrated, and obvious before we do anything.
As someone who is pro-open-source, I do think that “AI isn’t useful for making bioweapons” is ultimately a losing argument, because AI is increasingly helpful at doing many different things, and I see no particular reason that making bioweapons would be an exception. However, that’s also true of many other technologies: good luck making your bioweapon without electric lighting, paper, computers, etc. It wouldn’t be reasonable to ban paper just because it’s handy for the lab notebooks in a bioweapons lab.
What would be more persuasive is some evidence that AI is relatively more useful for making bioweapons than it is for doing things in general. It’s a bit hard for me to imagine that being the case, so if it turned out to be true, I’d need to reconsider my viewpoint.
What would be more persuasive is some evidence that AI is relatively more useful for making bioweapons than it is for doing things in general.
I see little reason to use that comparison rather than “will [category of AI models under consideration] improve offense (in bioterrorism, say) relative to defense?”
The open source community seems to consistently assume that the concerns are about current AI systems, and that current systems are enough to lead to significant biorisk. Nobody serious is claiming this
(explaining my disagree reaction)
I see a lot of rhetorical equivocation between risks from existing non-frontier AI systems, and risks from future frontier or even non-frontier AI systems. Just this week, an author of the new “Will releasing the weights of future large language models grant widespread access to pandemic agents?” paper was asserting that everyone on Earth has been harmed by the release of Llama2 (via increased biorisks, it seems). It is very unclear to me which future systems the AIS community would actually permit to be open-sourced, and I think that uncertainty is a substantial part of the worry from open-weight advocates.
I’m happy to see that comment being disagreed with. I think I could say they aren’t a truly serious person after making that comment (I think the paper is fine), but let’s say that’s one serious person suggesting something vaguely similar to what I said above.
And I’m also frustrated at people within the AI Safety community who are ambiguous about which models they are talking about (which leads to posts like this and makes consensus harder). Even worse if it’s on purpose, for rhetorical reasons.
Note that one of the main people pushing back against the comment you link is me, a member of the AI safety community.
Noted! I think there is substantial consensus within the AIS community on a central claim that the open-sourcing of certain future frontier AI systems might unacceptably increase biorisks. But I think there is not much consensus on a lot of other important claims, like for which (future or even current) AI systems open-sourcing is acceptable and for which ones open-sourcing unacceptably increases biorisks.
I agree it would be nice to have strong categories or formalism pinning down which future systems would be safe to open source, but it seems an asymmetry in expected evidence to treat a non-consensus on systems which don’t exist yet as a pro-open-sourcing position. I think it’s fair to say there is enough of a consensus that we don’t know which future systems would be safe and so need more work to determine this before irreversible proliferation.
I think it’s quite possible that open source LLMs above the capability of GPT-4 will be banned within the next two years on the grounds of biorisk.
The White House Executive Order requests a government report on the costs and benefits of open source frontier models and recommended policy actions. It also requires companies to report on the steps they take to secure model weights. These are the kinds of actions the government would take if they were concerned about open source models and thinking about banning them.
This seems like a foreseeable consequence of many of the papers above, and perhaps the explicit goal.
As an addition: Anthropic’s RSP already has GPT-4-level models locked up behind safety level 2.
Given that they explicitly want their RSPs to be a model for laws and regulations, I’d be only mildly surprised if we got laws banning open source even at GPT-4 level. I think many people are actually shooting for this.
If that’s what they are shooting for, I’d be happy to push them to be explicit about this if they haven’t already.
I’d like them to be explicit about how they expect biorisk to happen at that level of capability, but I think at least some of them will keep quiet about this for ‘infohazard reasons’ (that was my takeaway from one of the Dario interviews).
The Nuclear Threat Initiative has a wonderfully detailed report on AI biorisk, in which they more or less recommend that AI models which pose biorisks should not be open-sourced:
Access controls for AI models. A promising approach for many types of models is the use of APIs that allow users to provide inputs and receive outputs without access to the underlying model. Maintaining control of a model ensures that built-in technical safeguards are not removed and provides opportunities for ensuring user legitimacy and detecting any potentially malicious or accidental misuse by users.
It seems reasonably likely that covid was an accidental lab leak (though attribution is hard) and it also seems like it wouldn’t have been that hard to engineer covid in a lab.
Seems like a positive update on human-caused bioterrorism, right? It’s so easy for stuff to leak that covid got out by accident, and it might even have been easy to engineer, but (apparently) no one engineered it, nor am I aware of this kind of intentional bioterrorism happening elsewhere. People apparently aren’t doing it. See Gwern’s “Terrorism is not effective”.
Maybe smart LLMs come out. I bet people still won’t be doing it.
So what’s the threat model? One can say “tail risks”, but—as OP points out—how much do LLMs really accelerate people’s ability to deploy dangerous pathogens, compared to current possibilities? And what off-the-cuff probabilities are we talking about, here?
Much higher than the anthrax attacks you mention. People in biosecurity think that the tails are more like billions dead or the end of civilization. (I’m not sure if I believe them, the public object level cases for this don’t seem that amazing due to info-hazard concerns.)
As a biologist who has thought about these kinds of things (and participated in a forecasting group about them), I agree. (And there are very good reasons for not making the object-level cases public!)
In particular, consider covid. It seems reasonably likely that covid was an accidental lab leak (though attribution is hard) and it also seems like it wouldn’t have been that hard to engineer covid in a lab. And the damage from covid is clearly extremely high. Much higher than the anthrax attacks you mention. People in biosecurity think that the tails are more like billions dead or the end of civilization. (I’m not sure if I believe them, the public object level cases for this don’t seem that amazing due to info-hazard concerns.)
I agree that if future open source models contribute substantially to the risk of something like covid, that would be a component in a good argument for banning them.
I’m dubious (I haven’t seen much evidence) that covid itself is evidence that future open source models would so contribute. To the best of my very limited knowledge, the research being conducted was pretty basic (knowledge-wise) but rather expensive (equipment- and time-wise), so an LLM wouldn’t have removed a blocker. (I mean, that’s why it came from a US- and Chinese-government-sponsored lab for which resources were not an issue, no?) If there is an argument to this effect, I 100% agree it is relevant. But I haven’t looked into the sources of covid for years anyhow, so I’m super fuzzy on this.
Edited ending to be more tentative in response to critique.
Further, suppose that open-sourced AI models could assist substantially with curing cancer. In that world, what probability would you assign to these AIs also assisting substantially with bioterror?
Fair point. Certainly more than in the other world.
I do think that your story is a reasonable mean between the two hypotheses, with less intentionality, which is a reasonable prior for organizations in general.
I think the prior of “we should evaluate things ongoingly and be careful about LLMs”, when contrasted with “we are releasing this information on how to make plagues in raw form into the wild every day with no hope of retracting it right now”, is simply an unjustified focus of one’s hypotheses on LLMs causing dangers, as against all the other things in the world more directly contributing to the problem. I think a clear exposition of why I’m wrong about this would be more valuable than any of the experiments I’ve outlined.