Could a virologist actually tell you how to start a pandemic? The paper you’re discussing says they couldn’t.
In the post I claim that (a) a virologist could walk you through synthesizing 1918 flu and (b) one that could read and synthesize the literature could tell you how to create a devastating pandemic. I also think (c) some people already know how to create one but are very reasonably not publishing how. I don’t see the article contradicting this?
This would be far less costly than banning all open source LLMs.
I’m more pessimistic about being able to restrict BDTs than general LLMs, but I also think this would be very good.
Another part of the problem is that telling people how to cause pandemics is only one example of how AI systems can spread dangerous knowledge (in addition to their benefits!), and once you publish the weights of a model there's no going back.
I’m more pessimistic about being able to restrict BDTs than general LLMs, but I also think this would be very good.
Why do you think so? LLMs seem far more useful to a far wider group of people than BDTs, so I would expect it to be easier to ban an application-specific technology than a general one. The White House Executive Order requires mandatory reporting of AI trained on biological data at a lower FLOP threshold than for any other kind of data, which suggests they're concerned that AI + bio models are particularly dangerous.
Restricting something that biologists are already doing would create a natural constituency of biologists opposed to your policy. But the same could be said of restricting open source LLMs—there are probably many more people using open source LLMs than using biological AI models.
Maybe bio policies will be harder to change because they’re more established, whereas open source LLMs are new and therefore a more viable target for policy progress?
I take the following quote from the paper as evidence that virologists today are incapable of identifying pathogens with pandemic potential, even with funding and support from government agencies:
some research agencies actively support efforts to find or create new potential pandemic viruses and share their genome sequences in hopes of developing better defenses, their efforts have not yet succeeded in identifying credible examples.
Corroborating this is Kevin Esvelt’s paper Delay, Detect, Defend, which says:

We don’t yet know of any credible viruses that could cause new pandemics, but ongoing research projects aim to publicly identify them. Identifying a sequenced virus as pandemic-capable will allow >1,000 individuals to assemble it.
Perhaps these quotes are focusing on global catastrophic biorisks, which would be more destructive than typical pandemics. I think this is an important distinction: we might accept extreme sacrifices (e.g. state-mandated vaccination) to prevent a pandemic from killing billions, without being willing to accept those sacrifices to avoid COVID-19.
I’d be interested to read any other relevant sources here.
On the 80k podcast Kevin Esvelt gave a 5% chance of 1918 flu causing a pandemic if released today. In 1918 it killed ~50M out of a global population of ~1.8B, so the equivalent today would be ~225M. Possibly higher today since we’re more interconnected, possibly lower given existing immunity (though recall that we’re conditioning on it taking off as a pandemic). Then 5% of that is an expected “value” of ~11M deaths, a bit more than half what we saw with covid. And 1918 deaths skewed much younger than covid, so probably a good bit worse in terms of expected life-years lost.
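A quick back-of-envelope sketch of that arithmetic (the ~8B current world population is my assumption; the other figures are from the estimate above):

```python
# Back-of-envelope: expected deaths from a 1918 flu release today,
# scaling the 1918 death toll to the current world population.
p_pandemic = 0.05    # hypothetical chance it takes off if released
deaths_1918 = 50e6   # ~50M deaths in 1918
pop_1918 = 1.8e9     # ~1.8B people alive in 1918
pop_today = 8e9      # ~8B today (assumed)

deaths_if_pandemic = deaths_1918 / pop_1918 * pop_today  # ~222M
expected_deaths = p_pandemic * deaths_if_pandemic        # ~11M

print(f"deaths if it takes off: {deaths_if_pandemic / 1e6:.0f}M")
print(f"expected deaths: {expected_deaths / 1e6:.1f}M")
```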
I think if we get to the point where someone can be reasonably confident that releasing a specific pathogen would wipe out humanity, the risk would be non-linearly higher, since I think there are more committed people who would see everyone dying as a valuable goal than who would see mere mass death as one. But the expected harm from 1918 alone is still high enough that I think the folks who reconstructed and then published the sequence were reckless, and that LLMs have already increased the danger further (or will soon) by bringing creation and release within the abilities of more people.
It sounds like it was a hypothetical estimate, not a best guess. From the transcript:
if we suppose that the 1918 strain has only a 5% chance of actually causing a pandemic if it were to infect a few people today. And let’s assume...
Here’s another source which calculates that the annual probability of more than 100M influenza deaths is 0.01%, i.e. we should expect one such pandemic every 10,000 years. This seems to be fitted to historical data which does not include deliberate bioterrorism, so we should revise that estimate upwards, but I’m not sure to what extent the estimate is driven by a low probability of a dangerous strain being reintroduced vs. an expectation of a low death count even with bioterrorism.
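For concreteness, here’s the return-period arithmetic implied by that figure (the per-century number is my own extension, assuming independence across years):

```python
# Convert the quoted annual probability of a >100M-death influenza pandemic
# into an expected waiting time and a rough per-century probability.
p_annual = 1e-4  # 0.01% per year

return_period = 1 / p_annual               # 10,000 years
p_per_century = 1 - (1 - p_annual) ** 100  # ~1%

print(f"expected one such pandemic every {return_period:,.0f} years")
print(f"chance of at least one in the next 100 years: {p_per_century:.1%}")
```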
From my inside view, it would surprise me if no known pathogens are capable of causing pandemics! But it’s stated as fact in the executive summary of Delay, Detect, Defend and in the NTI report, so for now I’m inclined to trust it. I’m trying to build a better nuts-and-bolts understanding of biorisks, so I’d be interested in any other data points here.
It sounds like it was a hypothetical estimate, not a best guess
Thanks for checking the transcript! I don’t know how seriously you want to take this but in conversation (in person) he said 5% was one of several different estimates he’d heard from virologists. This is a tricky area because it’s not clear we want a bunch of effort going into getting a really good estimate, since (a) if it turns out the probability is high then publicizing that fact likely means increasing the chance we get one and (b) building general knowledge on how to estimate the pandemic potential of viruses seems also likely net negative.
Here’s another source which calculates that the annual probability of more than 100M influenza deaths is 0.01% …
I think maybe we are talking about estimating different things? The 5% estimate was how likely you are to get a 1918 flu pandemic conditional on release.
More from the NTI report:

A few experts believe that LLMs could already or soon will be able to generate ideas for simple variants of existing pathogens that could be more harmful than those that occur naturally, drawing on published research and other sources. Some experts also believe that LLMs will soon be able to access more specialized, open-source AI biodesign tools and successfully use them to generate a wide range of potential biological designs. In this way, the biosecurity implications of LLMs are linked with the capabilities of AI biodesign tools.
5% was one of several different estimates he’d heard from virologists.
Thanks, this is helpful. And I agree there’s a disanalogy between the 1918 hypothetical and the source.
it’s not clear we want a bunch of effort going into getting a really good estimate, since (a) if it turns out the probability is high then publicizing that fact likely means increasing the chance we get one and (b) building general knowledge on how to estimate the pandemic potential of viruses seems also likely net negative.
This seems like it might be overly cautious. Bioterrorism is already quite salient, especially with Rishi Sunak, the White House, and many mainstream media outlets speaking publicly about it. Even SecureBio is writing headline-grabbing papers about how AI can be used to cause pandemics. In that environment, I don’t think biologists and policymakers should refrain from gathering evidence about biorisks and how to combat them. The marginal contribution to public awareness would be relatively small, and a better understanding of the risks could lead to a net improvement in biosecurity.
For example, estimating the probability that known pathogens would cause 100M+ deaths if released is an extremely important question for deciding whether open source LLMs should be banned. If the answer is demonstrably yes, I’d expect the White House to significantly restrict open source LLMs within a year or two. This benefit would be far greater than the cost of raising the issue’s salience.
And from a new NTI report: “Furthermore, current LLMs are unlikely to generate toxin or pathogen designs that are not already described in the public literature, and it is likely they will only be able to do this in the future by incorporating more specialized AI biodesign tools.”
https://www.nti.org/wp-content/uploads/2023/10/NTIBIO_AI_FINAL.pdf