Consider someone asking the open-source, de-censored equivalent of GPT-6 how to create a humanity-ending pandemic. I expect it would read virology papers, figure out what sort of engineered pathogen might be appropriate, walk you through all the steps of duping multiple biology-as-a-service organizations into creating it for you, and give you advice on how to release it for maximum harm.
This commits a common error in these scenarios: implicitly assuming that the only person in the entire world who has access to the LLM is a terrorist, while everyone else is stuck on 2023 technology. Stated explicitly, it’s absurd, right? (We’ll call the open-source, de-censored equivalent of GPT-6 Llama-5, for brevity.)
If the terrorist has Llama-5, so do the biology-as-a-service orgs, law-enforcement agencies, and so on. If the biology-as-a-service orgs are following your suggestion to screen for pathogens (which is sensible), their Llama-5 is going to say, “Ah, this is exactly what a terrorist would ask for if they were trying to trick us into making a pathogen.” Notably, the defenders need a version that can describe the threat scenario, i.e. an uncensored version of the model!
In general, beyond just bioattack scenarios, any argument purporting to demonstrate the dangers of open-source LLMs must assume that the defenders also have access. Everyone having access is part of the point of open source, after all.
Edit: I might as well state my own intuition here that:
In the long run, equally increasing the intelligence of attacker and defender favors the defender.
In the short run, new attacks can be developed faster than defenses can be hardened against them.
If that’s the case, it argues for an approach similar to delayed-disclosure policies in computer security: if a new model enables attacks against some existing services, give those services early access and time to fix the vulnerabilities, then proceed with a wide release.
I’m not assuming that the only person with Llama-5 is the one intent on causing harm. Instead, I unfortunately think the sphere of biological attacks is, at least currently, much more favorable to attackers than to defenders.
If the biology-as-a-service orgs are following your suggestion to screen for pathogens
I’m not sure we get to assume that? Screening is far from universal today, and it isn’t mandatory.
their Llama-5 is going to say, “Ah, this is exactly what a terrorist would ask for if they were trying to trick us into making a pathogen”
This only works if the screener sees enough of the genome at once that Llama-5 can figure out what it does, but that’s easy to work around: split the synthesis orders into fragments across multiple providers, and no single screener ever has enough context to raise a flag. The sketch below illustrates the gap.
In general, beyond just bioattack scenarios, any argument purporting to demonstrate the dangers of open-source LLMs must assume that the defenders also have access
Sure!
If that’s the case, it argues for an approach similar to delayed-disclosure policies in computer security: if a new model enables attacks against some existing services, give those services early access and time to fix the vulnerabilities, then proceed with a wide release.
I don’t actually disagree with this! The problem is that the current state of biosecurity is so bad that we need to fix quite a few things first. Once we do have biology-as-a-service KYC, good synthesis screening, restricted access to biological design tools, metagenomic surveillance, much better PPE, etc., then I don’t see Llama-5 as making us appreciably less safe from bioattacks. But that’s much more than a 90-day disclosure window! I go deeper into this in Biosecurity Culture, Computer Security Culture.