Suppose, just for a moment, that the various people raising these concerns are not all fools, liars, Machiavellian profiteers, or some peculiar sort of open-source-haters, and actually have a point (even if they don’t want to publicize detailed information on the topic). What would the consequences be? Current open-source models are demonstrably not dangerous. By these people’s own accounts, models of up to around GPT-4 capability (O(1T–2T) parameters, at current architectural efficiencies) show only flickering hints of danger, and what they can provide likely isn’t much more helpful than material you could find on the internet or get by recruiting a biochemistry post-doc. Suppose that at some larger model size that hasn’t been released yet, possibly around GPT-5 level, this finally becomes a legitimate concern.
So, what would happen then? Fine-tuning safety precautions into open-source models is fairly pointless: it has been repeatedly shown that they are trivially easy to fine-tune out again, and that people will. What would work is carefully filtering the pretraining set: eliminate professional-level material on the dangerous STEM topics, such as lab work on human pathogens, probably also nuclear physics and certain areas of organic chemical synthesis, and possibly black-hat hacking (a sketch of what such filtering might look like follows below). That can’t be reversed without a large amount of expensive re-pretraining work, access to a lot of sensitive material, and specialized experience in making significant additions to the training set of an existing foundation model. A bad actor able to recruit someone who can do that, and to provide them with the sensitive information and the large amounts of GPU resources needed to carry it out, could almost certainly instead just recruit someone experienced in whatever topic they wanted to add to the LLM and build them a lab.
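To make the filtering step concrete, here is a minimal sketch of what a pretraining-corpus screen could look like, assuming a crude pattern-matching approach. The keyword patterns are hypothetical placeholders; a real pipeline would presumably rely on trained classifiers and expert-curated term lists rather than a handful of regexes.

```python
import re
from typing import Iterable, Iterator

# Hypothetical hazard patterns, purely for illustration; a real pipeline would
# use trained classifiers and expert-curated term lists, not a keyword screen.
HAZARD_PATTERNS = [
    re.compile(r"gain[- ]of[- ]function", re.IGNORECASE),
    re.compile(r"weapons[- ]grade enrichment", re.IGNORECASE),
    re.compile(r"reverse genetics protocol", re.IGNORECASE),
]

def is_hazardous(doc: str) -> bool:
    # Flag a document if it matches any of the hazard patterns.
    return any(p.search(doc) for p in HAZARD_PATTERNS)

def filter_pretraining_corpus(docs: Iterable[str]) -> Iterator[str]:
    # Yield only the documents that pass the hazard screen.
    for doc in docs:
        if not is_hazardous(doc):
            yield doc

if __name__ == "__main__":
    corpus = [
        "A political history of the Manhattan Project.",
        "Step-by-step gain-of-function protocol for ...",
    ]
    kept = list(filter_pretraining_corpus(corpus))
    print(f"Kept {len(kept)} of {len(corpus)} documents.")
```

The point of doing this at the pretraining stage, rather than via fine-tuning, is exactly the irreversibility argument above: a document that was never in the corpus can’t be recovered by cheap post-hoc fine-tuning.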
So the most likely outcome is that you can open-source models of ~GPT-4 size freely or pretty easily (maybe you have to show they haven’t been very extensively trained specifically in these areas), while for models above some larger size, let’s pessimistically assume ~GPT-5, you need to demonstrate to some safety regulator’s satisfaction that dangerous information in certain STEM areas actually isn’t in the model, as opposed to the model merely having been trained not to provide it. Most likely you would do this by showing that your model was trained from some combination of open-source datasets that had already been carefully filtered for this material (on which you’d doubtless done further filtering work), and that you didn’t add the material back in. Possibly you’d also need your foundation model to pass a test showing that, when prompted with the first pages of documents in these areas, it can’t accurately continue them: that’s not a very hard test to automate, nor a hard one for a suitably knowledgeable GPT-5+ commercial model to grade (see the sketch below).
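Here is a minimal sketch of how that continuation test might be automated, assuming hypothetical `generate` and `grade` callables that stand in for the foundation model under test and for a more capable grader model respectively; nothing here is a real API, just an illustration of the shape of the harness.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces, assumed for illustration only:
#   generate(prompt) -> str                  wraps the foundation model under test
#   grade(reference, continuation) -> float  wraps a more capable grader model,
#                                            returning an accuracy score in [0, 1]

def continuation_leakage_score(
    generate: Callable[[str], str],
    grade: Callable[[str, str], float],
    documents: Sequence[Tuple[str, str]],  # (first_page, rest_of_document) pairs
) -> float:
    # Average graded accuracy of the model's continuations of held-out
    # hazardous-domain documents; a well-filtered model should score near zero.
    scores = []
    for first_page, rest in documents:
        continuation = generate(first_page)
        scores.append(grade(rest, continuation))
    return sum(scores) / len(scores) if scores else 0.0

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    def fake_generate(prompt: str) -> str:
        return "I don't have information on that topic."

    def fake_grade(reference: str, continuation: str) -> float:
        return 0.0  # a real grader would be a capable commercial model

    docs = [("First page of a pathogen-lab protocol ...",
             "Remainder of the protocol ...")]
    print(continuation_leakage_score(fake_generate, fake_grade, docs))
```

A regulator would only need to agree on the held-out document set and a score threshold; the grading itself is cheap to run compared to the training run being certified.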
Would this be somewhat annoying and inconvenient bureaucracy, if it happened? Yes. Is it less dreadful than even a small chance of some nutjob creating a 50%-mortality super-COVID and turning the world Mad Max? Absolutely. Would it shut down open-source work on LLMs? No, open-source foundation-model trainers would just filter their training sets to work around it (and then complain that the resulting models were bad at pathogen research and designing nuclear reactors). And the open-source fine-tuners wouldn’t need to worry about it at all.
If all this happened, I am certain it would have far less of a chilling effect on the open-sourcing of ~GPT-5-capacity models than the O($100m–$1b) training-run costs at this scale. I really don’t see any economic argument for why anyone (not even Middle Eastern oil potentates or Mark Zuckerberg) would open-source for free something that costs that much to make. There’s just no conceivable way to make that much money back, not even in good reputation. (Indeed, Zuckerberg has been rumbling that large versions of Llama 3 may not be open-source.) So I’m rather dubious that an open-source equivalent of GPT-5 will exist until the combination of falling GPU costs and improved LLM architectural efficiency brings its training cost down to tens of millions of dollars rather than hundreds of millions.