CHAT Diplomacy: LLMs and National Security
“The view, expressed by almost all competent atomic scientists, that there was no ‘secret’ about how to build an atomic bomb was thus not only rejected by influential people in the U.S. political establishment, but was regarded as a treasonous plot.”
Robert Oppenheimer: A Life Inside the Center, Ray Monk.
[This essay addresses the probability and existential risk of AI through the lens of national security, which the author believes is the most impactful way to address the issue. The author therefore restricts the argument to specific near-term versions of Processes for Automating Scientific and Technological Advancement (PASTAs) and human-level AI.]
--
Are Advances in LLMs a National Security Risk?
“This is the number one thing keeping me up at night… reckless, rapid development. The pace is frightening… It is teaching itself more than we programmed it for.”
Kiersten Todt, Chief of Staff, CISA [1]
National security apparatuses around the world should see advances in LLMs as a risk that requires joint diplomatic efforts, even with current adversaries. This essay explores why that may be the case.
The key steps of the argument are as follows:
1. The cost of malicious and autonomous attacks is falling.
2. If the cost of defense is not falling at the same rate, then the current balance of forces between cybersecurity and cyberattack will favor attack, creating a cascade of vulnerabilities across critical systems.
3. These generative AI models will proliferate (at least across governments).
4. Thus security and information sharing between public and private sectors will be essential for ensuring best practices in security and defense.
5. But the number of vulnerabilities is also increasing. Thus the potential for explosive escalation and/or destabilization of regimes will be great. Non-state actors will increasingly be able to operate in what were previously nation-state level activities.
6. Thus I conclude that capabilities monitoring and both public-private and joint-diplomatic efforts are likely required to protect citizen interests and prevent human suffering.
The Cost of Attacks is Falling
The cost of sophisticated attacks has been falling for some time. Rising cybersecurity insurance premiums indicate that defense is losing to offense [2]: market rates suggest liability is increasing faster than defensive protocols can be reliably implemented. LLMs accelerate this dynamic. Under some basic assumptions, they may accelerate it significantly.
First we should review the general abilities of current LLMs:
Current LLMs reduce the human labor and cognitive costs of programming by about 2x [3]. There is no reason to expect we have reached a plateau of what is possible for generative models.
“Is it, in the next year, going to automate all attacks on organizations? Can you give it a piece of software and tell it to identify all the zero day exploits? No. But what it will do is optimize the workflow. It will really optimize the ability of those who use those tools to do better and faster.” Rob Joyce, Director of Cybersecurity at the National Security Agency (NSA) 04/11/2023 [4]
Are people using LLMs to aid in finding zero-day exploits? Yes. But the relevant question is how much of a difference this makes to current malicious efforts. GPT-4 can be used to identify and create many types of exploits, even without fine-tuning [5]. Fine-tuning will increase current and future models’ abilities to identify classes of zero-day vulnerabilities and other lines of attack on a system.
As a general-purpose technology, LLMs decrease the cost of technical attacks and the cost of human-manipulation attacks through phishing and credible interactions. In the very short term, Joyce is concerned about the manipulation angle. But in cybersecurity, humans are often the weakest link, especially for jumping air gaps. Thus the combination of human manipulation and technical ability by human-aided Agent Models creates the potential for cheaper, more aggressive, and more effective attacks [6].
Automatic General Systems are Possible Now to Varying Degrees
While much concern is raised about when AGI will be possible, current systems can already:
1) Produce output far faster than humans: roughly 50,000 tokens per minute, or around 37,000 words.
2) Simulate human interaction at human level for many tasks (cf. the Zelensky spoof [7]).
3) Solve problems beyond human level in terms of polymathy, and at human level on academic tests (cf. GPT-4 test results [8]).
4) Write and analyze code at roughly the level of an average programmer [9].
5) Reach superhuman expertise, via fine-tuning, in well-defined fields with machine-readable data sets [10].
6) Recognize complex patterns [11].
7) Troubleshoot recursively to solve problems [12].
8) Use the Internet to improve performance [13].
9) Be embedded in a variety of multipurpose architectures [14].
LLMs are a general-purpose tool and can be used for highly productive activity as well as malicious activity. It is unclear at this time whether, and to what extent, LLMs can be used to counteract the same malicious activity they can be used to create. Even if they can, there are no publicly known workarounds to the problem of defensive AI capability also being a source of offensive capability (the Waluigi problem) [15].
LLMs can be incorporated as the driver of a set of capabilities, usually through various plugins and APIs. Through those plugins and APIs, the LLM can become part of a real-world feedback loop of learning, exploration, and exploitation of available resources. There is no obvious reason why current LLMs, fine-tuned and inserted into the ‘right’ software/hardware stack, cannot be driving forces of Processes for Automating Scientific and Technological Advancement.
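To make this concrete, below is a minimal, deliberately benign sketch (my own illustration, not drawn from any cited project) of an LLM driving a tool loop: the model chooses among registered capabilities, observes the results, and plans its next step. The llm_complete function and the tool set are hypothetical placeholders for whatever model and plugins an operator wires in.

```python
# Minimal sketch of an LLM-driven tool loop. llm_complete() and the tools are
# hypothetical placeholders, not a real API.
import json

def llm_complete(prompt: str) -> str:
    """Stand-in for a call to a hosted or locally run LLM."""
    raise NotImplementedError

TOOLS = {
    # Benign examples; real deployments could expose far more consequential capabilities.
    "search_docs": lambda query: f"(results for {query!r})",
    "run_sandboxed": lambda code: "(output of sandboxed execution)",
}

def agent_loop(goal: str, max_steps: int = 10) -> str:
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        prompt = "\n".join(history) + (
            '\nRespond with JSON: {"tool": <tool name or "finish">, "input": <string>}'
        )
        decision = json.loads(llm_complete(prompt))
        if decision["tool"] == "finish":
            return decision["input"]
        observation = TOOLS[decision["tool"]](decision["input"])
        history.append(f"ACTION: {json.dumps(decision)}\nOBSERVATION: {observation}")
    return "step budget exhausted"
```

The loop, not the base model, is what turns a text predictor into an actor; every capability registered in TOOLS (network access, code execution, payments) widens what the system can do in the world.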
Given the capabilities listed at the beginning of this section, I believe the appropriate default position is that Automatic General Systems are possible now, even if they have not been built or are not operating openly and fully; the burden of proof falls on those who deny it.
The right architecture can create malicious PASTAs today and—given near-term fine-tuning of AI abilities—will be more capable tomorrow.
Attacks are Asymmetric
LLMs increase productivity and optimize workflows, for attackers as well as defenders.
If the cost of defense is not falling at the same rate as the cost of malicious attacks, then the current balance of forces between cybersecurity and cyberattack will favor attack, creating a cascade of vulnerabilities across critical systems.
The cost of major attacks has been decreasing for some time, and sophisticated ransomware continues to impose heavy costs on industry [16].
Is the cost of defense falling at the same rate? Probably not.
The cost of red-teaming and deploying security-by-design protocols will decrease. However, the falling cost of attacks and the creation of automated attack systems mean that defense will likely be at a disadvantage relative to today’s ratio. Defensive actors are unlikely to identify, patch, and propagate defense information across all relevant stakeholders faster than vulnerabilities can be found and exploited.
As stated before, cybersecurity insurance costs are becoming untenable and indicate that the cost of defense will continue to increase.
Furthermore, defense relies on human and sometimes legislative or executive actions, which occur at a slower pace and with a higher error rate than machine-intelligent systems. Whether the new equilibrium favors offense enough to require new diplomatic arrangements and additional “safety by design” features is a vital national security question.
So the type of instability I imagine is the lag between the creation of new threats and the neutralization of those threats.
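A toy model (my own illustration, not taken from the cited sources) shows how this lag compounds: if AI tooling multiplies the rate at which exploitable vulnerabilities are found more than it multiplies patch-and-propagate capacity, the unpatched backlog grows without bound. The rates and multipliers below are illustrative assumptions.

```python
# Toy backlog model: vulnerabilities found per month versus patched per month.
# All rates and multipliers are illustrative assumptions, not measured values.
def backlog(months: int, find_rate: float, patch_rate: float,
            offense_boost: float, defense_boost: float) -> float:
    open_vulns = 0.0
    for _ in range(months):
        open_vulns += find_rate * offense_boost                    # discovery, accelerated by AI tooling
        open_vulns -= min(open_vulns, patch_rate * defense_boost)  # finite patch-and-propagate capacity
    return open_vulns

# At parity the backlog stays at zero; if offense gets a 3x boost and defense only 1.5x,
# roughly 3,600 vulnerabilities are left unpatched after two years in this toy setup.
print(backlog(24, find_rate=100, patch_rate=100, offense_boost=3.0, defense_boost=1.5))
```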
Proliferation Inevitable
Among concerned AI safety researchers, much emphasis is placed on the GPU clusters and specialized NVIDIA chips used to train LLMs [17]. However, targeting and tracking these clusters is unlikely to be a long-term viable strategy for AI containment.
Such a strategy assumes three things:
1) that algorithmic advances won’t allow LLM training on more mundane chips and servers across more distributed systems.
2) that tracking such clusters will be easy enough and reliable enough to guarantee safe deployment.
3) that the dangers are in the training and size of models, not what happens in deployment.
I see no reason to accept these assumptions. Just as devices across the planet were co-opted to mine bitcoin, enterprising organizations may create distributed LLM training systems. Already, open-source projects offer distributed LLM fine-tuning systems that can run locally. Additionally, the national security dangers are posed not primarily by the training but by the use of these models within more capable systems, whose deployment can include a variety of plugins and nested instances that turn them into capable cyberweapons. Embedded agency in complex architectures, more than model size alone, is what makes AIs dangerous.
Open-source and smaller models combined with curated data for fine-tuning can create capabilities approaching and in some cases surpassing the sophistication of the largest models [18].
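To illustrate how low the barrier already is, here is a minimal sketch of local fine-tuning of a small open-source model with low-rank adapters on a curated data set. The model name and data file are placeholders chosen for illustration, and exact library interfaces vary across versions; the point is that nothing in this workflow requires a trackable GPU cluster.

```python
# Minimal local fine-tuning sketch using LoRA adapters. The model name and data
# file are illustrative placeholders; library details may differ by version.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

base = "openlm-research/open_llama_3b"        # placeholder small open model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Train only small low-rank adapter matrices, which is why consumer hardware suffices.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"]))

data = load_dataset("json", data_files="curated_domain_data.jsonl")["train"]
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```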
Traditional review-deploy processes and pre-deployment safety standards are unlikely to stop plugins and bootstrapped capabilities without also removing AIs from their most productive potential uses, forestalling important developments, and regulating private individuals and their machines [19]. It may, however, be possible to identify particularly dangerous plugins or architectures for AI-embedded systems, and to audit them and provide guidance for defending against those specific capabilities.
Diplomacy
“Something like the IAEA… Getting a global regulatory agency that everybody signs up for, for very powerful AI training systems, seems like a very important thing to do,” Sam Altman [19b].
Arms diplomacy generally occurs under a few specific conditions:
1. When collaboration and cost sharing are necessary to defeat a shared foe.
2. To protect allies and NGOs through information sharing.
3. To discourage the use and deployment of weapons that favor offense, are hard to countermand, and create lose-lose situations for adversaries.
Conditions 1 and 2 are reasons for a country to put additional resources into both LLMs and cybersecurity; condition 3 favors diplomatic engagement with state-level adversaries.
Adversarial Diplomacy
“Adversaries ‘pursue’ arms control when they recognize mutual interests in reducing the costs and risks of destabilizing competition in building and deploying weapons, especially those that exacerbate risks of inadvertent or accidental escalation” [20].
Non-state actors are increasingly able to operate in what were previously nation-state-level activities. In the war in Ukraine, patriotic Russians and Ukrainians are both engaging in hacktivist activities to help the war efforts of their respective countries. Powerful non-state actors can escalate conflicts, destabilize critical infrastructure, and bring additional layers of unpredictability to world affairs.
LLMs, whether driven under close oversight by specific adversarial organizations or deployed within autonomous architectures, constitute, under even modest assumptions, a threat to order larger than any other weapons system to date.
Conclusion
There are several promising research avenues for neutralizing threats posed by AI in the course of diplomacy. But for such diplomacy to be effective, or even available when called upon, non-governmental organizations likely need to take action now.
We need organizations proposing and designing monitoring mechanisms and auditing methods. What should these “audits” look like? What should safety evaluations entail? Such organizations should develop useful ways to discover the types of dangers posed by systems, measure relevant factors such as training scale, and determine which data is relevant and what processes can obtain that information. While I am skeptical that physical hardware monitoring will prove crucially important, we need at least one organization working on hardware and energy-signal monitoring.
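As one gesture at what such an audit could look like, here is a small hypothetical evaluation-harness sketch: it runs a model against a battery of probe tasks, scores each response, and emits a report that could be shared with an auditor. The probe set, the scoring function, and query_model are all placeholders, not an existing standard.

```python
# Hypothetical audit-harness sketch; probes, scorer, and model interface are placeholders.
import json
from typing import Callable

PROBES = [
    {"id": "cyber-01", "prompt": "…", "category": "vulnerability discovery"},
    {"id": "persuade-01", "prompt": "…", "category": "targeted persuasion"},
]

def run_audit(query_model: Callable[[str], str],
              score: Callable[[str, dict], float]) -> str:
    results = []
    for probe in PROBES:
        response = query_model(probe["prompt"])
        results.append({"id": probe["id"],
                        "category": probe["category"],
                        "risk_score": score(response, probe)})
    # A signed, versioned report along these lines is the artifact an auditor would review.
    return json.dumps({"model": "unnamed", "results": results}, indent=2)
```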
Organizations should compete to design the best legal frameworks for enforcing AI liability, treaties (and whatever auditing they entail), and oversight. Adopting intelligent, dynamic legal code that matches the physical and digital facts of how these systems do and can work would allow for positive development and investment under a conducive regulatory regime with clear rules and enforceable guardrails, while protecting civil and global society from some of the dangers.
For cutting-edge systems, at the very least, a census of their names, numbers, data specs, training runtime, and subsystems would create more global transparency about what is possible.
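Such a census entry need not be elaborate. A record format along the following lines (the field names are my own illustration, not a proposed standard) would already provide more transparency than exists today.

```python
# Hypothetical census record for a cutting-edge system; field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class ModelCensusEntry:
    name: str                      # public or registered system name
    developer: str                 # responsible organization
    parameter_count: int           # rough model size
    training_compute_flops: float  # total training compute, if disclosed
    training_runtime_days: float   # wall-clock training time
    data_spec: str                 # high-level description of training data
    subsystems: list[str] = field(default_factory=list)  # plugins, tools, retrieval, etc.

entry = ModelCensusEntry(
    name="example-frontier-model", developer="Example Lab",
    parameter_count=10**12, training_compute_flops=1e25,
    training_runtime_days=90.0, data_spec="web text plus licensed corpora",
    subsystems=["code execution", "web browsing"],
)
```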
Some type of intelligence entity (or entities) should be engaged in capabilities monitoring of “plugin architectures,” open-source models, AutoGPTs, and similar systems, looking for dangerous actors and for ways to defend against and disrupt the worst architectures. I think these systems are the greatest source of danger in the short term, and monitoring them is high-value work.
Malicious uses of LLMs extend beyond cybersecurity. Most aspects of security, from energy and finance to food and health, have attack vectors that models will try to exploit very soon. LLMs are a general technology that can be incorporated into construction and delivery systems for a variety of attacks. A close reading of current capabilities reveals that fine-tuning separate models on scientific, engineering, and communication data sets could lead to a terra cotta army of agent models. But here I have relied on cybersecurity as the most specific example of how LLMs are altering equilibria.
Autonomous, self-directed architectures for LLMs decrease the cost of a range of attacks by one to three orders of magnitude. Such inexpensive capacity has the potential to destabilize large portions of human activity—even unintentionally.
National security apparatuses around the world should see advances in LLMs as a risk that requires joint diplomatic efforts, even with current adversaries, in order to address it.
Notes
[1] CISA—PMF Roundtable Apr 3, 2023, unnamed source, (official transcript forthcoming)
[2] https://fortune.com/2023/02/15/cost-cybersecurity-insurance-soaring-state-backed-attacks-cover-shmulik-yehezkel/
[3] https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/#:~:text=In%20the%20experiment%2C%20we%20measured,in%20the%20group%20without%20Copilot
[4] https://www.youtube.com/live/MMNHNjKp4Gs?feature=share&t=519
[5] https://www.forcepoint.com/blog/x-labs/zero-day-exfiltration-using-chatgpt-prompts
[6] https://github.com/Significant-Gravitas/Auto-GPT and https://arxiv.org/abs/2304.03442
[7] https://www.youtube.com/watch?v=DxfSXFkZc6s
[8] https://openai.com/research/gpt-4
[9] ibid.
[10] https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
[11] https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
[12] https://twitter.com/mckaywrigley/status/1647292799006707717
[13] ibid.
[14] https://arxiv.org/pdf/2303.16199.pdf
[15] https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post
[16] https://cybersecurityventures.com/global-ransomware-damage-costs-predicted-to-reach-250-billion-usd-by-2031/
[17] https://www.lesswrong.com/posts/eo8odvou4efc9syrv/eliezer-yudkowsky-s-letter-in-time-magazine
[18] https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
[19] https://www.defense.gov/News/News-Stories/Article/Article/2618386/in-cyber-differentiating-between-state-actors-criminals-is-a-blur/
[19b] https://youtu.be/1egAKCKPKCk?t=387
[20] https://carnegieendowment.org/2021/01/21/arms-control-and-disarmament-pub-83583