I strongly disagree, habryka, on the basis that I believe LLMs are already providing some uplift for highly harmful offense-dominant technology (e.g. bioweapons). I think this effect worsens the closer you get to full AGI. The inference cost to do this, even with a large model, is trivial. You just need to extract the recipe.
This gives a weak state actor (or a wealthy non-state actor) with a high willingness to take provocative actions the ability to gain great power from even temporary access to a small amount of inference on a powerful model. Once they have the weapon recipe, they no longer need the model.
I’m also not sure about tlevin’s argument about a ‘right to know’. I think the State has a responsibility to protect its citizens, so I certainly agree the State should be closely monitoring all the AI companies within its purview. On the other hand, making details of AI progress publicly known may increase international tensions or the risk of theft or terrorism. I suspect it’s better for the State to have inspectors and security personnel permanently posted in the AI labs, but for the exact status of AI progress to be classified.
I think the costs of biorisks are vastly smaller than AGI-extinction risk, and so they don’t really factor into my calculations here. Having intermediate harms before AGI seems somewhat good, since it seems more likely to cause rallying around stopping AGI development, though I feel pretty confused about the secondary effects here (but am pretty confident the primary effects are relatively unimportant).
I think that doesn’t really make sense, since the lowest-hanging fruit for disempowering humanity routes through self-replicating weapons. Bioweapons are the currently available technology in that category, and I think they would be the most likely attack vector for a rogue AGI seeking rapid coercive disempowerment.
Plus, having bad actors (human or AGI) gain access to a tech for which we currently have no practical defense, and which could wipe out nearly all of humanity for under $100k… seems bad? Just a really unstable situation to be in?
I do agree that it seems unlikely that some terrorist org is going to launch a civilization-ending bioweapon attack within the remaining 36 months or so until AGI (or maybe even ASI). But I do think that manipulating a terrorist org into doing this, and giving them the recipe and supplies to do so, would be a potentially tempting tactic for a hostile AGI.
I think if AI kills us all it would be because the AI wants to kill us all. It is (in my model of the world) very unlikely to happen because someone misuses AI systems.
I agree that bioweapons might be part of that, but the difficult part of actually killing everyone via bioweapons is that it requires extensive planning and deployment strategies, which humans won’t want to execute (since they don’t want to die). So if bioweapons are involved in all of us dying, it will very likely be because an AI saw using them as an opportunity to take over, and I think that is unlikely to happen because someone runs some leaked weights on a small amount of compute (or rather, it would happen years after the same AIs had done the same thing when run on the world’s largest computing clusters).
In general, for any story of “dumb AI kills everyone” you need a story for why a smart AI hasn’t killed us first.
> I think if AI kills us all it would be because the AI wants to kill us all. It is (in my model of the world) very unlikely to happen because someone misuses AI systems.
I agree that the danger seems more likely to come from AI systems misusing humans than from humans misusing AI systems.
What I don’t agree with is jumping forward in time to thinking about the point when there is an AI so powerful it can kill us all at its whim. In my framework, that isn’t a useful time to be thinking about; by then it’s too late for us to change the outcome.
The key time to focus on is the time before the AI is so powerful that it can wipe out all of humanity and there is nothing we can do to stop it.
My expectation is that this period could last months or even several years: a window in which there is an AI powerful enough and agentic enough to make a dangerous-but-stoppable attempt to take over the world. That’s a critical moment for potential success, since the AI may be contained in such a way that the threat is objectively demonstrable to key decision makers. That would create a window of opportunity to make sweeping governance changes and further delay takeover. Such a delay could be super valuable if it gives alignment researchers more critical time to study the dangerously powerful AI.
Also, in the period between now and when AI is that powerful, tool AI makes it easier and easier for humans to deploy civilization-destroying self-replicating weapons. Current AIs are already providing non-zero uplift (both lowering barriers to access and raising peak potential harms), and this is likely to keep getting rapidly worse over the next couple of years. Delaying AGI doesn’t much help with biorisk from tool AI, so a ‘delay AGI’ plan also needs to account for the rapidly increasing risk from offense-dominant tech.