Existential danger is very much related to weapons. Of course, AI could pose an existential threat without access to weapons. However, weapons provide the most dangerous vector of attack for a rogue, confused, or otherwise misanthropic AI. We should focus more on this immediate and concrete risk before the more abstract theories of alignment.
If an AI does something with weapons that its operators don’t want it to be doing, they will attempt to stop it. If they eventually succeed, then this doesn’t literally killeveryone, and the AI probably wasn’t the kind that can pose existential threat (even if it did cause a world-shaking disaster). If they can’t stop the AI, at all, even after trying for as long as they live, then it’s the kind of AI that would pose existential threat even without initially being handed access to weapons (if it wants weapons, it would be able to acquire them on its own). So the step of giving AI access to weapons is never a deciding factor for notkilleveryoneism, it’s only a deciding factor for preventing serious harm on a scale that’s smaller than that.
We should focus more on this immediate and concrete risk before the more abstract theories of alignment.
“Focus” suggests reallocation of a limited resource that becomes scarcer elsewhere as a result. I don’t think it would be a good thing to focus less than we currently do on making sure that the outcome is not literally everyone dying. It’s possible to reach a point where too much focus is on that, but I don’t think we are there.
Focus means spending time or energy on a task. Our time and energy are limited, and the danger of rogue AI is growing by the year. We should focus our energies by forming an achievable goal, making a reasonable plan, and acting according to that plan.
Of course, there is a spectrum to the possible outcomes caused by a hypothetical rogue AI (rAI), ranging from insignificant to catastrophic. Any access the rAI might gain to human-made intelligent weapons would amplify the rAI’s power to cause real-world damage.
Of course, there is a spectrum to the possible outcomes
The problem is that with AI, facing existential risk eventually is a certainty: the capability of unbounded autonomous consequentialist agency is feasible to develop (humans have that level of capability, and humans are manifestly feasible, so AIs would merely need to be at least as capable). Either there is a way of mitigating that risk, or it killseveryone. At which point, no second chances. This is different from world-shaking disasters, which do allow second chances and also motivate trying to do better next time.
So this specifically is a natural threat level to consider on its own, not just as one of the points on a scale. And it’s arguably plausible in the startlingly near future. And nobody has a reliable plan (or arguably any plan), including the people building the technology right now.
Yes, in the long term we will need a complete alignment strategy, such as permanent integration with our brains. However, before that happens, it would be prudent to limit the potential for a misaligned AI to cause permanent damage.
And, yes, we are in need of a more concrete plan and commitment from the people involved in the tech, especially with regard to lethal AI.
I’m thinking one or two years in the future is a plausible lower bound on the time when a (technological) plan would need to be enacted to still have an effect on what happens eventually, or else in four years (from now) a killeveryone arrives (again, as an arguable lower bound, not as a median forecast).
Unless it’s fine by default, on its own, for reasons nobody reliably understands in advance, not because anyone had a plan. I think there is a good chance this is true, but betting the future of humanity on that is insane. Also, even if the first AGIs don’t killeveryone, they might fail to establish strong coordination that prevents other misaligned AGIs from getting built, which do killeveryone, including the first AGIs.
I think probably it’s more like 6 and 8 years, respectively, but that’s also not a lot of time to come up with a plan that depends on having fundamental science that’s not yet developed.
Best to slow down the development of AI in sensitive fields until we have a clearer understanding of its capabilities.
However, weapons provide the most dangerous vector of attack for a rogue, confused, or otherwise misanthropic AI.
I’m not sure why you think that. Human weapons, as horrific as they are, can only cause localized tragedies. Even if we gave the AI access to all of our nuclear weapons, and it fired them all, humanity would not be wiped out. Millions (possibly billions) would perish. Civilization would likely collapse or be set back by centuries. But human extinction? No. We’re tougher than that.
But an AI that competes with humanity, in the same way that Homo sapiens competed with Homo neanderthalensis? That could wipe out humanity. We wipe out other species all the time, and only in a small minority of cases is it because we’ve turned our weapons on them and hunted them into extinction. It’s far more common for a species to go extinct because humanity needed the habitat and other natural resources that the species depended on to survive, and outcompeted it for access to those resources.
Entities compete in various ways, yes. Competition is an attack on another entity’s chances of survival. Let’s define a weapon as any tool which could be used to mount an attack. Of course, every tool could be used as a weapon, in some sense. It’s a question of how much risk our tools would pose to us if they were used against us.
Let’s define a weapon as any tool which could be used to mount an attack.
Why? That broadens the definition of “weapon” to mean literally any tool, technology, or tactic by which one person or organization can gain an advantage over another. It’s far broader than and connotationally very different from the implied definition of “weapon” given by “building intelligent machines that are designed to kill people” and the examples of “suicide drones”, “assassin drones” and “robot dogs with mounted guns”.
Redefining “weapon” in this way turns your argument into a motte-and-bailey, where you’re redefining a word that connotes direct physical harm (e.g. robots armed with guns, bombs, knives, etc.) to mean any machine that can, on its own, gain some kind of resource advantage over humans. Most people would not, for example, consider a superior stock-trading algorithm to be a “weapon”, but under your (re)definition, it would be.
It is a broad definition, yes, for the purpose of discussing the potential for the tools in question to be used against humans.
My point is this: we should focus first on limiting the most potent vectors of attack: those which involve conventional ‘weapons’. Less potent vectors (those that are not commonly considered weapons), such as a ‘stock trading algorithm’, are of lower priority, since they offer more opportunities for detection and mitigation.
An algorithm that amasses wealth should eventually set off red flags (maybe banks need to improve their audits and identification requirements). Additionally, wealth is only useful when spent on a specific purpose. Those purposes could be countered by a government, if the government possesses sufficient ‘weapons’ to eliminate the offending machines.
If this algorithm takes actions so subtle that they cannot be detected in time to prevent catastrophe, then we are doomed. However, it is also likely that the algorithm will have weaknesses which allow it to be detected.
That’s exactly where I disagree. Conventional weapons aren’t all that potent compared to social, economic, or environmental changes.
Social, economic, or environmental changes happen relatively slowly, on the scale of months or years, compared to potent weapons, which can destroy whole cities in a single day. Therefore, conventional weapons would be a much more immediate danger if corrupted by an AI. The other problems are important to solve, yes, but first humanity must survive its more deadly creations. The field of cybersecurity will continue to evolve in the coming decades. Hopefully world militaries can keep up, so that no rogue intelligence gains control of these weapons.
To repeat what I said above: even a total launch of all the nuclear weapons in the world would not be sufficient to ensure human extinction. However, AI-driven social, economic, and environmental changes could ensure just that.
If an AI got hold of a few nuclear weapons and launched them, that would, in fact, probably be counterproductive from the AI’s perspective, because in the face of such a clear warning sign, humanity would probably unite and shut down AI research and unplug its GPU clusters.
Most actions by which actors increase their power aren’t directly related to weapons. Existential danger comes from one AGI actor getting more power than human actors.
Which kinds of power do you refer to? Most kinds of power require human cooperation. The danger that an AI tricks us into destroying ourselves is small (though a false detection of incoming nuclear weapons could do it). We need much more cooperation between world leaders, and a much more positive dialogue between them.
Yes, you need human cooperation, but human cooperation isn’t hard to get. You can easily pay people money to get them to do what you want.
With time, more processes can use robots instead of humans for physical work, and if the AGI already has all the economic and political power, there’s nothing to stop it from doing that.
The AGI might then repurpose land that’s currently used for growing food, step by step reducing the amount of food that’s available, and there never needs to be a point where any human thinks they are working toward the destruction of humanity.
More stringent (in-person) verification of bank account ownership could mitigate this risk.
Anyway, the chance of discovery for any covert operation is proportional to the size of the operation and the time it takes to execute. The more we pre-limit the tools available to a rogue machine to cause immediate harm, the more likely we are to catch it in the act.
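A rough way to make that proportionality concrete, as an illustrative toy model only (the constant-hazard form and the rate parameter λ are assumptions for illustration, not something specified in this exchange): if detection attempts arrive at a constant rate proportional to the operation’s size s, then over execution time t,
$$P(\text{detected before completion}) = 1 - e^{-\lambda s t} \;\approx\; \lambda s t \quad \text{when } \lambda s t \ll 1,$$
so in this sketch the chance of discovery grows with both the size of the operation and the time it takes, roughly linearly while both remain small.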
There’s no need for anything to be covert. NetDragon Websoft already has a chatbot as its CEO. That chatbot can get funds wired by giving orders to employees.
If the chatbot were a superintelligence, that would allow it to outcompete other companies.