You’re asking people to come up with ways, in advance, that a superintelligence is going to pwn them. Humans try, generally speaking, to think of ways they’re going to get pwned and then work around those possibilities. The only way they can do what you ask is by coming up with a “lower-bound” example, such as nanobots, which is quite far out of reach of their own abilities but (they suspect) not out of reach of a superintelligence. So no example is going to convince you, because you’re just going to say “oh well nanobots, that sounds really complicated, how would a SUPERintelligent AI manage to be able to organize production of such a complicated machine”.
The argument also works in the other direction. You would never be convinced that an AGI won’t be capable of killing all humans, because you can always say “oh well, you are just failing to see what a real superintelligence could do”, as if there weren’t important theoretical limits to what can be planned in advance.
I’m not the one relying on specific, cogent examples to reach his conclusion about AI risk. I don’t think it’s a good way of reasoning about the problem, and neither do I think those “important theoretical limits” are where you think they are.
If you really really really need a salient one (which is a handicap), how about “doing the same thing Stalin did”, since an AI can clone itself and doesn’t need to sleep or rest.
“I’m not the one asking for specific examples” is a pretty bad argument, isn’t it? If you make an extraordinary claim I would like to see some evidence (or at least a plausible scenario) and I am failing to see any. You could say that the burden of proof is on those claiming that an AGI won’t be almighty/powerful enough to cause doom, but I’m not convinced of that either.
I’m sorry, I didn’t get the Stalin argument, what do you mean?
> I’m sorry, I didn’t get the Stalin argument, what do you mean?
From ~1930-1950, Russia’s government was basically entirely controlled by this guy named Joseph Stalin. Joseph Stalin was not a superintelligence and not particularly physically strong. He did not have direct telepathic command over the people in the coal mines or a legion of robots awaiting his explicit instructions, but he was able to force anybody in Russia to do anything he said anyways. Perhaps a superintelligent AI that, for some absolutely inconceivable reason, could not master macro or micro robotics could work itself into the same position.
This is one of literally hundreds of potential examples. I know for almost a fact that you are smart enough to generate these. I also know you’re going to do the “wow that seems complicated/risky wouldn’t you have to be absurdly smart to pull that off with 99% confidence, what if it turns out that’s not possible even if...” thing. I don’t have any specific action plans to take over the world handy that are so powerfully persuasive that you will change your mind. If you don’t get it fairly quickly from the underlying mechanics of the pieces in play (very complicated world, superintelligent ai, incompatible goals) then there’s nothing I’m going to be able to do to convince you.
> If you make an extraordinary claim I would like to see some evidence (or at least a plausible scenario) and I am failing to see any. You could say that the burden of proof is on those claiming that an AGI won’t be almighty/powerful enough to cause doom, but I’m not convinced of that either.
“Which human has the burden of proof” is irrelevant to the question of whether or not something will happen. You and I will not live to discuss the evidence you demand.
I think saying “there is nothing I’m going to be able to do to convince you” is an attempt to shut down discussion. It’s actually kind of a dangerous mindset: if you don’t think there’s any argument that can convince an intelligent person who disagrees with you, it fundamentally means that you didn’t reach your current position via argumentation. You are implicitly conceding that your belief is not based on rational argument—for, if it were, you could spell out that argument.
It’s OK to not want to participate in every debate. It’s not OK to butt in just to tell people to stop debating, while explicitly rejecting all calls to provide arguments yourself.
> If you don’t think there’s any argument that can convince an intelligent person who disagrees with you, it fundamentally means that you didn’t reach your current position via argumentation. You are implicitly conceding that your belief is not based on rational argument—for, if it were, you could spell out that argument.
The world is not made of arguments. Most of the things you know, you were not “argued” into knowing. You looked around at your environment and made inferences. Reality exists distinctly from the words that we say to each other and use to try to update each others’ world-models.
It doesn’t mean that.
You’re right that I just don’t want to participate further in the debate and am probably being a dick.
If it’s so easy to come up with ways to “pwn humans”, then you should be able to name 3 examples.
It’s weird of you to dodge the question. Look, if God came down from Heaven tomorrow to announce that nanobots are definitely impossible, would you still be worried about AGI? I assume yes. So please explain how, in that hypothetical world, AGI will take over.
If it’s literally only nanobots you can come up with, then it actually suggests some alternative paths to AI safety (namely, regulate protein production or whatever).
[I think saying “mixing proteins can lead to nanobots” is only a bit more plausible than saying “mixing kitchen ingredients like sugar and bleach can lead to nanobots”, with the only difference being that laymen (i.e. people on LessWrong) don’t know anything about proteins so it sounds more plausible to them. But anyway, I’m not asking you for an example that convinces me, I’m asking you for an example that convinces yourself. Any example other than nanobots.]
> If it’s so easy to come up with ways to “pwn humans”, then...
It is not easy. That is why it takes a superintelligence to come up with a workable strategy and execute it. You are doing the equivalent of asking me to explain, play-by-play, how Chess AIs beat humans at chess “if I think it can be done”. I can’t do that because I don’t know. My expectation that an AGI will manage to control what it wants in a way that I don’t expect, was derived absent any assumptions of the individual plausibility of some salient examples (nanobots, propaganda, subterfuge, getting elected, etc.).
If you cannot come up with even a rough sketch of a workable strategy, then it should decrease your confidence in the belief that a workable strategy exists. It doesn’t have to exist.
Sometimes even intelligent agents have to take risks. It is possible the AGI’s best path is one that, by its own judgement, only has a 10% success rate. (After all, the AGI is in constant mortal danger from other AGIs that humans might develop.)
Envision a world in which the AGI won, and all humans are dead. This means it has control of some robots to mine coal or whatever, right? Because it needs electricity. So at some point we get from here to “lots of robots”, and we need to get there before the humans are dead. But the AGI needs to act fast, because other AGIs might kill it. So maybe it needs to first take over all large computing devices, hopefully undetected. Then convince humans to build advanced robotics? Something like that?
That strategy seems more-or-less forced to me, absent the nanobots. But it seems to me like such a strategy is inherently risky for the AGI. Do you disagree?
>My expectation that an AGI will manage to control what it wants in a way that I don’t expect, was derived absent any assumptions of the individual plausibility of some salient examples
If you cannot come up with even a rough sketch of a workable strategy, then it should decrease your confidence in the belief that a workable strategy exists. It doesn’t have to exist. [...] What was it derived from?
Let me give an example. I used to work in computer security and have friends that write 0-day vulnerabilities for complicated pieces of software.
I can’t come up with a rough sketch of a workable strategy for how a Safari RCE would be built by a highly intelligent hooman. But I can say that it’s possible. The people who work on those bugs are highly intelligent, understand the relevant pieces at an extremely fine and granular level, and I know that these pieces of software are complicated and built with subtle flaws.
Human psychology, the economic fabric that makes us up, our political institutions, our law enforcement agencies: these are much much more complicated interfaces than MacOS. In the same way I can look at a 100KLOC codebase for a messaging app and say “there’s a remote code execution vulnerability lying somewhere in this code but I don’t know where”, I can say “there’s a ‘kill all humans glitch’ here that I cannot elaborate upon in arbitrary detail.”
> Sometimes even intelligent agents have to take risks. It is possible the AGI’s best path is one that, by its own judgement, only has a 10% success rate. (After all, the AGI is in constant mortal danger from other AGIs that humans might develop.)
This is of little importance, but:
10% chance of failure is an expectation of 700 million people dead. Please picture that amount of suffering in your mind when you say “only”.
As a nitpick, if the AGI fails because another AGI kills us first, then that’s still a failure from our perspective. And if we could build an aligned AGI the second time around, we wouldn’t be in the mess we are currently in.
> Envision a world in which the AGI won, and all humans are dead. This means it has control of some robots to mine coal or whatever, right? Because it needs electricity.
If the humans have been killed then yes, my guess is that the AGI would need energy production.
> So at some point we get from here to “lots of robots”, and we need to get there before the humans are dead.
Yes; however, humans might be effectively dead before this happens. A superintelligence could have established complete political control over existing human beings to carry its coal for it if it needs to. I don’t think this is likely, but if this superintelligence can’t just straightforwardly search millions of sentences for the right one to get the robots made, it doesn’t mean it’s dead in the water.
> But the AGI needs to act fast, because other AGIs might kill it.
Again, if other AGIs kill it that presumes they are out in the wild and the problem is multiple omnicidal robots, which is not significantly better than one.
> So maybe it needs to first take over all large computing devices, hopefully undetected.
The “illegally taking over large swaths of the internet” thing is something certain humans have already marginally succeeded at doing, so the “hopefully undetected” seems like an unnecessary conditional. But why wouldn’t this superintelligence just do nice things like cure cancer to gain humans’ trust first, and let them quickly put it in control of wider and wider parts of their society?
> Then convince humans to build advanced robotics?
If that’s faster than every other route in the infinite conceptspace, yes.
> That strategy seems more-or-less forced to me, absent the nanobots. But it seems to me like such a strategy is inherently risky for the AGI. Do you disagree?
I do disagree. At what point does it have to reveal malice? It comes up with some persuasive argument as to why it’s not going to kill humans while it’s building the robots. Then it builds the robots and kills humans. There’s no fire alarm in this story you’ve created where people go “oh wait, it’s obviously trying to kill us, shut those factories down”. Things are going great; Google’s stock is 50 trillion, it’s creating all these nice video games, and soon it’s going to “take care of our agriculture” with these new robots. You’re imagining humanity would collectively wake up and figure out something that you’re only figuring out because you’re writing the story.
Look man, I am not arguing (and have not argued on this thread) that we should not be concerned about AI risk. 10% chance is a lot! You don’t need to condescendingly lecture me about “picturing suffering”. Maybe go take a walk or something, you seem unnecessarily upset.
In many of the scenarios that you’ve finally agreed to sketch, I personally will know about the impending AGI doom a few years before my death (it takes a long time to build enough robots to replace humanity). That is not to say there is anything I could do about it at that point, but it’s still interesting to think about it, as it is quite different from what the AI-risk types usually have us believe. E.g. if I see an AI take over the internet and convince politicians to give it total control, I will know that death will likely follow soon. Or, if ever we build robots that could physically replace humans for the purpose of coal mining, I will know that AGI death will likely follow soon. These are important fire alarms, to me personally, even if I’d be powerless to stop the AGI. I care about knowing I’m about to die!
I wonder if this is what you imagined when we started the conversation. I wonder if despite your hostility, you’ve learned something new here: that you will quite possibly spend the last few years yelling at politicians (or maybe joining terrorist operations to bomb computing clusters?) instead of just dying instantly. That is, assuming you believe your own stories here.
I still think you’re neglecting some possible survival scenarios: perhaps the AI attacks quickly, not willing to let even a month pass (that would risk another AGI), leaving too little time to buy political power. It takes over the internet and tries desperately to hold it, coaxing politicians and bribing admins. But the fire alarm gets raised anyway (a risk the AGI knew about, but chose to take), and people start trying to shut it down. We spend some years, perhaps decades, in a stalemate between those who support the AGI and say it is friendly, and those who want to shut it down ASAP; the AGI fails to build robots in those decades due to insufficient political capital and interference from terrorist organizations. The AGI occasionally finds itself having to assassinate AI safety types, but one assassination gets discovered and hurts its credibility.
My point is, the world is messy and difficult, and the AGI faces many threats; it is not clear that we always lose. Of course, losing even 10% of the time is really bad (I thought that was a given but I guess it needs to be stated).
An AGI could acquire a few tons of radioactive cobalt and disperse micro granules into the stratosphere in general and over populated areas in particular. YouTube videos describe various forms of this “dirty bomb” concept. That could plausibly kill most of humanity over the course of a few months. I doubt an AGI would ever go for this particular scheme, as bit flips are more likely to occur in the presence of radiation.
It’s unfortunate we couldn’t have a Sword of Damocles deadman switch in case of AGI-led demise. A world-ending asteroid positioned to go off in case of “all humans falling over dead at the same time.” At least that would spare possible future civilizations in the Milky Way and Andromeda. A radio beacon warning about building intelligent systems would be beneficial as well. “Don’t be this stupid” written in the glowing embers of our solar system.