You ask elsewhere for commenters to sit down and think for 5 minutes about why an AGI might fail. This seems beside the point, since averting human extinction doesn’t require averting one possible attack from an AGI. It involves averting every single one of them, because if even one succeeds, everyone dies.
In this it’s similar to human security: “why might a hacker fail” is not an interesting question to system designers, because the hacker gets as many attempts as he wants. For what those attempts might look like, I think other posts have provided some reasonable guesses.
I also note that there already exist (non-intelligent) distributed computer systems entirely beyond the ability of any motivated human individual, government or organization to shut down. I refer, of course, to cryptocurrencies, which have this property as an explicit goal of their design.
So. Imagine that an AGI distributes itself among human computer systems in the same way that Bitcoin mining software does today. Then it starts executing on someone’s list of doomsday ideas, probably in a way secretive enough to be deniable.
Who’s gonna shut it down? And what would such an action even look like?
(A possible suggestion is “everyone realizes it is in their best interest to coordinate on shutting down their computers so that the AGI lacks a substrate to run on”. To which I would suggest considering the last three years’ worth of response to an obvious, threatening, global enemy that is not even sentient and will not attempt to defend itself.)
Three things.
1. “averting human extinction doesn’t require averting one possible attack from an AGI. It involves averting every single one of them, because if even one succeeds, everyone dies.”
Why do you think that humans won’t retaliate? Why do you think that an AGI, knowing that humans will retaliate, will attack in the first place? Why do you think that this won’t give us a long enough time window to force the machine to work on specific plans?
2. “In this it’s similar to human security: ‘why might a hacker fail’ is not an interesting question to system designers, because the hacker gets as many attempts as he wants. For what those attempts might look like, I think other posts have provided some reasonable guesses.”
I guess that in human security you assume that the hacker might succeed at stealing your password, and you take countermeasures to avoid that. You don’t assume that the hacker will break into your house and eat your babies while you are sleeping. This might sound like a strange point, but hear me out for a second: if you start from that unrealistic frame, you might spend time not only protecting your computer but also building a 7-metre wall around your house and hiring a professional bodyguard team. Having false beliefs about the world has a cost. In this community, specifically, I see people falling into despair because doom is getting close, and failing to see potential solutions to the alignment problem because they have unrealistic expectations.
3. “Imagine that an AGI distributes itself among human computer systems in the same way that Bitcoin mining software does today.”
That is a possibility, and I lack the knowledge to evaluate the likelihood of such a scenario myself. Which leaves me more or less where I was before: maybe doing that is possible, maybe not. The little I know suggests that a model like that would be pretty heavy and not easily distributable across the internet.
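To put a rough number on “heavy”, here is a minimal back-of-envelope sketch; the parameter count, weight precision, and mining-client size are all assumptions chosen for illustration, not claims about any actual system.

```python
# Rough, assumption-laden estimate: how large would a big model's weights be,
# compared with a typical mining client? All numbers are illustrative guesses.

params = 100e9          # assumed parameter count for a hypothetical large model
bytes_per_param = 2     # assumed 16-bit (2-byte) weights

weights_gb = params * bytes_per_param / 1e9
print(f"Assumed model weights: ~{weights_gb:.0f} GB")  # ~200 GB

mining_client_mb = 30   # assumed order-of-magnitude size of a mining binary
ratio = weights_gb * 1000 / mining_client_mb
print(f"Ratio vs. a ~{mining_client_mb} MB mining client: ~{ratio:,.0f}x")
```

On those assumptions, the weights alone are a few orders of magnitude more data per host than a mining binary, which is the sense in which a model like that might be much harder to spread than mining software is.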
My response comes in two parts.
First part! Even if, by chance, we successfully detect and turn off the first AGI (say, Deepmind’s), that just means we’re “safe” until Facebook releases its new AGI. Without an alignment solution, this is a game we play more or less forever until either (A) we figure out alignment, (B) we die, or (C) we collectively, every nation, shutter all AI development forever. (C) seems deeply unlikely given the world’s demonstrated capabilities around collective action.
Second part:
I like Bitcoin as a proof-of-concept here, since it’s a technology that:
Imposes broadly distributed costs in the form of global warming and energy consumption, which everyone acknowledges.
Is greatly disliked by the powers that be for enabling various kinds of regulatory evasion; in fact, one authority (China) has actively taken steps to eradicate it from its society, which per reports has not been successful.
Is strictly worse at defending itself than an AGI would be, since Bitcoin is non-sentient and will not take any steps whatsoever to defend itself.
This is an existence proof that there are software architectures that, today, right now, cannot be eradicated despite a great deal of concerted societal effort going into exactly that. Presumably an AGI can just ape their successful characteristics in addition to anything else it does; hell, there’s no reason an AGI couldn’t just distribute itself as particularly profitable Bitcoin mining software.
After all, are people really going to turn off a computer making them hundreds of dollars per month just because a few unpopular weirdos are yelling about far-fetched doomsday scenarios around AGI takeover?
First part: it seems we agree! I just consider (A) more likely, because you are already in a world where you can use those AGIs to produce results. This is what a pivotal act would look like. EY et al. would argue that this is not going to happen, because the first machine will already kill you. What I am criticizing is the position in the community where it is taken for granted that AGI = doom.
Second part: I also like that scenario! I don’t consider it especially unlikely that an AGI would try to survive like that. But watch out: you can’t really derive from this that the machine will have the capacity to kill humanity, only that a machine might try to survive this way. If you want to continue with the Bitcoin analogy, nothing prevents me from forking the code to create Litecoin and tuning the utility function to make it work for me.
Beside the point? That is very convenient for people who don’t want to find out that they are wrong. Did you read what I am arguing against? I don’t think I said at any point that an AGI won’t be dangerous. Can you read the last paragraph of the article, please?
“If you think this is a simplistic or distorted version of what EY is saying, you are not paying attention. If you think that EY is merely saying that an AGI can kill a big fraction of humans in accident and so on but there will be survivors, you are not paying attention.”
Not sure why this functions as a rebuttal to anything I’m saying.
Sorry, it is true that I wasn’t clear enough and that I misread part of your comment. I would love to give you a properly detailed answer right now, but I need to go; I will come back to this later.