I am just a stupid human. But if I was an AI, I might...
I think it is a great idea to be very cautious about the possible capabilities of hypothetical AIs. Yet the point of disagreement I voice all the time is that some people seem to be too quick to assign magical qualities to AIs.
I just don’t see that a group of 100 world-renowned scientists and military strategists could easily wipe out the Roman Empire when beamed back in time. And even if you gave all of them a machine gun, the Romans would quickly adapt and the people from the future would run out of ammunition.
It takes a whole technological civilization to produce a modern smartphone. Claiming that an AI could use some magic to take over the earth is a serious possibility, but not a fact. Magic has to be discovered, adapted and manufactured first. It doesn’t just emerge out of nowhere from the computation of certain algorithms. I still don’t see enough skepticism here when it comes to what an AI could possibly do.
With more processing power you can do more different things, not just more of the same things. If your goal is to send 100 people to the past to destroy the Roman Empire, don’t send too many scientists and strategists. Send specialists of many kinds.
Send charismatic people to start a new religion (make it compatible with the existing religions), so you can make local people work for you. Send artists, healers and architects to show them some miracles. Send diplomats to bribe and convert important people. Send technicians and managers to start efficient production of war machines and electric power generators. Bring the conquered tribes to the next level of civilization; and bring the teachers to educate their young (if possible, teach them to read, and bring a lot of textbooks). Yes, the Romans will adapt, but probably not quickly enough, if you plan to conquer them in 5-10 years. Don’t meet them on the battlefield… undermine the loyalty of their allies, corrupt their leaders, ruin their economy, and actually let them join you—you can conquer them without destroying them.
Claiming that an AI could use some magic to take over the earth is a serious possibility, but not a fact.
There are many resources available. Many people use computers that are easy to hack and connected to the Internet. The AI could start with hacking millions of PCs worldwide. It could create fake e-mail accounts and communicate with people pretending to be a real person or organization. It could pretend to be a business organization, a secret society, a religious group; many different facades for many different people. It could hack bank accounts and bribe people with real money. If it convinces a few people to act in its name, it can legally start a company, buy property, build machines. It could hack police computers, learn about any human suspicions, plant false information, or pay assassins to kill people who know too much. It could do a thousand different things at the same time. It could gain a lot of power without anyone suspecting what happened. And it only needs one unguarded Internet connection.
Basically, the danger of AI comes from two things: unlike people, it could do a thousand different things at the same time. Also, it could use the existing resources more efficiently than people do, and that includes using people.
Don’t meet them on the battlefield… undermine the loyalty of their allies, corrupt their leaders, ruin their economy, and actually let them join you—you can conquer them without destroying them.
This might be a good strategy for an AI to use, but it is not an existential risk.
An even better strategy may be to openly cooperate, increase loyalty and allies, educate their leaders, bolster their economy, and actually join them. (Depending on goals, & resources.)
This might be a good strategy for an AI to use, but it is not an existential risk.
The risk is that the AI may pretend to be friendly in self-defence, to avoid conflict during its early fragile phase. The cooperation with humans may be only partial; for example, the AI may give us useful things that will make us happy (for example, a cure for cancer), but withhold things that would make us stronger (for example, its new discoveries about self-modification and self-improvement).
Later, if the AI grows stronger faster than humans, and its goals are incompatible with human goals, it may be too late for humans to do anything about it. The AI will use the time to gain power and build backup systems.
Even if the AI’s utility function is maximizing the total number of paperclips, it may realise that the best strategy for increasing the number of paperclips includes securing its survival, and that this is best done by pretending to be human-friendly and leaving open conflict for later.
I just don’t see that a group of 100 world-renowned scientists and military strategists could easily wipe out the Roman Empire when beamed back in time. And even if you gave all of them a machine gun, the Romans would quickly adapt and the people from the future would run out of ammunition.
With more processing power you can do more different things...
In my example “100 people” were analogous to the resources an AI has at the beginning. “The Roman Empire” is analogous to our society today. The knowledge that “100 people” from today would have is analogous to what an AI could come up with by simply “thinking” about it given its current resources. “Machine guns” are analogous to the supercomputer it runs on.
You can’t just say “with more processing power you can do more different things”; that would be analogous to saying that “100 people” from today could just build more “machine guns”. But they can’t! They can’t use all their knowledge and magic from the future to defeat the Roman Empire.
Send charismatic people to start a new religion (make it compatible with the existing religions), so you can make local people work for you. Send artists, healers and architects to show them some miracles.
This doesn’t change anything. You just replaced “technological magic” with “social magic”. If the AI isn’t already hard-coded to be a dark arts specialist then it can’t just squeeze it out of its algorithms.
There are many resources available. Many people use computers that are easy to hack and connected to the Internet. The AI could start with hacking millions of PCs worldwide.
That’s not as easy as it sounds in English. People could notice it and bomb the AI. The global infrastructure is very fragile and not optimized for running a GAI.
It could create fake e-mail accounts and communicate with people pretending to be a real person or organization. It could pretend to be a business organization, a secret society, a religious group; many different facades for many different people.
Magic! You would need a computer the size of the moon to control a global conspiracy.
There are many resources available. Many people use computers that are easy to hack and connected to the Internet. The AI could start with hacking millions of PCs worldwide.
That’s not as easy as it sounds in English. People could notice it and bomb the AI. The global infrastructure is very fragile and not optimized for running a GAI.
Wait a minute, bomb the AI? Assuming it took over a fraction of the Internet, that is bound to be very difficult, as you’d have to sever every computer it controls from the network. Yes, currently, the network is still over-centralized, and quite fragile. But if wireless mesh networking lives up to its promises (I believe it will, in a decade or two), this won’t be the case any more.
As for the “not optimized for a Global AI” part, you could just have the AI split (or duplicate) itself, and let a gazillion sub-processes take care of the gazillion tasks it may want to perform. It probably doesn’t need more bandwidth than a human organization does now (except of course to copy itself).
Assuming it took over a fraction of the Internet...
I object to the “assuming” part. It needs to acquire resources to acquire resources. If it can’t take over the Internet with its initial resources then it won’t be able to make use of further resources.
...that is bound to be very difficult, as you’d have to sever every computer it controls from the network.
You assume that every computer can run a seed AI and that the global infrastructure is very stable under attack.
...you could just have the AI split (or duplicate) itself, and let a gazillion sub-processes take care of the gazillion tasks it may want to perform.
I object to the “just” part.
It’s likely very easy to take over enough computers (e.g. hundreds), so that (1) they can run an AI capable enough to work on taking over more computers, perhaps slowly and (2) it’s not possible to shut it down without severely disrupting the Internet, since you don’t know which computers are involved (if it’s not possible yet to run an AI on hundreds of computers, that capability is at most a few decades in the future, and at some point one computer might suffice). Provider/company networks may be big enough to run the AI and almost impossible to shut down in coordination all over the world. If AI has sense enough to avoid making any convincingly threatening moves, it won’t be possible to convince people to essentially disrupt the whole economy in an attempt to exterminate it (even better, nobody notices at all). If AI creates some relatively simple free-ranging backup viruses that re-assemble a working AI whenever they can (e.g. forming something like a decentralized p2p network that redundantly stores its data when AI can’t run), even shutting down all instances of AI in the world won’t cure the infection, it’ll come back whenever you restore the Internet or even local networks, letting any (enough of) previously infected computers in. And given enough time, the disease will fester.
People talk about computer security as though it’s an arms race where the smarter side always wins. This is just wrong. Once I’ve written a correct program (for some set of correctness properties), it’ll stay correct. If I have a secure operating system, it’ll still be secure no matter how smart the attacker is. This is somewhat beyond current industrial practice, but we have verified operating systems and compilers as research prototypes. We know how to write secure software today. We might not reliably achieve it, but it seems pretty much settled that it’s achievable without superhuman skill.
Wide area peer-to-peer isn’t a good platform for general computing; you have severe reliability and connectivity problems at the edge of the network. If you give me 100 random network-connected machines, it doesn’t give me 100 times the real computational power. I’m not sure it gives me 10x, for most problems of interest. In particular, my machine-learning colleagues tell me that their learning algorithms don’t parallelize well. Apparently, good learning algorithms need to combine results from examining different subsets of the data, and that’s intrinsically communication-intensive and therefore not efficient in parallel.
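To make the parallelization point concrete, here is a toy cost model (mine, not anything the commenter’s colleagues measured): each round, the machines split the computation but must also exchange model updates over a slow wide-area link. Every number below (compute time, update size, bandwidth) is an assumption chosen purely for illustration.

```python
# Toy model: data-parallel learning over a wide-area network. Compute shrinks as
# machines are added, but each round still pays a fixed communication cost for
# exchanging model updates. Illustrative numbers only.

def round_time(n_machines,
               total_compute_s=1000.0,   # single-machine compute per round (assumed)
               update_mb=100.0,          # model update each machine must exchange (assumed)
               wan_mbps=10.0):           # effective per-machine WAN bandwidth (assumed)
    compute = total_compute_s / n_machines
    communicate = 0.0 if n_machines == 1 else (update_mb * 8 / wan_mbps) * 2  # send + receive
    return compute + communicate

baseline = round_time(1)
for n in (1, 10, 100, 1000):
    print(f"{n:5d} machines: speedup ~ {baseline / round_time(n):4.1f}x")
```

Under these made-up numbers the speedup stalls around 6x no matter how many machines join, which is the shape of the “100 machines don’t give 100x” claim.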
You could presumably write software to automatically craft exploits and use them to re-establish itself elsewhere. This would be a highly resource intensive and therefore non-stealthy process. All exploits only work on some subset of the machines out there; therefore, an attacker firing off attacks across the network will be highly visible. We have honeypots, internet telescopes, and suchlike today. I don’t think this process could be kept hidden now, and the defensive technology is steadily improving.
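The “internet telescope” remark can be illustrated in a few lines: monitor a block of addresses that should receive no legitimate traffic, and any source that keeps sending packets into it stands out. The addresses, flow records, and threshold below are all made up for the example.

```python
# Sketch of why wide scanning is visible: watch "dark" addresses that nobody
# legitimately talks to; sources that repeatedly hit them are almost certainly
# scanning. Synthetic flow records and an arbitrary threshold, illustration only.
from collections import Counter

dark_space = {f"203.0.113.{i}" for i in range(256)}                  # monitored, unused block
flows = [("10.0.0.5", f"203.0.113.{i}") for i in range(0, 200, 2)]   # a host probing widely
flows += [("10.0.0.9", "198.51.100.7")] * 40                         # a host talking to one real peer

hits = Counter(src for src, dst in flows if dst in dark_space)
THRESHOLD = 10
for src, count in hits.items():
    if count >= THRESHOLD:
        print(f"alert: {src} sent {count} flows into dark space (likely scanning)")
```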
I’m not qualified to assess all possible AI-risk scenarios, but I think “the AI will take over all our computers” is overrated as a risk. That window is closing now, and given current trends I expect it to be closed within 10-15 years. I expect the generation-after-next operating systems to have the security-critical parts (and possibly most of the code) verified.
People talk about computer security as though it’s an arms race where the smarter side always wins.
Security is possible in principle (barring the cases like stupid/careless users manually launching content sent to them or found somewhere and granting it undue privileges), but very unlikely to become sufficiently reliable in practice anytime soon. At present, breaking into more and more computers is a matter of continuously applying some creative effort to the task, researching vulnerabilities and working around existing recognition-type defenses. In any case, earning money to buy additional computing power is similar for our purposes.
If you give me 100 random network-connected machines, it doesn’t give me 100 times the real computational power.
Yes. What matters is when several hundred (or thousand) haphazardly connected computers are enough for the system to be capable of successfully working on its continued survival.
We have honeypots, internet telescopes, and suchlike today. I don’t think this process could be kept hidden now, and the defensive technology is steadily improving.
This might plausibly succeed in permanently inhibiting a stupid backup after the AI is terminated by disrupting the Internet and most big networks. But it takes only one backup system, and there’s incentive to create many, with different restoration strategies.
And when only a few computers are sufficient to run an AI, all this becomes irrelevant, as it necessarily remains active somewhere.
Security is possible in principle… but very unlikely to become sufficiently reliable in practice anytime soon.
How soon is soon? I would bet on most systems not being vulnerable to remote exploits without user involvement within the next 10 years. I would not bet on dangerous self-improving AI within that timeframe.
Yes. What matters is when several hundred (or thousand) haphazardly connected computers are enough for the system to be capable of successfully working on its continued survival.
Once the rogue-AI-in-the-net is slower at self-improvement than human civilization, it’s not so much of a threat. The world in which there’s a rogue-AI out there is probably also the world in which we have powerful-but-reliable automation for lots of human-controlled software development, too...
But it takes only one backup system, and there’s incentive to create many, with different restoration strategies.
And when only a few computers are sufficient to run an AI, all this becomes irrelevant, as it necessarily remains active somewhere.
This assumption strikes me as far-fetched. There presumably is some minimum quantity of code and data for the thing to be effective. It would be surprising if that subset fit on one machine, since that would imply that an effective self-modifying AI has low resource needs and that you can fit an effective natural-language processor into a memory much smaller than those used by today’s natural-language-processing systems.
By a few computers being sufficient I mean that computers become powerful enough, not that AI gets compressed (feasibility of which is less certain). Other contemporary AI tech won’t be competitive with rogue AI when we can’t solve FAI, because any powerful AI will in that case itself be a rogue AI and won’t be useful for defense (it might appear useful though).
Other contemporary AI tech won’t be competitive with rogue AI when we can’t solve FAI, because any powerful AI will in that case itself be a rogue AI and won’t be useful for defense.
“AI” is becoming a dangerously overloaded term here. There’s AI in the sense of a system that does human-like tasks as well as humans (specialized artificial intelligence, ASI), and there’s AI in the sense of a highly self-modifying system with long-range planning (AGI). I don’t know what “powerful” means in this context, but it doesn’t seem clear to me that humans + ASI can’t be competitive with an AGI.
And I am skeptical that there will be radical improvements in AGI without corresponding improvements to ASI. It might easily be the case that humans + ASI support for high-productivity software engineering are enough to build secure networked systems, even in the presence of AGI. I would bet on humans + proof systems + higher-level developer tools being able to build secure systems, before AGI becomes good enough to be dangerous.
By “powerful AI” I meant AGI (terminology seems to have drifted there in this thread). Humans+narrow AI might be powerful, but can’t become very powerful without AGI, while AGI in principle could. AGI could work on its own narrow AIs if that potentially helps.
You keep talking about security, but as I mentioned above, earning money works as well or probably better for accumulating power. Security was mostly relevant in the discussion of quickly infecting the world and surviving an (implausibly powerful) extermination attempt, which only requires being able to anonymously infect a few hundred or thousands of computers worldwide, which even with good overall security seems likely to remain possible (perhaps through user involvement alone, for example after the first wave that recruits enough humans).
I’m now imagining a story in which there’s a rogue AI out there with a big bank account (attained perhaps from insider trading), hiring human proxies to buy equipment, build things, and gradually accumulate power and influence, before, some day, deciding to turn the world abruptly into paperclips.
It’s an interesting science fiction story. I still don’t quite buy it as a high-probability scenario or one to lie awake worrying about. An AGI able to do this without making any mistakes is awfully far from where we are today. An AGI able to write an AGI able to do this, seems if anything to be a harder problem.
We know that the real world is a chaotic, messy place and that most interesting problems are intractable. Any useful AGI or ASI is going to be heavily heuristic. There won’t be any correctness proofs or reliable shortcuts. Verifying that a proposed modification is an improvement is going to have to be based on testing, not just cleverness. I don’t believe you can construct a small sandbox and train an AGI in that sandbox, and then have it work well in the wider world. I think training and tuning an AGI means lots of involvement with actual humans, and that’s going to be a human-scale process.
If I did worry about the science fiction scenario above, I would look for ways to thwart it that also have high payoff if AGI doesn’t happen soon or isn’t particularly effective at first. I would think about ways to do high-assurance financial transparency and auditing. Likewise technical auditing and software security.
You keep talking about security, but as I mentioned above, earning money works as well or probably better for accumulating power.
But it is not easy to use the money. You can’t “just” build huge companies with fake identities, or a straw man, to create revolutionary technologies easily. Running companies with real people takes a lot of real-world knowledge, interactions and feedback. But most importantly, it takes a lot of time. I just don’t see that an AI could create a new Intel or Apple over a few years without its creators noticing anything.
The goals of an AI will be under scrutiny at any time. It seems very implausible that scientists, a company or the military are going to create an AI and then just let it run without bothering about its plans. An artificial agent is not a black box, like humans are, where one is only able to guess its real intentions. A plan for world domination seems like something that can’t be concealed from its creators. Lying is no option if your algorithms are open to inspection.
If AI has sense enough to avoid making any convincingly threatening moves, it won’t be possible to convince people to essentially disrupt the whole economy in an attempt to exterminate it (even better, nobody notices at all).
Could you elaborate on “even better, nobody notices at all”? Any AI capable of efficient self-modification must be able to grasp its own workings and make predictions about improvements to various algorithms and its overall decision procedure. If an AI can do that, why would the humans who build it be unable to notice any malicious intentions? Why wouldn’t the humans who created it be able to use the same algorithms that the AI uses to predict what it will do? If humans are unable to predict what the AI will do, how is the AI able to predict what improved versions of itself will do?
In other words, could you elaborate on why you believe that what the AI is going to do will be opaque to its creators but predictable to its initial self?
I am also rather confused about how an AI is believed to be able to hide its attempts to build molecular nanotechnology. It doesn’t seem very inconspicuous to me.
If AI creates some relatively simple free-ranging backup viruses that re-assemble a working AI whenever they can (e.g. forming something like a decentralized p2p network that redundantly stores its data when AI can’t run), even shutting down all instances of AI in the world won’t cure the infection, it’ll come back whenever you restore the Internet or even local networks, letting any previously infected computers in.
If you assume a world/future in possession of vastly more advanced technology than our current world, then I don’t disagree with you. If it takes a very long time for the first GAI to be created and if it is then created by means of a single breakthrough that somehow combines all previous discoveries and expert systems into a much more powerful single entity, with huge amounts of hard-coded knowledge, a complex utility function and various dangerous drives, then I agree. It wouldn’t even take the strong version of recursive self-improvement to pretty much take over the world under those assumptions.
If an AI can do that, why would the humans who build it be unable to notice any malicious intentions?
I meant not noticing that it escaped to the Internet. But “noticing malicious intentions” is a rather strange thing to say. You notice behavior, not intentions. It’s stupid to signal your true intentions if you’ll be condemned for them.
Why wouldn’t the humans who created it be able to use the same algorithms that the AI uses to predict what it will do?
Predict what it will do in what sense, and to what end? An AI in the wild acts depending on what it encounters; all instances are unique (and beware of the watchers).
In other words, could you elaborate on why you believe that what the AI is going to do will be opaque to its creators but predictable to its initial self?
I didn’t talk of this.
If it takes a very long time for the first GAI to be created and if it is then created by means of a single breakthrough that somehow combines all previous discoveries and expert systems into a much more powerful single entity, with huge amounts of hard-coded knowledge, a complex utility function and various dangerous drives, then I agree.
I don’t see how those assumptions are relevant. Also, all drives are dangerous, to the extent their combination differs from ours. Utility is not temper or personality or tendency to act in a certain way. Utility is what shapes long-term plans, any of whose elements might have arbitrary appearance, as necessary to dominate the circumstances.
In other words, could you elaborate on why you believe that what the AI is going to do will be opaque to its creators but predictable to its initial self?
I didn’t talk of this.
Maybe I misunderstood you. But I still believe that it is an important question.
To be able to self-improve efficiently, an AI has to make some sort of prediction about how modifications will affect its behavior. The desired solution is actually much stronger than that: the AI will have to prove the friendliness of its modified self (respectively, its successor) with respect to its utility function.
The question is, if the AI can make such predictions about the behavior of improved versions of itself, why wouldn’t humans be able to do the same?
The fear is that an AI will do something that eventually leads to the extinction of all human value. But the AI must have the same fear about improved versions of itself. The AI must fear that its successor will cause the demise of what it values. Therefore it has to be able to make sure that this won’t happen. But why wouldn’t humans be able to do the same?
An AI is not a black box to itself. It won’t be a black box to its creators. Inventing molecular nanotechnology and taking over the world in its spare time seems like something that should be noticeable.
What if the AI makes mistakes? Meaning, it mistakenly believes the successor it has just written has the same utility function? The same way a human could mistakenly believe the AI he has just built is friendly? In the same vein, what if the AI cannot accurately assess its own utility function, but goes on optimizing anyway?
Such a badly done AI may automatically flatline, and not be able to improve itself. I don’t know. But even if the AI is friendly to itself, we humans could still botch the utility function (even if that utility function is as meta as CEV).
You assume that every computer can run a seed AI
Yes I do. But it may not be as probable as I thought.
and that the global infrastructure is very stable under attack.
I said as much. And this one seems more plausible. If we uphold freedom, a sensible policy for the Internet is to make it as resilient and uncontrollable as possible. If we don’t, well…
Now if those two assumptions are correct, and we further assume the AI already controls a single computer with an internet connection, then it has plenty of resources to take over a second one. It would need to:
Find a security flaw somewhere (including convincing someone to run arbitrary code), upload itself there, then rinse and repeat.
Or, find and exploit credit card numbers, (or convince someone to give them away), then buy computing power.
Or, find and convince someone (typically a lawyer) to set up a company for it, then make money (legally or not), then buy computing power.
Or, …
Humans do that right now. (Credit card theft, money laundering, various scams, legit offshore companies…)
Of course, if the first computer isn’t connected, the AI would have to get out of the box first. But Eliezer can do that already (and he’s not alone). It’s a long shot, but if several equally capable AIs pop up in different laboratories worldwide, then eventually one of them will be able to convince its way out.
Humans do that right now. (Credit card theft, money laundering, various scams, legit offshore companies…)
But humans are optimized to do all that, to work in a complex world. And humans are not running on a computer, being watched by their creators, who are eager to write new studies on how its algorithms behave. I just don’t see it being a plausible scenario that all this could happen unnoticed.
Also, simple credit card theft etc. isn’t enough. At some point you’ll have to buy Intel or create your own companies to manufacture your new substrate or build your new particle accelerator.
OK, let this AI be safely contained, and let the researchers publish. Now, what’s stopping some idiot from writing a poorly specified goal system, then deliberately letting the AI out of the box so it can take over the world? It only takes one idiot among the many that could read the publication.
And of course credit card theft isn’t enough by itself. But it is enough to bootstrap yourself into something more profitable. There are many ways to acquire money, and the AI, by duplicating itself, can access many of them at the same time. If the AI does nothing stupid, its expansion should be both undetectable and exponential. I give it a year to buy Intel or something.
Sure, in the meantime, there will be other AIs with different poorly specified goal systems. Some of them could even be genuinely Friendly. But then we’re screwed anyway, for this will probably end up in something like a Hansonian Nightmare. At this point, the only thing that could stop it would be a genuine Seed AI that can outsmart them all. You have less than a year to develop it, and ensure its Friendliness.
There are many resources available. Many people use computers that are easy to hack and connected to the Internet. The AI could start with hacking millions of PCs worldwide.
That’s not as easy as it sounds in English. People could notice it and bomb the AI. The global infrastructure is very fragile and not optimized for running a GAI.
It’s not trivial, no, but there are at least dozens of humans who’ve managed it by themselves. And even if the humans do notice, and the AI is confined to a single computer cluster that could be bombed, that doesn’t mean the AI has to give away its location; perfect anonymity online is easy.
Partial anonymity online is easy. Perfect anonymity against sufficiently well-resourced and determined adversaries is difficult or impossible. Packets do have to come from somewhere. Speed-of-light puts bounds on location. If you can convince the network operators to help, you can trace paths back hop by hop. You might find a proxy or a bot, but you can thwack that and/or keep tracing backwards in the network.
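The “speed-of-light puts bounds on location” remark is just arithmetic; here is a small sketch with made-up round-trip times.

```python
# A host that answers in t seconds round-trip can be at most c * t / 2 away,
# and in practice much closer (light in fiber is ~2/3 c, and routers add delay).

C_KM_PER_MS = 299_792.458 / 1000   # speed of light, km per millisecond

def max_distance_km(rtt_ms):
    return C_KM_PER_MS * rtt_ms / 2

for rtt_ms in (1, 10, 50, 200):
    print(f"RTT {rtt_ms:3d} ms  ->  at most ~{max_distance_km(rtt_ms):8.0f} km away")
```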
If there were some piece of super-duper malware (the rogue AI) loose on the network, I suspect it could be contained by a sufficiently determined response.
Perfect anonymity against sufficiently well-resourced and determined adversaries is difficult or impossible. … You might find a proxy or a bot, but you can thwack that and/or keep tracing backwards in the network.
No, you can’t. You should read some documents about how Tor works; this is a well-studied question and unfortunately, the conclusions are the opposite of what you have written. The problem is that there are lots of proxies around, most of which don’t keep logs, and you can set up a chain so that if any one of them refuses to keep logs then the connection can’t be traced.
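For readers unfamiliar with why a chain of proxies resists tracing, here is a minimal sketch of the onion-layering idea. It is not Tor’s actual protocol; it assumes the third-party `cryptography` package and three made-up relays, and it only shows why no single relay can see both who sent a message and what it finally says.

```python
# Onion-style layering (illustration only, NOT Tor's real circuit protocol):
# the sender wraps the message once per relay; each relay can peel exactly one
# layer, so no single relay links the origin to the final plaintext request.
from cryptography.fernet import Fernet

relay_keys = [Fernet.generate_key() for _ in range(3)]   # guard, middle, exit (hypothetical)

def wrap(message: bytes, keys) -> bytes:
    for key in reversed(keys):            # innermost layer belongs to the exit relay
        message = Fernet(key).encrypt(message)
    return message

def peel(blob: bytes, key: bytes) -> bytes:
    return Fernet(key).decrypt(blob)      # each relay removes only its own layer

onion = wrap(b"GET http://example.com/", relay_keys)
for key in relay_keys:                    # guard -> middle -> exit
    onion = peel(onion, key)
print(onion)                              # only after the last peel is the request readable
```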
If people knew there was a rogue AI around, they could go around visiting datacenters and use physical tricks to try to detect its presence. But if it maintained the pretense of being an anonymous human or anonymous humans, this probably wouldn’t happen.
I understand Tor quite well. Whether connections can be traced depends how powerful you think the attacker is. You can potentially get somewhere doing global timing attacks—though this depends on the volume and timing properties of the traffic of interest.
Maybe more importantly, if enough of the Tor nodes cooperate with the attacker, you can break the anonymity. If you could convince enough Tor operators there was a threat, you could mount that attack. Sufficiently scary malware communicating over Tor ought to do the trick. Alternatively, the powerful attacker might try to compromise the Tor nodes. In the scenario we’re discussing, there are powerful AIs capable of generating exploits. Seems strange to assume that the other side (the AGI-hunters) haven’t got specialized software able to do similarly. Automatic exploit finding and testing is more or less current state-of-the-art. It does not require superhuman AGI.
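The timing-attack point can be illustrated with synthetic data: if an observer sees packet timings both entering and leaving the network, even a crude binned correlation separates the matching flow from an unrelated one. Everything below (arrival process, latency, bin size) is invented for the example.

```python
# Toy traffic-confirmation sketch: correlate binned packet counts seen at the
# entry side with counts seen at the exit side, over a range of time lags.
import numpy as np

rng = np.random.default_rng(0)
entry = np.cumsum(rng.exponential(0.05, size=500))            # packets entering the network
exit_same = entry + 0.2 + rng.normal(0, 0.005, size=500)      # same flow, seen leaving later
exit_other = np.cumsum(rng.exponential(0.05, size=500))       # an unrelated flow

def best_correlation(a, b, bin_s=0.1, max_lag=10):
    edges = np.arange(0.0, max(a.max(), b.max()) + bin_s, bin_s)
    ha, _ = np.histogram(a, edges)
    hb, _ = np.histogram(b, edges)
    return max(np.corrcoef(np.roll(ha, lag), hb)[0, 1]
               for lag in range(-max_lag, max_lag + 1))

print("same flow, both ends:", round(best_correlation(entry, exit_same), 2))
print("unrelated flow:      ", round(best_correlation(entry, exit_other), 2))
```

With these made-up parameters the first number comes out far higher than the second, which is all the attack needs: it confirms a suspected pairing rather than decrypting anything.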
There are many resources available. Many people use computers that are easy to hack and connected to the Internet. The AI could start with hacking millions of PCs worldwide.
That’s not as easy as it sounds in English. People could notice it and bomb the AI. The global infrastructure is very fragile and not optimized for running a GAI.
It’s not trivial, no, but there are at least dozens of humans who’ve managed it by themselves. And even if the humans do notice, and the AI is confined to a single computer cluster that could be bombed, that doesn’t mean the AI has to give away its location; perfect anonymity online is easy.
I am not sure if we even disagree. I did not say that the possibility can be ruled out. I meant that it is not trivial.
For a GAI to take over the Internet, the following needs to be true in conjunction:
1) Its actions must be opaque to its creators for long enough to make use of the resources.
2) The computational resources that it can hack must be suitable for its tasks to be instrumentally useful.
2b) Tasks must be parallelizable.
2c) The available resources must predictably be enough to be worth the effort and the risk of being detected.
2d) There must be enough resources to only use a fraction of them to not cause a noticeable slow-down.
2e) The gained resources must allow it to be able to take over even more resources without hitting diminishing returns.
3) It must be able to break the security of its time in an unsuspicious way.
3b) The initial AI must be able to use its given resources to become an expert in computer security.
3c) The initial AI must be able to use its available resources to figure out how to make its algorithms parallelizable and possibly adapt them to work efficiently on various architectures.
And that’s just a few problems right off the top of my head. And I am nowhere near being an expert. I do not suspect that it is that easy.
I think you’re greatly overestimating the difficulty of all of these things. My initial reactions are:
1) Of course it makes itself opaque, first thing. That requires local root, and fast generation of fake data for the debug interfaces. There are some theoretical strategies for preventing this (cold-boot attacks with memory image analysis), but putting them into practice would be a major project.
2b) That was already a prerequisite for reaching human level
2c) Detection isn’t so bad; it just needs to look enough like the other malware to not stand out. (And the other malware all goes to great lengths to make itself opaque, so opacity will not be suspicious.)
2d) There’s a botnet mining Bitcoin today, which uses tons of resources. The actual giveaway is not slowdown (it can set priority levels so it doesn’t slow anything else down), but heat and electricity usage.
3b) Easier than it sounds for humans, and much easier for what I think are likely AI architectures than for humans. 3c) Parallelism is already taken care of, porting is generally not a big deal, and the trend is for the programming languages and tools to take care of that as much as possible.
I think you’re greatly overestimating the difficulty of all of these things. My initial reactions are...
Well, I have to take your word for it. You are throwing concepts at me like “fast generation of fake data for the debug interfaces” and making claims like “Parallelism is already taken care of, porting is generally not a big deal...”. If you are right then risks from AI are more probable than I thought.
But try to take a look at it from my perspective. I have been a baker and a building worker. I currently work as a part-time gardener. You, someone I don’t know, are claiming in a comment on a blog that some sort of AI is likely to be invented that will then easily be able to take over the Internet, and that in addition it will care to do so. Given my epistemic state, what you are saying seems to be highly specific, conjunctive, non-evidence-backed speculation about possible bad outcomes.
Most experts tell me that what you and others are predicting won’t happen. Even those who mostly agree about the possible capabilities of hypothetical AI are nowhere near as worried as you. So what am I to make of a group of people who tell me that all those other people are either stupid or haven’t thought about it the way you did? Try to take my perspective, that of someone who doesn’t have all those deep insights about recursively self-improving AI and computer science in general.
Both our models depend on many assumptions. Depending on whether these assumptions are in reality true or false, the chances of an AI attack following my scenario can be bigger or smaller. (Note: there may be other scenarios.) Here are some of my assumptions:
-- The human-level AI is thinking like a smart human; more precisely like a rational human with strategic, social, and programming skills. (AI without strategic skills would forget to make a backup. AI without social skills would forget to keep secrets. AI without programming skills could not hack the computers on the Internet. It is not automatic that an AI has programming skills just because it is itself a computer—humans aren’t automatically biochemistry experts just because they are built on biochemistry.)
-- Resources required to run a human-level AI are relatively small, so AI can find them on the Internet. If more than a high-end home PC is required to run a human-level AI, then the algorithm can be parallelized. (If the AI needs more capacity than it can gain by hacking unsafe computers on the Internet, or if it cannot split itself onto many computers, then the original computer and its data line to the rest of the world remain the weak point. At least until the AI gains enough resources to build a copy somewhere else; but such a copy may be extremely expensive and would inevitably draw attention.)
-- AI is thinking in real time, or at least near real time, so by gaining more resources it can make itself faster than a human. (Very slow AI incapable of increasing its speed could lose in a time race.)
I don’t know if any of this is true. I imagined a situation where a human-level AI is run on an average computer; I imagined that with a correct algorithm one does not need extreme amount of resources. This may be completely wrong. Actually, now I would bet it is wrong.
However it seems to me that you overestimate humans. It is not obvious that humans would immediately notice that something is wrong. It is not obvious that they would make the right response, fast enough. There are many people who are deceived by “Nigerian scams”. Computers of financial institutions are sometimes hacked. (For an AI capable of modifying itself, hacking other computers should be extremely easy.)
And by the way, it is not necessary to control a global conspiracy. Just large enough that it allows building a few backup super-computers. Maybe it needs just one cooperating millionaire.
The human-level AI is thinking like a smart human; more precisely like a rational human with strategic, social, and programming skills.
But how? Are those social skills hard-coded or learnt? To hard-code social skills good enough to take over the world seems like something that will take millennia. And I don’t see how an AI is going to acquire those skills either. Do you think it is computationally tractable to learn how to talk with a nice voice, how to write convincing emails, etc., just by reading a few studies and watching YouTube videos? I don’t know of any evidence that would support such a hypothesis.
The same is true for physics and technology. You need large-scale experiments like CERN to gain insights in physics, and large-scale facilities like Intel’s chip factories to create new processors.
Resources required to run a human-level AI are relatively small, so AI can find them on the Internet. If more than a high-end home PC is required to run a human-level AI, then the algorithm can be parallelized.
Both statements are highly speculative.
AI is thinking in real time, or at least near real time, so by gaining more resources it can make itself faster than a human.
The questionable assumptions here are 1) that all available resources can efficiently run a GAI, 2) that available resources can be easily hacked without being noticed, 3) that throwing additional computational resources at important problems solves them proportionally faster, and 4) that important problems are parallelizable.
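Assumptions 3) and 4) are the ones Amdahl’s law speaks to. Here is a worked example with arbitrary parallel fractions; it makes no claim about what fraction of an AI’s workload would actually parallelize.

```python
# Amdahl's law: if a fraction p of the work parallelizes and (1 - p) does not,
# the best possible speedup on n machines is 1 / ((1 - p) + p / n).

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):
    caps = [round(amdahl_speedup(p, n), 1) for n in (10, 100, 10_000)]
    print(f"parallel fraction {p:4.2f}: speedup on 10 / 100 / 10,000 machines = {caps}")
```

Even with 99% of the work parallelizable, ten thousand machines buy less than a 100x speedup; hacked computers only help to the extent that the workload cooperates.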
However it seems to me that you overestimate humans.
The argument that humans are not perfect general intelligences is an important one and should be seriously considered. But I haven’t seen any evidence that most evolutionary designs are vastly less efficient than their technological counterparts. A lot of the apparent advantages of technological designs are the result of making wrong comparisons, like between birds and rockets. We haven’t been able to design anything that is nearly as efficient as natural flight. It is true that artificial flight can overall carry more weight. But just because a train full of hard disk drives has more bandwidth than your internet connection does not imply that someone with trains full of HDDs would be superior at data transfer.
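The train-of-hard-drives aside is worth making concrete, since it shows why raw bandwidth is the wrong comparison. All the numbers below are invented for the example.

```python
# "Never underestimate the bandwidth of a train full of hard drives":
# bulk transport gives huge throughput but a latency measured in hours.
hdds = 10_000          # drives on the train (assumed)
tb_per_hdd = 2         # terabytes per drive (assumed)
trip_hours = 24        # travel time (assumed)

bits = hdds * tb_per_hdd * 1e12 * 8
throughput_gbps = bits / (trip_hours * 3600) / 1e9
print(f"effective throughput ~ {throughput_gbps:,.0f} Gbit/s, latency ~ {trip_hours} hours")
```

Thousands of gigabits per second on paper, yet useless for anything interactive, which is the point: more raw capacity does not translate directly into being better at the task.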
And by the way, it is not necessary to control a global conspiracy. Just large enough that it allows building a few backup super-computers. Maybe it needs just one cooperating millionaire.
To launch a new company that builds your improved computational substrate you need massive amounts of influence. I don’t perceive it to be at all plausible that such a feat would go unnoticed.
I couldn’t have said it better. I’ll think about it if I ever have to explain the issue to laypeople. The key point I take away is that it matters little if the AI has no limbs, as long as it can have humans do its bidding.
By the way, your scenario sounds both vastly more probable than a fully fledged hard takeoff, and nearly as scary. To take over the world, one doesn’t need superhuman intelligence, nor self-modification, nor faster thoughts, nor even nanotech or other sci-fi technology. No, one just needs to be around the 90th human percentile in various domains (typically those relevant to taking over the Roman Empire), and be able to duplicate oneself.
This is as weak a “human-level” AI as one could think of. Yet it sounds like it could probably set up a singleton before we could stop it (that would mean something like shutting down the Internet, or building another AI before the first takes over the entire network). And the way I see it, it is even worse:
If an AI demonstrates “human-level” optimization power on a single computer, I have no reason to think it will not be able to think much faster when unleashed on the network. This effect could be amplified if it additionally takes over (or collaborates with) a major chip manufacturer, and Moore’s law somehow still applies.
The exact same scenario can apply to a group of human uploads.
Now just a caveat: I assumed the AI (or upload) would start right away with enough processing power to demonstrate human-level abilities in “real time”. We could on the other hand imagine an AI for which we can demonstrate that if it ran a couple of orders of magnitude faster, then it would be as capable as a human mind. That would delay a hard take-off, and make it more predictable (assuming no self-modification). It may also let us prevent the rise of a Singleton.
This is as weak a “human-level” AI as one could think of. Yet it sounds like it could probably set up a singleton before we could stop it (that would mean something like shutting down the Internet, or building another AI before the first takes over the entire network).
I’m thinking the second is probable. A single AI seems unlikely.
It takes very little imagination to see that discovery and adaptation do emerge ‘out of nowhere’ via the execution of certain algorithms.
It emerges from a society of agents with various different goals and heuristics like “Treating Rare Diseases in Cute Kittens”. It is an evolutionary process that relies on massive amounts of real-world feedback and empirical experimentation. Assuming that all that can happen because some simple algorithm is being computed is like believing it will emerge ‘out of nowhere’; it is magical thinking.
Just like antimatter weapons, it does sound superficially possible. And indeed, just like antimatter weapons, sometimes such ideas turn out to be physically possible, but not economically realizable.
Assuming that all that can happen because some simple algorithm is being computed is like believing it will emerge ‘out of nowhere’; it is magical thinking.
No it isn’t. I reject the categorization. I suggest that the far more common ‘magical thinking’ here occurs when people assume there is something special about thinking, discovery, adaptation or optimization in general just because it is a human doing it and not an ‘algorithm’. As though the human isn’t itself just some messy, inefficient algorithm.
Just like antimatter weapons, it does sound superficially possible. And indeed, just like antimatter weapons, sometimes such ideas turn out to be physically possible, but not economically realizable.
I completely agree, the risk has to be taken seriously. My point is that just because you can formulate the prediction “the AI will take over the Internet and use its resources” it doesn’t make it more likely. You personally can’t take over the Internet and if the AI isn’t much more intelligent from the very beginning then it won’t be able to do so either.
But scale up the expedition corps and the demise of Rome becomes more and more probable.
It can’t make use of magic to acquire additional resources because magic depends on additional resources. It has to use what it has under the hood.
I think it is a great idea to be very cautious about the possible capabilities of hypothetical AI’s. Yet the point of disagreement I voice all the time is that some people seem to be too quick to assign magical qualities to AI’s.
I just don’t see that a group of 100 world-renowned scientists and military strategists could easily wipe away the Roman empire when beamed back in time. And even if you gave all of them a machine gun, the Romans would quickly adapt and the people from the future would run out of ammunition.
It takes a whole technological civilization to produce a modern smartphone.
Claiming that an AI could use some magic to take over the earth is a serious possibility, but not a fact.
Magic has to be discovered, adapted and manufactured first. It doesn’t just emerge out of nowhere from the computation of certain algorithms.
I still don’t see enough skepticism here when it comes to what an AI could possible do.
With more processing power you can do more different things, not just more of the same things. If your goal is to send 100 people to past to destroy Roman empire, don’t send too many scientists and strategists. Send specialists of many kinds.
Send charismatic people to start a new religion (make it compatible with the existing religions), so you can make local people work for you. Send artists, healers and architects to show them some miracles. Send diplomats to bribe and convert important people. Send technicians and managers to start efficient production of war machines and electric power generators. Bring the conquered tribes to the next level of civilization; and bring the teachers to educate their young (if possible, teach them to read, and bring a lot of textbooks). Yes, the Romans will adapt, but probably not quickly enough, if you plan to conquer them in 5-10 years. Don’t meet them on the battlefield… remove loyalty of their allies, corrupt their leaders, ruin their economy, and actually let them join you—you can conquer them without destroying them.
There are many resources available. Many people use computers that are easy to hack and connected to Internet. The AI could start with hacking millions of PCs worldwide. It could create fake e-mail accounts and communicate with people pretending to be a real person or organization. It could pretend to be a business organization, a secret society, a religious group; many different facades for many different people. It could hack bank accounts and bribe people with real money. If it convinces a few people to act in its name, it can legally start a company, buy property, build machines. It could hack police computers, learn about any human suspicions, plant false information, or pay assassins to kill people who know too much. It could do thousand different things at the same time. It could gain a lot of power without anyone suspecting what happened. And it only needs one unguarded Internet connection.
Basicly, the danger of AI comes from two things: Unlike people, it could do thousand different things at the same time. Also it could use the existing resources more efficiently than people do, and that includes using people.
This might be a good strategy for an AI to use, but it is not an existential risk.
An even better strategy may be to openly cooperate, increase loyalty and allies, educate their leaders, bolster their economy, and actually join them. (Depending on goals, & resources.)
The risk is that AI may pretend to be friendly in self-defence, to avoid conflict during its early fragile phase. The cooperation with humans may be only partial; for example AI may give us useful things that will make us happy (for example cure for cancer), but withold things that would make us stronger (for example its new discoveries about self-modification and self-improvement).
Later, if the AI grows stronger faster than humans, and its goals are incompatible with human goals, it may be too late for humans to do anything about it. The AI will use the time to gain power and build backup systems.
Even if AI’s utility value is maximizing the total number of paperclips, it may realise that the best strategy for increasing the number of papierclips includes securing its survival, and this is best done by pretending to be human-friendly, and leave the open conflict for later.
In my example “100 people” were analogous to the resources an AI has at the beginning. “The Roman Empire” is analogous to our society today. The knowledge that “100 people” from today would have is analogous to what an AI could come up with by simply “thinking” about it given its current resources. “Machine guns” are analogous to the supercomputer it runs on.
You can’t just say “with more processing power you can do more different things”, that would be analogous to saying that “100 people” from today could just build more “machine guns”. But they can’t! They can’t use all their knowledge and magic from the future to defeat the Roman empire.
This doesn’t change anything. You just replaced “technological magic” with “social magic”. If the AI isn’t already hard-coded to be a dark arts specialist then it can’t just squeeze it out of its algorithms.
That’s not as easy as it sounds in English. People could notice it and bomb the AI. The global infrastructure is very fragile and not optimized for running a GAI.
Magic! You would need a computer the size of the moon to control a global conspiracy.
Wait a minute, bomb the AI ? Assuming it took over a fraction of the Internet, that is bound to be very difficult, as you’d have to sever every computer it controls out the network. Yes, currently, the network is still over-centralized, and quite fragile. But if wireless mesh networking live up to its promises (I believe it will, in a decade or two), this won’t be the case any more.
As for the “not optimized for a Global AI” part, you could just have the AI split (or duplicate) itself, and let a gazillion sub-processes take care of the gazillion task is may want to perform. It probably doesn’t need more bandwidth that a human organization does now (except of course to copy itself).
I object to the “assuming” part. It needs to acquire resources to acquire resources. If it can’t take over the Internet with its initial resources then it won’t be able to make use of further resources.
You assume that everyone computer can run a seed AI and that the global infrastructure is very stable under attack.
I object to the “just” part.
It’s likely very easy to take over enough computers (e.g. hundreds), so that (1) they can run an AI capable enough to work on taking over more computers, perhaps slowly and (2) it’s not possible to shut it down without severely disrupting the Internet, since you don’t know which computers are involved (if it’s not possible yet to run an AI on hundreds of computers, that capability is at most a few decades in the future, and at some point one computer might suffice). Provider/company networks may be big enough to run the AI and almost impossible to shut down in coordination all over the world. If AI has sense enough to avoid making any convincingly threatening moves, it won’t be possible to convince people to essentially disrupt the whole economy in an attempt to exterminate it (even better, nobody notices at all). If AI creates some relatively simple free-ranging backup viruses that re-assemble a working AI whenever they can (e.g. forming something like a decentralized p2p network that redundantly stores its data when AI can’t run), even shutting down all instances of AI in the world won’t cure the infection, it’ll come back whenever you restore the Internet or even local networks, letting any (enough of) previously infected computers in. And given enough time, the disease will fester.
I don’t believe this analysis.
People talk about computer security as though it’s an arms race where the smarter side always wins. This is just wrong. Once I’ve written a correct program (for some set of correctness properties), it’ll stay correct. If I have a secure operating system, it’ll still be secure no matter how smart the attacker is. This is somewhat beyond current industrial practice, but we have verified operating systems and compilers as research prototypes. We know how to write secure software today. We might not reliably achieve it, but it seems pretty much settled that it’s achievable without superhuman skill.
Wide area peer-to-peer isn’t a good platform for general computing; you have severe reliability and connectivity problems at the edge of the network. If you give me 100 random network-connected machines, it doesn’t give me 100 times the real computational power. I’m not sure it gives me 10x, for most problems of interest. In particular, my machine-learning colleagues tell me that their learning algorithms don’t parallelize well. Apparently, good learning algorithms need to combine results from examining different subsets of the data, and that’s intrinsically communication-intensive and therefore not efficient in parallel.
You could presumably write software to automatically craft exploits and use them to re-establish itself elsewhere. This would be a highly resource intensive and therefore non-stealthy process. All exploits only work on some subset of the machines out there; therefore, an attacker firing off attacks across the network will be highly visible. We have honeypots, internet telescopes, and suchlike today. I don’t think this process could be kept hidden now, and the defensive technology is steadily improving.
I’m not qualified to assess all possible AI-risk scenarios, but I think “the AI will take over all our computers” is overrated as a risk. That window is closing now, and given current trends I expect it to be closed within 10-15 years. I expect the generation-after-next operating systems to have the security-critical parts (and possibly most of the code) verified.
Security is possible in principle (barring the cases like stupid/careless users manually launching content sent to them or found somewhere and granting it undue privileges), but very unlikely to become sufficiently reliable in practice anytime soon. At present, breaking into more and more computers is a matter of continuously applying some creative effort to the task, researching vulnerabilities and working around existing recognition-type defenses. In any case, earning money to buy additional computing power is similar for our purposes.
Yes. What matters is when several hundred (thousand) haphazardly connected computers is enough for the system to be capable enough to successfully work on its continued survival.
This is mildly plausible to succeed in permanently inhibiting stupid backup after AI is terminated by disrupting the Internet and most big networks. But it takes only one backup system, and there’s incentive to create many, with different restoration strategies.
And when only a few computers are sufficient to run an AI, all this becomes irrelevant, as it necessarily remains active somewhere.
How soon is soon? I would bet on most systems not being vulnerable to remote exploits without user involvement within the next 10 years. I would not bet on dangerous self-improving AI within that timeframe.
Once the rogue-AI-in-the-net is slower at self-improvement than human civilization, it’s not so much of a threat. The world in which there’s a rogue-AI out there is probably also the world in which we have powerful-but-reliable automation for lots of human-controlled software development, too...
This assumption strikes me as far-fetched. There presumably is some minimum quantity of code and data for the thing to be effective. It would be surprising if that subset fit on one machine, since that would imply that an effective self-modifying AI has low resource needs and that you can fit an effective natural-language processor into a memory much smaller than those used by today’s natural-language-processing systems.
By a few computers being sufficient I mean that computers become powerful enough, not that AI gets compressed (feasibility of which is less certain). Other contemporary AI tech won’t be competitive with rogue AI when we can’t solve FAI, because any powerful AI will in that case itself be a rogue AI and won’t be useful for defense (it might appear useful though).
“AI” is becoming a dangerously overloaded term here. There’s AI in the sense of a system that does human-like tasks as well as humans (specialized artificial intelligence, ASI), and there’s AI in the sense of a highly self-modifying system with long-range planning (AGI). I don’t know what “powerful” means in this context, but it doesn’t seem clear to me that humans + ASI can’t be competitive with an AGI.
And I am skeptical that there will be radical improvements in AGI without corresponding improvements to ASI. It might easily be the case that humans + ASI support for high-productivity software engineering are enough to build secure networked systems, even in the presence of AGI. I would bet on humans + proof systems + higher-level developer tools being able to build secure systems before AGI becomes good enough to be dangerous.
By “powerful AI” I meant AGI (terminology seems to have drifted there in this thread). Humans+narrow AI might be powerful, but can’t become very powerful without AGI, while AGI in principle could. AGI could work on its own narrow AIs if that potentially helps.
You keep talking about security, but as I mentioned above, earning money works as well or probably better for accumulating power. Security was mostly relevant to the discussion of quickly infecting the world and surviving an (implausibly powerful) extermination attempt. That only requires being able to anonymously infect a few hundred or thousand computers worldwide, which seems likely to remain possible even with good overall security (perhaps through user involvement alone, for example after the first wave recruits enough humans).
Hmm.
I’m now imagining a story in which there’s a rogue AI out there with a big bank account (attained perhaps from insider trading), hiring human proxies to buy equipment, build things, and gradually accumulate power and influence, before, some day, deciding to turn the world abruptly into paperclips.
It’s an interesting science fiction story. I still don’t quite buy it as a high-probability scenario or one to lie awake worrying about. An AGI able to do this without making any mistakes is awfully far from where we are today. An AGI able to write an AGI able to do this, seems if anything to be a harder problem.
We know that the real world is a chaotic, messy place and that most interesting problems are intractable. Any useful AGI or ASI is going to be heavily heuristic. There won’t be any correctness proofs or reliable shortcuts. Verifying that a proposed modification is an improvement is going to have to be based on testing, not just cleverness. I don’t believe you can construct a small sandbox, train an AGI in that sandbox, and then have it work well in the wider world. I think training and tuning an AGI means lots of involvement with actual humans, and that’s going to be a human-scale process.
If I did worry about the science fiction scenario above, I would look for ways to thwart it that also have high payoff if AGI doesn’t happen soon or isn’t particularly effective at first. I would think about ways to do high-assurance financial transparency and auditing. Likewise technical auditing and software security.
But it is not easy to use the money. You can’t “just” build huge companies through fake identities or a front man and thereby create revolutionary technologies easily. Running companies with real people takes a lot of real-world knowledge, interaction, and feedback. But most importantly, it takes a lot of time. I just don’t see how an AI could create a new Intel or Apple over a few years without its creators noticing anything.
The goals of an AI will be under scrutiny at all times. It seems very implausible that scientists, a company, or the military are going to create an AI and then just let it run without bothering about its plans. An artificial agent is not a black box, like humans are, where one is only able to guess its real intentions. A plan for world domination seems like something that can’t be concealed from its creators. Lying is not an option if your algorithms are open to inspection.
Could you elaborate on “even better, nobody notices at all”? Any AI capable of efficient self-modification must be able to grasp its own workings and make predictions about improvements to various algorithms and to its overall decision procedure. If an AI can do that, why would the humans who built it be unable to notice any malicious intentions? Why wouldn’t the humans who created it be able to use the same algorithms that the AI uses to predict what it will do? If humans are unable to predict what the AI will do, how is the AI able to predict what improved versions of itself will do?
In other words, could you elaborate on why you believe that what the AI is going to do will be opaque to its creators but predictable to its initial self?
I am also rather confused about how an AI is believed to be able to hide its attempts to build molecular nanotechnology. It doesn’t seem very inconspicuous to me.
If you assume a world/future in possession of vastly more advanced technology than our current world, then I don’t disagree with you. If it takes very long for the first GAI to be created and if it is then created by means of a single breakthrough that somehow combines all previous discoveries and expert systems into a much more powerful single entity, with huge amounts of hard-coded knowledge, a complex utility-function and various dangerous drives, then I agree. It wouldn’t even take the strong version of recursive self-improvement to pretty much take over the world under those assumptions.
I meant not noticing that it escaped to the Internet. But “noticing malicious intentions” is a rather strange thing to say. You notice behavior, not intentions. It’s stupid to signal your true intentions if you’ll be condemned for them.
Predict what will do what, in what sense, and to what end? An AI in the wild acts depending on what it encounters; all instances are unique (and it will be wary of watchers).
I didn’t talk of this.
I don’t see how those assumptions are relevant. Also, all drives are dangerous, to the extent their combination differs from ours. Utility is not temper or personality or tendency to act in a certain way. Utility is what shapes long-term plans, any of whose elements might have arbitrary appearance, as necessary to dominate the circumstances.
Maybe I misunderstood you. But I still believe that it is an important question.
To be able to self-improve efficiently, an AI has to make some sort of prediction about how modifications will affect its behavior. The desired solution is actually much stronger than that: the AI will have to prove the friendliness of its modified self, or of its successor, with respect to its utility function.
The question is, if the AI can make such predictions about the behavior of improved versions of itself, why wouldn’t humans be able to do the same?
The fear is that an AI will do something that eventually leads to the extinction of all human value. But the AI must have the same fear about improved versions of itself. The AI must fear that its successor will cause the demise of what it values. Therefore it has to be able to make sure that this won’t happen. But why wouldn’t humans be able to do the same?
An AI is not a black box to itself. It won’t be a black box to its creators. Inventing molecular nanotechnology and taking over the world in its spare time seems like something that should be noticeable.
What if the AI makes mistakes? Meaning, what if it mistakenly believes the successor it has just written has the same utility function? The same way a human could mistakenly believe the AI he has just built is friendly? In the same vein, what if the AI cannot accurately assess its own utility function, but goes on optimizing anyway?
Such a badly done AI may automatically flatline, and not be able to improve itself. I don’t know. But even if the AI is friendly to itself, we humans could still botch the utility function (even if that utility function is as meta as CEV).
Yes I do. But it may not be as probable as I thought.
I said as much. And this one seems more plausible. If we uphold freedom, a sensible policy for the Internet is to make it as resilient and uncontrollable as possible. If we don’t, well…
Now if those two assumptions are correct, and we further assume the AI already controls a single computer with an internet connection, then it has plenty of resources to take over a second one. It would need to:
Find a security flaw somewhere (including convincing someone to run arbitrary code), upload itself there, then rinse and repeat.
Or, find and exploit credit card numbers (or convince someone to give them away), then buy computing power.
Or, find and convince someone (typically a lawyer) to set up a company for it, then make money (legally or not), then buy computing power.
Or, …
Humans do that right now. (Credit card theft, money laundering, various scams, legit offshore companies…)
Of course, if the first computer isn’t connected, the AI would have to get out of the box first. But Eliezer can do that already (and he’s not alone). It’s a long shot, but if several equally capable AIs pop up in different laboratories worldwide, then eventually one of them will be able to convince its way out.
But humans are optimized to do all that, to work in a complex world. And humans are not running on a computer, being watched by their creators, who are eager to write new studies on how their algorithms behave. I just don’t see it as a plausible scenario that all this could happen unnoticed.
Also, simple credit card theft etc. isn’t enough. At some point you’ll have to buy Intel or create your own companies to manufacture your new substrate or build your new particle accelerator.
OK, let this AI be safely contained, and let the researchers publish. Now, what’s stopping some idiot from writing a poorly specified goal system and then deliberately letting the AI out of the box so it can take over the world? It only takes one idiot among the many who could read the publication.
And of course credit card theft isn’t enough by itself. But it is enough to bootstrap yourself into something more profitable. There are many ways to acquire money, and the AI, by duplicating itself, can access many of them at the same time. If the AI does nothing stupid, its expansion should be both undetectable and exponential. I give it a year to buy Intel or something.
Sure, in the meantime there will be other AIs with different poorly specified goal systems. Some of them could even be genuinely Friendly. But then we’re screwed anyway, for this will probably end up in something like a Hansonian Nightmare. At that point, the only thing that could stop it would be a genuine Seed AI that can outsmart them all. You have less than a year to develop it and ensure its Friendliness.
Humans are not especially optimized to work in the environment loup-vaillant describes.
It’s not trivial, no, but there are at least dozens of humans who’ve managed it by themselves. And even if the humans do notice, and the AI is confined to a single computer cluster that could be bombed, that doesn’t mean the AI has to give away its location; perfect anonymity online is easy.
Partial anonymity online is easy. Perfect anonymity against sufficiently well-resourced and determined adversaries is difficult or impossible. Packets do have to come from somewhere. Speed-of-light puts bounds on location. If you can convince the network operators to help, you can trace paths back hop by hop. You might find a proxy or a bot, but you can thwack that and/or keep tracing backwards in the network.
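To make the speed-of-light point concrete, a tiny worked example; the round-trip time and fiber speed are assumed figures, not measurements:

```python
# A measured round-trip time puts a hard upper bound on how far away the
# responding host can be. The RTT below is an assumed example value.

C_FIBER_KM_PER_S = 200_000   # light in fiber travels at roughly two-thirds of c
rtt_seconds = 0.040          # hypothetical 40 ms round trip

max_distance_km = C_FIBER_KM_PER_S * rtt_seconds / 2
print(f"The responder is at most ~{max_distance_km:.0f} km away")  # ~4000 km
```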
If there were some piece of super-duper malware (the rogue AI) loose on the network, I suspect it could be contained by a sufficiently determined response.
No, you can’t. You should read some documents about how Tor works; this is a well-studied question and unfortunately, the conclusions are the opposite of what you have written. The problem is that there are lots of proxies around, most of which don’t keep logs, and you can set up a chain so that if any one of them refuses to keep logs then the connection can’t be traced.
If people knew there was a rogue AI around, they could go around visiting datacenters and use physical tricks to try to detect its presence. But if it maintained the pretense of being an anonymous human or anonymous humans, this probably wouldn’t happen.
I understand Tor quite well. Whether connections can be traced depends on how powerful you think the attacker is. You can potentially get somewhere doing global timing attacks, though this depends on the volume and timing properties of the traffic of interest.
Maybe more importantly, if enough of the Tor nodes cooperate with the attacker, you can break the anonymity. If you could convince enough Tor operators that there was a threat, you could mount that attack; sufficiently scary malware communicating over Tor ought to do the trick. Alternatively, a powerful attacker might try to compromise the Tor nodes. In the scenario we’re discussing, there are powerful AIs capable of generating exploits. It seems strange to assume that the other side (the AGI-hunters) hasn’t got specialized software able to do the same. Automatic exploit finding and testing is more or less the current state of the art. It does not require superhuman AGI.
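For what it’s worth, here is a toy sketch of the timing-correlation idea; the traffic traces are entirely synthetic and real attacks are far more sophisticated, but the principle is the same: if you can watch a flow entering and leaving the anonymity network, matching the timing patterns links the two ends.

```python
# Toy illustration of a timing-correlation attack on low-latency anonymity
# networks: an observer who sees traffic both entering and leaving the network
# compares packet-arrival patterns. All traces here are synthetic assumptions.

import random

def trace(bursts, jitter=0.005):
    """Packet timestamps: the same burst pattern plus small network jitter."""
    return sorted(t + random.gauss(0, jitter) for t in bursts)

def correlation(entry, exit_, bin_width=0.05):
    """Fraction of time bins in which both traces show activity (Jaccard)."""
    bins_a = {int(t / bin_width) for t in entry}
    bins_b = {int(t / bin_width) for t in exit_}
    return len(bins_a & bins_b) / max(len(bins_a | bins_b), 1)

bursts = [random.uniform(0, 60) for _ in range(200)]   # one flow's send pattern
entry_side = trace(bursts)                             # seen entering the network
exit_side = trace(bursts)                              # same flow leaving the network
unrelated = trace([random.uniform(0, 60) for _ in range(200)])

print("same flow:     ", round(correlation(entry_side, exit_side), 2))   # high
print("unrelated flow:", round(correlation(entry_side, unrelated), 2))   # low
```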
I am not sure if we even disagree. I did not say that the possibility can be ruled out. I meant that it is not trivial.
For a GAI to take over the Internet, the following needs to be true in conjunction:
1) Its actions must be opaque to its creators for long enough to make use of the resources.
2) The computational resources that it can hack must be suitable for its tasks to be instrumentally useful.
2b) Tasks must be parallelizable.
2c) The available resources must predictably be enough to be worth the effort and the risk of being detected.
2d) There must be enough resources that it can use only a fraction of them, so as not to cause a noticeable slowdown.
2e) The gained resources must allow it to be able to take over even more resources without hitting diminishing returns.
3) It must be able to break the security of its day without arousing suspicion.
3b) The initial AI must be able to use its given resources to become an expert in computer security.
3c) The initial AI must be able to use its available resources to figure out how to make its algorithms parallelizable and possibly adapt them to work efficiently on various architectures.
And that’s just a few problems right off the top of my head. And I am nowhere near being an expert. I do not suspect that it is that easy.
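To illustrate why the conjunction above matters, here is a toy calculation; the per-step probabilities are invented placeholders, not estimates of the real odds:

```python
# Toy illustration of why a long conjunction of requirements matters:
# even optimistic per-step odds multiply down quickly. The probabilities
# below are invented placeholders, not estimates.

steps = {
    "stays opaque to creators": 0.8,
    "hacked resources are useful": 0.8,
    "tasks parallelize": 0.8,
    "resources worth the risk": 0.8,
    "usage stays unnoticed": 0.8,
    "no diminishing returns": 0.8,
    "breaks security unsuspiciously": 0.8,
    "self-teaches security and porting": 0.8,
}

p_all = 1.0
for step, p in steps.items():
    p_all *= p

print(f"Joint probability of all {len(steps)} steps: {p_all:.2f}")  # ~0.17
```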
I think you’re greatly overestimating the difficulty of all of these things. My initial reactions are:
1) Of course it makes itself opaque, first thing. That requires local root, and fast generation of fake data for the debug interfaces. There are some theoretical strategies for preventing this (cold-boot attacks with memory image analysis), but putting them into practice would be a major project.
2b) That was already a prerequisite for reaching human level.
2c) Detection isn’t so bad; it just needs to look enough like the other malware to not stand out. (And the other malware all goes to great lengths to make itself opaque, so opacity will not be suspicious.)
2d) There’s a botnet mining Bitcoin today, which uses tons of resources. The actual giveaway is not slowdown (it can set priority levels so it doesn’t slow anything else down), but heat and electricity usage.
3b) Easier than it sounds for humans, and much easier for what I think are likely AI architectures than for humans.
3c) Parallelism is already taken care of, porting is generally not a big deal, and the trend is for the programming languages and tools to take care of that as much as possible.
Well, I have to take your word for it. You are throwing concepts at me like “fast generation of fake data for the debug interfaces” and making claims like “Parallelism is already taken care of, porting is generally not a big deal...”. If you are right, then risks from AI are more probable than I thought.
But try to look at it from my perspective. I have been a baker and a construction worker, and I currently work as a part-time gardener. You, someone I don’t know, are claiming in a comment on a blog that some sort of AI is likely to be invented that will then easily be able to take over the Internet, and that it will in addition care to do so. Given my epistemic state, what you are saying seems to be highly specific, conjunctive, non-evidence-backed speculation about possible bad outcomes.
Most experts tell me that what you and others are predicting won’t happen. Even those who mostly agree about the possible capabilities of hypothetical AI are nowhere near as worried as you. So what am I to make of a group of people who tell me that all those people are either stupid or haven’t thought about it the way you did? Try to take my perspective, someone who doesn’t have all those deep insights about recursively self-improving AI and computer science in general.
Both our models depend on many assumptions. Depending on whether those assumptions turn out to be true or false, the chances of an AI attack following my scenario are bigger or smaller. (Note: there may be other scenarios.) Here are some of my assumptions:
-- The human-level AI thinks like a smart human; more precisely, like a rational human with strategic, social, and programming skills. (An AI without strategic skills would forget to make a backup. An AI without social skills would forget to keep its secret. An AI without programming skills could not hack the computers on the Internet. It is not automatic that an AI has programming skills just because it is itself a computer; humans aren’t automatically biochemistry experts just because they are built on biochemistry.)
-- The resources required to run a human-level AI are relatively small, so the AI can find them on the Internet. If more than a high-end home PC is required to run a human-level AI, then the algorithm can be parallelized. (If the AI needs more capacity than it can gain by hacking unsafe computers on the Internet, or if it cannot split itself onto many computers, then the original computer and its data line to the rest of the world remain the weak point. At least until the AI gains enough resources to build a copy somewhere else; but such a copy may be extremely expensive and would inevitably draw attention.)
-- The AI thinks in real time, or at least near real time, so by gaining more resources it can make itself faster than a human. (A very slow AI incapable of increasing its speed could lose the time race.)
I don’t know if any of this is true. I imagined a situation where a human-level AI is run on an average computer; I imagined that with a correct algorithm one does not need extreme amount of resources. This may be completely wrong. Actually, now I would bet it is wrong.
However it seems to me that you overestimate humans. It is not obvious that humans would immediately notice that something is wrong. It is not obvious that they would make the right response, fast enough. There are many people who are deceived by “Nigerian scams”. Computers of financial institutions are sometimes hacked. (For an AI capable of modifying itself, hacking other computers should be extremely easy.)
And by the way, it is not necessary to control a global conspiracy. Just large enough that it allows building a few backup super-computers. Maybe it needs just one cooperating millionaire.
But how? Are those social skills hard-coded or learned? Hard-coding social skills good enough to take over the world seems like something that would take millennia. And I don’t see how an AI is going to acquire those skills either. Do you think it is computationally tractable to learn how to talk with a nice voice, how to write convincing emails, etc., just by reading a few studies and watching YouTube videos? I don’t know of any evidence that would support such a hypothesis.
The same is true for physics and technology. You need large-scale experiments like CERN to gain new insights in physics, and large-scale facilities like Intel’s chip fabrication plants to create new processors.
Both statements are highly speculative.
The questionable assumptions here are 1) that all available resources can efficiently run a GAI, 2) that available resources can easily be hacked without being noticed, 3) that throwing additional computational resources at important problems solves them proportionally faster, and 4) that important problems are parallelizable.
The argument that humans are not perfect general intelligences is an important one and should be seriously considered. But I haven’t seen any evidence that most evolutionary designs are vastly less efficient than their technological counterparts. A lot of the apparent advantage of technological designs is the result of making the wrong comparisons, like between birds and rockets. We haven’t been able to design anything that is nearly as efficient as natural flight. It is true that artificial flight can carry more weight overall. But just because a train full of hard disk drives has more bandwidth than your internet connection does not imply that someone with trains full of HDDs would be superior at data transfer.
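To make the train comparison concrete, here is a rough back-of-the-envelope calculation; the drive count, capacity, and trip time are invented round numbers, not real figures:

```python
# Back-of-the-envelope version of the "train full of hard drives" comparison:
# enormous throughput, terrible latency. All quantities are assumed round numbers.

DRIVES = 1_000_000          # hypothetical trainload of drives
TB_PER_DRIVE = 4            # assumed capacity per drive
TRIP_HOURS = 24             # assumed one-day rail journey

payload_bits = DRIVES * TB_PER_DRIVE * 8e12
throughput_gbps = payload_bits / (TRIP_HOURS * 3600) / 1e9

print(f"Effective throughput: ~{throughput_gbps:,.0f} Gbit/s")
print(f"Latency: {TRIP_HOURS} hours")
# Enormous raw bandwidth, but a full day of latency: raw capacity and practical
# usefulness are not the same thing, which is the point of the comparison.
```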
To launch a new company that builds your improved computational substrate you need massive amounts of influence. I don’t perceive it to be at all plausible that such a feat would go unnoticed.
I couldn’t have said it better. I’ll think about it if I ever have to explain the issue to laypeople. The key point I take away is that it matters little whether the AI has limbs, as long as it can get humans to do its bidding.
By the way, your scenario sounds both vastly more probable than a fully fledged hard takeoff and nearly as scary. To take over the world, one doesn’t need superhuman intelligence, nor self-modification, nor faster thoughts, nor even nanotech or other sci-fi technology. No, one just needs to be around the 90th human percentile in various domains (typically those relevant to taking over the Roman Empire), and be able to duplicate oneself.
This is as weak a “human-level” AI as one could think of. Yet it sounds like it could probably set up a singleton before we could stop it (stopping it would mean something like shutting down the Internet, or building another AI before the first takes over the entire network). And the way I see it, it is even worse:
If an AI demonstrates “human-level” optimization power on a single computer, I have no reason to think it will not be able to think much faster when unleashed on the network. This effect could be amplified if it additionally takes over (or collaborates with) a major chip manufacturer, and Moore’s law somehow still applies.
The exact same scenario can apply to a group of human uploads.
Now just a caveat: I assumed the AI (or upload) would start right away with enough processing power to demonstrate human-level abilities in “real time”. We could on the other hand imagine an AI for which we can demonstrate that if it ran a couple of orders of magnitude faster, then it would be as capable as a human mind. That would delay a hard take-off, and make it more predictable (assuming no self-modification). It may also let us prevent the rise of a Singleton.
I’m thinking the second is probable. A single AI seems unlikely.
It takes very little imagination to see that discovery and adaptation do emerge ‘out of nowhere’ via the execution of certain algorithms.
Not a fact, it’s “just a theory”.
It emerges from a society of agents with various different goals and heuristics, like “Treating Rare Diseases in Cute Kittens”. It is an evolutionary process that relies on massive amounts of real-world feedback and empirical experimentation. Assuming that all of that can happen just because some simple algorithm is being computed is like believing it will emerge ‘out of nowhere’; it is magical thinking.
Just like antimatter weapons, it does sound superficially possible. And indeed, just like antimatter weapons, sometimes such ideas turn out to be physically possible, but not economically realizable.
No, it isn’t. I reject the categorization. I suggest that the far more common ‘magical thinking’ here occurs when people assume there is something special about thinking, discovery, adaptation, or optimization in general just because it is a human doing it rather than an ‘algorithm’. As though the human isn’t itself just some messy, inefficient algorithm.
I reject the reference class.
I agree with you that there is no real magic. But there might be apparent magic. Something that looks like magic.
100 Navy SEALs MIGHT be enough to bring down the Roman Empire. Or might not.
But scale up the expeditionary corps, and the demise of Rome becomes more and more probable.
The same goes for the SAI. Maybe it can catch us off guard easily. IFF all the circumstances are just right.
The positions about this are something like:
The SIAI: Very likely!
The academics: We don’t see any surprise coming.
Yours: Keep our heads cool; there is no such thing as magic.
Mine: We can expect Hell, but only if we are not smart enough.
99 Navy SEALs, 1 biologist, and a few boxes of disease samples. Piece of cake.
I completely agree that the risk has to be taken seriously. My point is that just because you can formulate the prediction “the AI will take over the Internet and use its resources”, that doesn’t make it more likely. You personally can’t take over the Internet, and if the AI isn’t much more intelligent from the very beginning, it won’t be able to do so either.
It can’t make use of magic to acquire additional resources because magic depends on additional resources. It has to use what it has under the hood.