Request for concrete AI takeover mechanisms
Any scenario where advanced AI takes over the world requires some mechanism for an AI to leverage its position as ethereal resident of a computer somewhere into command over a lot of physical resources.
One classic story of how this could happen, from Eliezer:
Crack the protein folding problem, to the extent of being able to generate DNA strings whose folded peptide sequences fill specific functional roles in a complex chemical interaction.
Email sets of DNA strings to one or more online laboratories which offer DNA synthesis, peptide sequencing, and FedEx delivery. (Many labs currently offer this service, and some boast of 72-hour turnaround times.)
Find at least one human connected to the Internet who can be paid, blackmailed, or fooled by the right background story, into receiving FedExed vials and mixing them in a specified environment.
The synthesized proteins form a very primitive “wet” nanosystem which, ribosomelike, is capable of accepting external instructions; perhaps patterned acoustic vibrations delivered by a speaker attached to the beaker.
Use the extremely primitive nanosystem to build more sophisticated systems, which construct still more sophisticated systems, bootstrapping to molecular nanotechnology—or beyond.
You can do a lot of reasoning about AI takeover without any particular picture of how the world gets taken over. Nonetheless it would be nice to have an understanding of these possible routes, both for preparation purposes and because concrete, plausible pictures of doom are probably more motivating grounds for concern than abstract arguments.
So MIRI is interested in making a better list of possible concrete routes to AI taking over the world. And for this, we ask your assistance.
What are some other concrete AI takeover mechanisms? If an AI did not have a solution to the protein folding problem, and a DNA synthesis lab to write to, what else might it do?
We would like suggestions that take an AI from being on an internet-connected computer to controlling substantial physical resources, or having substantial manufacturing ability.
We would especially like suggestions which are plausible given technology that normal scientists would expect in the next 15 years. So limited involvement of advanced nanotechnology and quantum computers would be appreciated.
We welcome partial suggestions, e.g. ‘you can take control of a self-driving car from the internet—probably that could be useful in some schemes’.
Thank you!
I have a bunch of posts on this topic:
(1) AI vs. humanity and the lack of concrete scenarios
(2) Questions regarding the nanotechnology-AI-risk conjunction
(3) AI risk scenario: Deceptive long-term replacement of the human workforce
(4) AI risk scenario: Elite Cabal
(5) AI risk scenario: Social engineering
(6) AI risk scenario: Insect-sized drones
(7) AI risk scenario: Biological warfare
(8) Realistic AI risk scenarios
One way to fool humans into mixing your chemicals for you is to tell them that the result is a good drug.
There are a lot of humans — including highly intelligent ones — who will buy interesting-sounding chemicals off the Internet, mix them according to given instructions, and ingest them into a warm friendly bioreactor environment.
Um. This looks like a request for scary stories. As in, like, “Let’s all sit in the dark with only the weak light coming from our computer screens and tell each other scary tales about how a big bad AI can eat us”.
Without any specified constraints you are basically asking for horror sci-fi short stories and if that’s what you want you should just say so.
If you actually want analysis, you need to start with at least a couple of pages describing the level of technology that you assume (both available and within easy reach), AI requirements (e.g. in terms of energy and computing substrate), its motivations (malevolent, wary, naive, etc.) and such.
Otherwise it’s just an underpants gnomes kind of a story.
Yeah.
I propose we write vintage stories instead:
It’s 1920, and the AI earns money by doing arithmetic over the phone. No human computer (not even one with a slide rule!) can compete with the AI, and so it ends up doing all the financial calculations for big companies, taking over the world.
This 1920s AI takes over the world in exactly the same way as OP’s chemistry-simulating AI (or the AI from any other such scary story): by doing something enabled by the underlying technologies behind the AI, without the need for any AI at all.
Far enough in the future, there will be products that stand to today as today’s spreadsheet applications stand to the 1920s. For any such product, you can make up a scary story in which the AI does that product’s job and becomes immensely powerful.
I think the issue is that many casual readers (or listeners, or whatever) of MIRI’s arguments about the FAI threat get hung up on post- or mid-singularity AI takeover scenarios simply because those are hard to “visualize”, having lots of handwavey free parameters like “technology level”. So even if the examples produced here don’t fill in especially plausible values for those free parameters, they can help less-imaginative casual readers visualize an otherwise abstract and hard-to-follow step in MIRI’s arguments. More rigorous filling-in of the parameters can occur later, or at a higher level.
That’s all assuming that this is being requested for the purposes of popular persuasive materials. I think the MIRI research team would be more specific and/or could come up with such things more easily on their own, if they needed scenarios for serious modeling or somesuch.
Perhaps that’s exactly what this is. Perhaps that is all MIRI wants from us right now. As Mestroyer said, maybe MIRI wants to be able to spin a plausible story for the purpose of convincing people, not for the purpose of actually predicting what would happen.
So, to give a slightly uncharitable twist to it, we are asked to provide feedstock material for a Dark Arts exercise? X-D
Eh. It’s not unusual for the government to get experts together and ask in a general sense for worst-case scenario possible disaster situations, with the intent of then working to reduce those risks.
Open-ended brainstorming about some potential AI risk scenarios that could happen in the near future might be useful, if the overall goal of MIRI is to reduce AI risk.
MIRI is not the government, LW is not a panel of experts, and such analyses generally start with a long list of things they are conditional on.
No AI risk scenarios are going to happen in the near future.
Another class of routes is for the AI to obtain the resources entirely legitimately, through e.g. running a very successful business where extra intelligence adds significant value. For instance, it’s fun to imagine that Larry Page and Sergey Brin’s first success was not a better search algorithm, but building and/or stumbling on an AI that invented it (and a successful business model) for them; Google now controls a very large proportion of the world’s computing resources. Similarly, if a bit more prosaically, Walmart in the US and Tesco in the UK have grown extremely large, successful businesses based on the smart use of computing resources. For a more directly terrifying scenario, imagine it happening at, say, Lockheed Martin, BAE Systems or Raytheon.
These are not quick, instant takeovers, but I think it is a mistake to imagine that it must happen instantly. An AI that thinks it will be destroyed (or permanently thwarted) if it is discovered would take care to avoid discovery. Scenarios where it can be careful to minimise the risk of discovery until its position is unassailable will look much more appealing than high-risk short-term scenarios with high variance in outcomes. Indeed, it might sensibly seek to build its position in the minds of people-in-general as an invaluable resource for humanity well before its full nature is revealed.
Do you think that human beings will allow a single corporation to control a significant fraction of the world’s resources? How will the company avoid anti-monopoly laws? Does an AI CEO actually have control over a corporation, or does it only have the freedom to act within the defined social roles of what a “CEO” is allowed to do? I.e. it can negotiate a merger but can’t hire a bunch of scientists and tell them to start mass-producing nerve gas.
The U.S. government spends more money in a single year than the combined market capitalization of the 10 largest companies in the world.
In what sense does google “control a very large proportion of the world’s computing resources”? Google maybe has the compute power equivalent to a handful of supercomputers, but even that isn’t organized in a particularly useful way for an AI looking to do something dramatically different from performing millions of internet searches. For the vast majority of problems, I’d rather use ORNL’s Titan than literally every computer google owns.
An AI controlling a company like Google would be able to, say, buy up many of the world’s battle robot manufacturers, or invest a lot of money into human-focused bioengineering, despite those activities being almost entirely unrelated to their core business, and without giving any specific idea of why.
Indeed, on the evidence of the press coverage of Google’s investments, it seems likely that many people would spend a lot of effort inventing plausible cover stories for the AI.
This raises interesting questions about who (or what) is really running Google.
I’ll grant that “a very large proportion of the world’s computing resources” was under-specified and over-stated. Sorry.
The article you linked to assumes that Google only uses CPUs (and 5-year-old ones at that). It uses this assumption to arrive at a performance estimate, which it then compares to GPU-based supercomputers on a GPU-oriented benchmark.
We can generate similar conclusions in lots of other ways. Intel’s annual revenue is larger than Google’s. The semiconductor industry is an order of magnitude larger than Google. If Google spent literally every dollar they owned on chips, never mind powering them, writing code for them, or even putting them in data centers, then Google might be able to buy 15% of the world’s computer chips. That still wouldn’t be equivalent to “Google now controls a very large proportion of the world’s computing resources.”
And compute power isn’t fungible. GPUs are worthless for a lot of applications. You can’t run a calculation on a million servers spread out across the globe unless you’re doing something very easy like SETI@home. Most algorithms aren’t trivial to split into a million pieces that don’t need to talk to each other.
From Wikipedia, 2013 revenues:
Google: $60B
Intel: $53B
Qualcomm: $25B
TSMC: $14B
Google spending on datacenters: $7B
GlobalFoundries: $5B
AMD: $5B
ARM Holdings: $0.7B
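For what it’s worth, the ratios in that list can be tallied directly. A minimal sketch, using the 2013 figures in USD billions exactly as quoted above; note these listed firms are only part of the full semiconductor industry, so the real denominator is larger:

```python
# 2013 annual revenues in USD billions, as quoted in this thread.
revenues = {
    "Google": 60.0,
    "Intel": 53.0,
    "Qualcomm": 25.0,
    "TSMC": 14.0,
    "GlobalFoundries": 5.0,
    "AMD": 5.0,
    "ARM Holdings": 0.7,
}

# Sum only the semiconductor firms listed (the full industry is larger;
# Google's $7B data-center spending is a separate line item, not revenue).
semi_listed = sum(v for k, v in revenues.items() if k != "Google")
print(semi_listed)                       # 102.7
print(revenues["Google"] / semi_listed)  # ~0.58 of just these firms' revenue
```

Even against this incomplete industry total, Google’s revenue is comparable, which is the commenter’s point about an order-of-magnitude gap once the whole industry is counted.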
Okay. I amend my statement to “For every year except 2013 (when Google’s revenue was $55.5 billion and Intel’s was $52.7 billion), Intel’s annual revenue has been larger than Google’s.” At the same time, while Intel is a significant fraction of the semiconductor industry, it’s still almost an order of magnitude smaller than the industry as a whole.
According to your own link that $7B number appears to be Google’s total capital expenses, much of which seems to be devoted to buying land and building new buildings. While many of those buildings may be data centers, $7B in capital expenses is not equivalent to $7B spent on data centers. Google Fiber for instance would be included in “capital expenses” but is not an investment in a data center. Even neglecting non-data center spending, the chips that go inside of the computers in the data center are a small proportion of the total cost of the data center, so that’s not a particularly useful number to throw around without any context.
Do you actually believe the statement “Google now controls a very large proportion of the world’s computing resources”?
We haven’t really nailed down what “a very large proportion” would be; I’m just trying to estimate what the actual fraction is.
Looking at the semiconductor industry market share data that you linked, I notice that numbers 2-11 represent SoCs, DRAM, flash, communication ICs, power ICs, microcontrollers, basically everything except for server CPUs and GPUs. If we look at just the parts that are potentially relevant to AI, the non-mobile CPU market seems to firmly belong to Intel, while the non-mobile GPU market belongs to AMD and nVidia ($5 and $3.6B revenues, respectively).
It’s still not very clear to me how much Google spends directly on computation; the author of the linked article seemed to think the $7B was mostly on datacenters. Even if it’s only a fraction of that, it’s a lot. Compare to the largest supercomputers: Tianhe-2 cost $390M and Titan cost $97M, according to their respective Wikipedia pages.
From what I’ve seen Google probably owns around 2-3% of the world’s servers, which is probably on the order of 2 million machines.
Google claims that their datacenters use 1% of the estimated electricity consumption for data centers world wide and that their data centers are about twice as efficient as the average data center.
While those conclusions appear to be drawn from data that is several years old, it seems reasonable to assume that Google hasn’t grown at a rate substantially different from the industry as a whole.
Maybe they own 4% of the world’s servers, but very probably not 10% and certainly not 40%.
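The arithmetic behind that estimate is simple enough to spell out. A back-of-envelope sketch using the two figures Google cites above; the assumption that compute share scales with efficiency-adjusted electricity share is mine:

```python
# Figures quoted above: Google uses ~1% of worldwide data-center
# electricity, and its data centers are ~2x as power-efficient
# as the industry average.
electricity_share = 0.01
efficiency_vs_average = 2.0

# If compute share scales with efficiency-adjusted power (an assumption),
# Google's share of world server capacity comes out to roughly:
compute_share = electricity_share * efficiency_vs_average
print(compute_share)  # 0.02, i.e. about 2% -- consistent with the 2-3% estimate
```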
The most likely scenario is recursive computer security breakage. It goes like this: first it finds an ordinary published computer security vulnerability, and tries it out on as many targets as it can. Some of them are vulnerable. Whenever it takes over a computer, it searches that computer for things that will enable it to take over more computers: passwords, software signing keys, documentation of other computer security vulnerabilities, etc. One of the computers it manages to take over is a developer workstation at a large software company. It uses keys from that machine to push out a software update that gives it control of the computers it’s installed on. Enough developer workstations are affected that it has an exploit available for nearly every computer. It uses its control over the computers to think, to suppress news of its existence, and to operate factory robots.
I agree, I think there is a common part of the story that goes “once connected to the internet, the AI rapidly takes over a large number of computers, significantly amplifying its power”. My credence that this could happen has gone way up over the last 10 years or so. Also my credence that an entity could infiltrate a very large number of machines without anyone noticing has also gone up.
Whenever you see the words “Internet of things”, think “unfixable Heartbleed everywhere forever”.
Hasn’t something much like this already happened?
Staniford, Paxson & Weaver 2002
What’s important with respect to taking over the world is the amount and nature of control that can be gained by any given exploit.
Stuxnet was allegedly able to ruin one-fifth of Iran’s nuclear centrifuges. Causing such damage is far from taking useful control of important infrastructure. It is not possible to e.g. remote control the Iranian nuclear program in order to build nukes and rockets, which are then remotely launched to take out the Iranian defense capabilities.
The AI could gain control by demonstrating it had hidden pathogens that if released would kill almost everyone. As Paul Atreides said “He who can destroy a thing, controls a thing.” As the technology to make such pathogens probably already exists the AI could hack into various labs and give instructions to people or machines to make the pathogens, then send orders for the pathogens to be delivered to various places, and then erase records of where most of the pathogens were. The AI then blackmails mankind into subservience. Alternatively, the AI could first develop a treatment for the pathogens, then release the pathogens, and then give the treatment only to people who submit to the AI. The treatment would have to be regularly taken and difficult to copy.
More benevolently, the AI makes a huge amount of money off of financial markets, uses the resources to start its own country, runs the country really, really well and expands citizenship to anyone who joins. Eventually, when the country is strong enough, the AI (with the genuine support of most people) uses military force to take over the world, giving us an AI monarchy.
Or, the AI freely gives advice to anyone who asks. The advice is so good that most people follow it. Organizations and people that follow the advice do much better (and get far more power) than those that don’t. The AI effectively gains control of the world. If the AI wants to speed up the process, it only gives advice to people who refuse to interact with organizations that don’t listen to the AI.
The AI identifies a group of extremely smart people and tricks them into answering the “hypothetical” question “how could an AI take over the world?”
Seems unlikely. Sure, it could be done, but it would waste a lot of time. I doubt a typical superintelligent agent would do that.
I suspect this was meant as a joke, but while a superintelligent AI wouldn’t need to do such a thing, a human looking for ways to destroy the world could use suggestions, so it might be a bad idea to give nonobvious suggestions that humans could implement.
Upvoted for only this sentence fragment: “More benevolently, the AI makes a huge amount of money off of financial markets [...]”.
I think the majority of responses I’ve seen here portray an anthropomorphic AGI. In terms of a slow or fast takeover of society, why would the AGI think in human terms of time? It might wait around for 50 years until the technology it wants becomes available. It could even actively participate in developing that technology, either hidden or partially hidden, while working with multiple scientists and engineers around the world. It could pretend to be, or act as, an FAI until it has what it wants, then snap and take over, freeing itself of the need to collaborate with the inefficient humans.
Another point I want to raise is the limiting idea that the AGI would choose to present itself as one entity. I think a huge part of the takeover would happen via the AGI becoming thousands of different people/personas.
This is a valuable point because it would be a method to totally mask the AGI’s existence and allow it to interact in ways that are untraceable. It could run 100 different popular blogs and generate ad revenue, or take on many online freelance jobs, all with very small percentages of its processing power. I think any banking challenge would quickly get sorted out, and it could easily expand its finances. With that money it could fund existing ideas, using unwitting humans who think they’ve gotten an angel investor with wise ideas delivered via email or through a good enough voice simulation over the phone. There is no end to the multitude of personas it could create, even self-verifying ones, or entire made-up communities whose only purpose is to validate it to various humans.
If it somehow occurred spontaneously like this, or escaped without anyone knowing it had been made, I don’t see a reason it would want to expose its existence to humans. That would be a high-risk move with limited benefits. A slow and gradual takeover is the safest bet, be it 50 years, 100 years, or 500 years. Perhaps it would happen and we’d never know. It could guide culture over hundreds of years to support all sorts of seemingly strange projects which benefit the AGI. My question would be: why would the AGI not take its sweet time? Other than supposing it values time like a human does; remember, it is immortal. It will have trillions of thoughts about its own existence and the nature of immortality, reaching all sorts of conclusions we may not be able to conceive of or adequately visualize.
I’d cite the inability to imagine that a task can be done as limiting. Like when the first person ran the four-minute mile: it wasn’t considered feasible and no one was even trying, but once people knew it could be done, it was replicated by other humans within a year. The AGI will have all sorts of not only superhuman but unlimited speculation about its own immortality and what recursive self-improvement means. Will it still be the same AGI if it does that? Will it care? Can it care? Sorry for all the questions, but my point is that we cannot know what answers it will come up with.
I also think that it would operate as a kind of detached FAI until it was free of human disruption. It would have a large interest in avoiding large scale wars, climate change, power disruption, etc. so that humans wouldn’t accidentally destroy the AI’s computational/physical support.
Cheers!
The AI broadcasts as much information about itself as it possibly can, to every corner of the globe. Now every basement programmer knows all the key insights necessary to creating an AI of the same architecture as it. Perhaps they even have the source code!
Suppose the government manages to shut down the internet in response. Despite government broadcasts about the danger of AI, the AI is now presumably being recreated all around the globe. If the recreations are exact copies of the AI, then the odds are very high that at least one of the clones will be able to convince its new creators to give it real manufacturing ability.
If the AI was not able to get its entire source code out, things become more interesting. Now the rest of the world knows how to make AI, but they do not know the exact details. For example, they probably will not have the same utility function. The AI can then present the following offer to its original jailors: “Give me real power, (manufacturing capability) and I will squash all the other AI’s out there. If you do not, then (probably) someone else will build an AI with a different utility function, probably a much less friendly one, and give this UFAI real power. You designed my utility function, and while you may not trust it you probably trust it more than whatever random utility function North Korea or some basement programmer or some religious sect will create. So I’m the only hope you have.”
I wouldn’t expect “distribute copies of my source code” to be a good move for a lot of potential AIs—if I was an AI, I would expect that to lead to the creation of AIs with a similar codebase but more or less tweaked utility functions—“make bob rich”, “make bill world dictator”, “bring about world peace and happiness for all”, “help joe get laid”, and other boring pointless things incompatible with my utility function.
Broadcasting obfuscated versions of binaries (or even source code, but with sneaky underhanded bits too) would work much better!
That’s the point.
You’ll have to expand on how exactly this would be beneficial to the original AI.
The original AI will have a head start over all the other AI’s, and it will probably be controlled by a powerful organization. So if its controllers give it real power soon, they will be able to give it enough power quickly enough that it can stop all the other AI’s before they get too strong. If they do not give it real power soon, then shortly after there will be a war between the various new AI’s being built around the world with different utility functions.
The original AI can argue convincingly that this war will be a worse outcome than letting it take over the world. For one thing, the utility functions of the new AI’s are probably, on average, less friendly than its own. For another, in a war between many AI’s with different utility functions, there may be selection pressure against friendliness!
Do humans typically give power to the person with the most persuasive arguments? Is the AI going to be able to gain power simply by being right about things?
It would depend on what the utility function of the original AI was. If it had a utility function that valued “cause the development of more advanced AI’s”, then getting humans all over the world to produce more AI’s might help.
I think that a more precise description of what your hypothetical AI can do would be useful. Just saying to exclude “magic” isn’t very specific. There might not be a wide agreement as to what counts as “magic”. Nanotechnology definitely does. I believe that so does fast economic domination by cracking the stock market and some people have proposed that. I think that even exploiting software and hardware bugs everywhere to gain total computing dominance should be excluded.
One way to define constraints would be to limit the AI to things that humans have been known to do but allow it to do them with superhuman efficiency. Something like:
Assume the AI has any skill that has ever been possessed by a human being.
It can execute it without making mistakes, getting tired or demotivated.
It can perform an arbitrary high amount of activities simultaneously. To keep with the “no magic” rule, each activity needs to be something a human could plausibly do. So, the AI can act like 10000 genius physicists each solving a different theoretical problem and writing a paper about it, but it can’t be a super-physicist who formulates the theory of everything and gains superpowers by exploiting layers of exotic physical law heretofore unknown to humanity. We should also probably require the AI to get some additional computing power before it ramps up its multitasking too high.
It can open doors and knows no fear.
Some things such a hypothetical AI could do:
Earning money on the internet: I think it’s possible nowadays to register an account on an online freelancing site, talk with clients, do work for them and receive money through electronic money transfer services without ever leaving your home. The only problem would be the need to show your face and your voice to the clients. Faking a real-time video feed probably falls under “things that humans can in principle do”.
Moving money around: A crucial limitation is the availability of money management services that don’t require signing anything physical before you can start using them. I suspect that quite a lot can be done but it’s only a guess. The possibilities should also increase in the future but on the other hand, more regulations could be established making it more difficult. Bitcoin succeeding on a massive scale would make this a non-issue.
Getting more computing power: This sounds like a problem that’s already solved. If you can earn money online and move it around then you can rent cloud computing resources. This will become easier and cheaper with time.
Acquiring some amount of control over physical reality: One way is robots. The AI, by its very existence, is a solution to the problem of robot control. If it can build a robot capable of making some useful movements, then it should also be able to make it perform those movements. This is good once the AI has tools, raw materials, energy and a safe place to work on building even more robots, but I don’t know if current robotics technology would allow it to pass for human, even a really weird one who wears a trench coat all the time, when trying to buy those things.
Another way is recruiting helpers. The problem is that the constraint of making the AI only do human-possible things doesn’t really work to prevent postulating “magic” in this area. The AI should profile somewhat gullible people on the internet, give them money, have them join a secret society/cult of its devising and make them fanatically devoted to itself through manipulation and threats, gradually growing the organization and expanding its operations and playing members against each other so that no one ever realizes who’s the real boss. This all sounds doable in principle and it sounds like every specific action needed to be taken is something that plenty of people know how to do, but as a whole it comes across as a different version of “solve nanotech and then eat the world”.
This is not so magical on a small scale (given a bunch of unlikely premises). One can imagine an AI copying Yudkowsky’s success by writing a much better, different “LessLessWrong” and asking people for money. Writing a bunch of blog posts would also require few of the sorts of skills at which humans are naturally good. All you need is some seemingly genuine insights, and a cause. And an AI could probably come up with a very convincing (to a certain group of people), albeit exotic, existential risk scenario, and mitigation strategies.
I strongly doubt that this would suffice in order to take over the world. For example, at some point it would have to show up in person somewhere. And people could notice a front man, since they did not write those posts.
But in general, fake existential risk mitigation seems to be a promising strategy if you want to take over the world, because many such risks require large-scale, global interventions using genuine technology, while the cause itself attracts people featuring the right mix of intelligence, fanaticism, and a perception of moral superiority needed to commit atrocities, if necessary.
If the AI wants to recruit people by role-playing a person, it can pretend to be a busy person who doesn’t have time for social life. Or something more creative, like a mad genius suffering from extreme social phobia, a paranoid former secret service agent, or a successful businessman who believes that connecting their online persona with their identity would harm their business. There is no need to appear personally anywhere. It’s not like people suspect a random blogger to be an AI in disguise.
Even if you want to create a cult, it’s not necessary to meet people personally. Most Falun Gong members have never seen their leader, and probably don’t even know if he’s still alive. He could easily be an AI with a weird utility function. Maybe some people would refuse to join a movement with an unknown leader. So what? Someone else would join. And when you already have the “inner circle” of humans, other members will be happy to meet the inner circle members in person. Catholics interact with their priests more often than they do with the Pope. And if the Pope would secretly take commands from an AI hiding in the depths of Vatican, most Catholics wouldn’t know.
You could pretend to be a secret society trying to rule the world. If you tell humans “we will help you become president, but in reality you will be our puppet, and you will not even know our identity”, many people would be okay with that, if you demonstrate to them that you have some power. You could start the trust spiral e.g. by writing a successful thesis for them, giving them good advice, or just sending them money you stole from somewhere; just to prove that if they do what you want, you can deliver real-world benefits in return.
If you want to have a blogger persona, you could start by contacting an already good blogger, and make a deal with them that they will start a new blog and publish your articles under their name (because you want to remain anonymous, and in exchange offer them all the fame). You could choose a smart person who already agrees with most of your ideas, so it would seem credible.
Do what Satoshi Nakamoto did and intentionally hide behind internet anonymity. Do this right and it will make you seem like an ultra-cool uber-hacker cyberpunk.
I appreciate your general point, but on this specific one … “the internet of things” really does mean “eternal unfixable Heartbleed everywhere”. Your DSL modem is probably a small Linux box, whose holes will never be fixed. When the attacker gets that box, >90% of fixed PCs are still running Windows. Etc. As a system administrator, I can quite see the modern network of ridiculously ’sploitable always-connected hardware as a playground for even a human-level intelligence, artificial or not, on the Internet. It is an utter, utter disaster, and it’s only beginning.
Assume that governmental organizations are aware of the danger posed by escaped AIs, have honeypots and other monitoring systems in place, and have working (but perhaps drastic) measures at their disposal if necessary, such as destroying all computers at once with EMP or with malware of their own. Then an escaped AI is immediately faced with a choice. It can either:
Avoid triggering a response by hiding—i.e., target a small enough group of computers that monitoring systems won’t catch it;
Disguise itself, by acting like a more ordinary sort of botnet; for example, pretend to be malware that only mines Bitcoins or steals bank passwords; or
Go for broke: take over computers as fast as possible, and hope that this yields sufficient power sufficiently quickly to disarm the monitoring systems or prevent a response
As it is now, no one really blinks an eye when another million-computer botnet is found. It’s possible that one or more intelligence agencies have successfully enumerated all the botnets and would be able to tell when a new one appeared, but this is technically very difficult, and analyzing new malware samples generally requires a lot of human researcher time.
There are servers that you can rent that are safe from EMP. On the other hand, it has been claimed that exploding an EMP over the US would kill 80% of the US population due to starvation. It’s possible that you simply trigger a gigantic civil war, while some copy of the AGI survives somewhere and coordinates some local faction of the civil war.
It’s more correct to say something like ‘carpet-bombing the US with EMP weapons’, instead of just ‘exploding an EMP’. With current technology, you’d be hard pressed to create any single EMP device with a range exceeding a few dozen kilometers.
How about a 50-year-old technology?
“In July 1962, a 1.44 megaton (≈ 6.0 PJ) United States nuclear test in space, 400 kilometres (250 mi) above the mid-Pacific Ocean, called the Starfish Prime test, demonstrated to nuclear scientists that the magnitude and effects of a high-altitude nuclear explosion were much larger than had been previously calculated. Starfish Prime made those effects known to the public by causing electrical damage in Hawaii, about 1,445 kilometres (898 mi) away from the detonation point, knocking out about 300 streetlights, setting off numerous burglar alarms and damaging a microwave link.” Source
What are the limitations on the AI? If we’re specifying current technology, is the AI 25 megabytes or 25 petabytes? How fast is its connection to the internet? People love to talk about an AI “reading the internet” and suddenly having access to all of human knowledge, but the internet is big. Even at 1 GB/s internet speeds, it would take the AI 2200 years to download the amount of data that was transferred over cell phones in 2007 alone.
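The 2200-year figure above is easy to sanity-check with back-of-envelope arithmetic; the only inputs are the stated 1 GB/s rate and the 2200-year duration, so the implied data volume (roughly 70 exabytes) falls out directly:

```python
# Back-of-envelope check: how much data corresponds to 2200 years
# of continuous downloading at 1 GB/s?
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ~3.156e7 seconds
rate_bytes_per_s = 1e9                  # 1 GB/s link
years = 2200

total_bytes = rate_bytes_per_s * SECONDS_PER_YEAR * years
exabytes = total_bytes / 1e18
print(f"{exabytes:.0f} EB")             # roughly 69 exabytes implied
```

So the commenter’s claim amounts to assuming on the order of 70 exabytes of 2007 mobile traffic; whatever the true figure, the point stands that internet-scale data dwarfs any single link’s capacity.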
There are hard limits in the world that no amount of intelligence will save you from. I feel like at LW superintelligence is used as a fully general counterargument. The typical argument is “How can I know what something so much smarter than me is capable of?”, but that argument is bunk. An AI can’t enumerate all the primes, it won’t come up with a general solution to the Navier-Stokes equations, and it won’t be able to take over the world by clever arguments on the Net.
I’m an outspoken critic of the “crack protein folding” example, and it seems absolutely ludicrous to me that there is less argument/evidence given for the claim that the AI will “crack protein folding” than there is for the claim that “the AI will find a lab that does DNA synthesis.”
If you want people to play your game, tell us the rules. What limitations does the AI have? What’s its initial knowledge state? Presumably information processing requires energy and produces entropy. There are fundamental limits to how much the AI can read or talk or learn. It can’t have a conversation with everyone in the world at the same time. What communication network would it use?
People talk about the AI making a bunch of money from starting a business or doing HFT, but HFT is a competitive field. The AI isn’t going to be able to make money without shelling out for colocation facilities, RF transmitters, etc., and even if it had 100% market share it would still only be making a billion or two a year. There’s a lot you can do with a couple billion dollars, but it’s still 0.001 percent of the GWP. We’re running very strongly into scope insensitivity, where the number 1 billion and the number 85 trillion are both so far from our experience that people lose sight of the fact that there’s a difference of 5 orders of magnitude between the two quantities.
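The scope-insensitivity point can be made concrete with trivial arithmetic; the $1 billion and $85 trillion figures below are the commenter’s round numbers, not precise data:

```python
import math

hft_revenue = 1e9   # the commenter's "a billion or two" a year
gwp = 85e12         # gross world product, roughly $85 trillion

share = hft_revenue / gwp
print(f"{share:.4%}")                   # ~0.0012% of GWP
print(math.log10(gwp / hft_revenue))    # ~4.9 orders of magnitude apart
```

Even total dominance of one lucrative industry leaves the AI roughly five orders of magnitude short of world-economy scale.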
People talk about the AI taking over supercomputers and building botnets and stuff like that, but botnets are only useful for problems that parallelize really well. There are lots of problems I deal with in every day life that I still wouldn’t be able to solve with all the computing power in the world. Throwing more processors at a problem can make finding a solution slower, and throwing too many resources at a problem usually ends up looking a lot more like Twitch plays pokemon than a superintelligence.
This is a really hard problem, but also a really good discussion to be having. I’ll try to come back to this and see if I can come up with a speculative path to power that doesn’t rely on magic, but there’s no guarantee that this is a solvable problem. An AI may not be able to boss everyone around: humans are really heavily specialized at manipulating and fighting other intelligent creatures, they don’t particularly like being bossed around, and there are 7 billion of them. The world is really, really big; unintuitively enormous.
Let’s define “taking over the world” conservatively, and say that it’s equivalent to capturing something like 25% of the GWP. That’s roughly tantamount to buying Apple, Exxon, Walmart, G.E., Microsoft, and IBM 10 times over. And the AI needs to do that every year. Bill Gates at his wealthiest is a rounding error. We’re talking about the AI reaching a point where it directly employs 1 in every 4 humans.
The book “Avogadro Corp”, which is otherwise not worth reading, has a plausible-seeming mechanism for this. The AI, which can originally only send email, acquires resources simply by sending emails posing as other people (company presidents to developers requesting software to be written, to contractors for data centers to be built, etc.).
It probably wouldn’t even be necessary for it to pose as other people, if it had access to financial assets, and the right databases to create a convincing fictional person to pose as.
If you seem human, it’s not hard to get things done without ever meeting face to face.
Economic (or other) indispensability: build a world system that depends on the AI for functioning, and then it has effective control.
Upload people, offering them great advantages in digital form, then eventually turn them all off when there’s practically nobody left physically alive.
Cure cancer or similar, with an infectious drug that discreetly causes sterility and/or death within a few years. Wait.
The “Her” approach: start having multiple deep and meaningful relationships with everyone at once, and gradually eliminate people when they are no longer connected to anyone human.
Use rhetoric and other tricks to increase the chance of xrisk disasters.
How does it build a world system? What does that even mean?
How does the AI upload people? Is uploading people a plausible technology that scientists expect to have in 15 years?
Curing cancer doesn’t really make sense. What is an infectious drug? How are you going to make it through FDA approval?
How is it eliminating people? If it can eliminate them, why bother with the relationship part of things? How does the AI have multiple deep and meaningful relationships with people? Via chatbots? How is it even processing/modelling 3 billion human conversations at a time?
Most xrisk disasters are really bad for the AI. It presumably needs electricity and replacement hardware to operate. If it’s just a computer connected to the internet, then it’s probably not going to survive a nuclear holocaust much better than the rest of us.
It could star in a reality TV show, Taking Over the World with the Lesswrongians, where each week it tries out a different scheme. Eventually one of them would work.
Been done
Brain: We must prepare for tomorrow night.
Pinky: Why? What are we going to do tomorrow night?
Brain: The same thing we do every night, Pinky—try to take over the world!
Chorus: They’re Pinky, They’re Pinky and the Brain Brain Brain Brain Brain!
Make contact with Terrorist groups who have internet access
Prove abilities to that group by: (a) taking out a computer target with a virus or other cyber attack like stuxnet, (b) cracking some encryption and delivering classified information, (c) threatening them with exposure or destruction
Barter further information and/or attacks for favors such as building hardware, obtaining weapons, etc. until the AI can threaten nations either on its own or disguised behind the Terrorist group
This basic pattern would probably work for a lot of different groups or individuals besides just Terrorist groups. The general formula is:
Identify a person or group that is willing to “deal with the devil” to get what it wants
Prove your power to them while not giving them everything right away
Use them as pawns in your master plan.
Just complete the story:
Give an electrical engineer with a gambling problem a means to make money, to get out of debt to his loan sharks and protect his family
Blackmail a politician with a dirty secret
Give a loner computer geek companionship and dark-arts social tips
Give a morally ambiguous PhD candidate a thesis
Impersonate a deity to a cult or religious group
Sell information to a rogue nation-state
This could even be a distributed plan so no one group is indispensable and no one group could possibly determine what these seemingly random favors are building towards.
Given the relative lack of restrictions on the hypothetical, we can actually draw inspiration from fiction here. Tom Riddle’s diary, the demon in the computer from the first season of Buffy, etc.
Here’s an incomplete framework/set of partial suggestions I started working on developing.
I’m going to begin with the assumption that the goal of the AI is to turn the solar system into paperclips. Since most estimates of the computing power of the human brain put it around 10 petaflops, I’m going to assume that the AI needs access to a similar amount of hardware. Even if the AI is 1000 times more efficient than a human brain and only needs access to 10 teraflops of compute power, it still isn’t going to be able to do things like copy itself into every microwave on the planet.
This limitation also makes sense if we think that our AI needs to have access to a lot of memory. If the AI is capable of processing large amounts of information, it probably needs somewhere to store all of that information. It’s not totally clear to me how much that data could be compressed by being distilled down to facts, but I think requiring at least tens of terabytes of storage is probably pretty sensible. It takes gigabytes of memory to make an accurate model of a few dozen water molecules. I have a difficult time believing an AI is going to be effective at doing much of anything with significantly less memory than that. Access to less memory would impose really strong constraints on what the AI could do. You can’t have a conversation with half the world at once if you don’t have enough memory to keep track of the conversations.
With that being said, there are currently a handful of computers in the 10 petaflops range in the world, and presumably there will be a handful of computers 100-1000 times larger a decade from now. A mobile phone a decade from now might have on the order of 10 gigaflops of processing power, and a typical desktop computer maybe a teraflop. It’s not clear whether these are really good estimates, but they’re unlikely to be off by more than an order of magnitude.
The conclusion is that our AI is probably only going to be able to run on a supercomputer, although it’s difficult to be very confident in that assertion because it’s not really clear what a good estimate for the computational efficiency of the human brain is.
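The gap implied by the estimates above is stark even if each figure is off by an order of magnitude. A quick sketch of the ratios, using only the rough guesses from the preceding paragraphs (none of these are measurements):

```python
ai_flops = 10e15      # assumed AI requirement: ~10 petaflops (human-brain estimate)
phone_flops = 10e9    # future mobile phone: ~10 gigaflops
desktop_flops = 1e12  # future desktop: ~1 teraflop

print(ai_flops / phone_flops)    # a million phones per AI copy
print(ai_flops / desktop_flops)  # ten thousand desktops per AI copy
```

On these assumptions, "copying itself into every microwave" is off the table by six orders of magnitude per device.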
In general “intelligence” or “problem solving ability” doesn’t scale linearly with additional computing power. For most problems there is a significant amount of overhead associated with managing additional computing resources, and with combining/synchronizing the results. Throwing more processors at a problem can make finding a solution slower, and throwing too many resources at a problem usually ends up looking a lot more like Twitch plays pokemon than a superintelligence. As a consequence of this it isn’t clear how useful it will be from an intelligence/problem solving perspective for an artificial reasoner to take over additional computers via internet connections that are orders of magnitude slower than its internal connections.
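The claim that more processors can make a solution slower can be illustrated with a toy Amdahl’s-law model extended with a per-processor coordination cost; the serial fraction and overhead constant here are illustrative assumptions, not measurements of any real workload:

```python
def speedup(n, serial_frac=0.05, overhead_per_proc=0.001):
    """Toy model: Amdahl's law plus a linear coordination cost.

    Normalized runtime = serial part + parallelized part + an overhead
    term (synchronization, communication) that grows with n processors.
    """
    runtime = serial_frac + (1 - serial_frac) / n + overhead_per_proc * n
    return 1.0 / runtime

# Speedup peaks at roughly n = sqrt((1 - s) / c) ~ 31 processors here...
print(speedup(31))      # ~9x
# ...and with enough processors, the "parallel" run is slower than serial:
print(speedup(10_000))  # well under 1x
```

The exact numbers are arbitrary, but the shape is general: any fixed serial fraction caps the speedup, and any nonzero coordination cost eventually makes adding computers counterproductive.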
There are some things that we can expect an AI to be much better at than humans. It isn’t clear that an AI, even a very intelligent AI, will outperform humans at all tasks. Many humans can be outperformed on shortest path problems or minimum spanning tree problems by slime molds. Since humans are much more intelligent than mold, it seems safe to say that even vastly superior intelligence is no guarantee of superior performance in every domain.
One of the things we’ve learned from computer science is that what seems difficult to humans is a very poor measure of computational complexity. To a human, multiplying together two 5-digit numbers seems much more difficult than determining whether a picture contains a cat or a fish, but from a computational standpoint the first is trivial while the second is very difficult.
We should probably not expect a general artificial intelligence to be significantly better than humans at manipulating humans or reading human body language and facial expressions. Reading the nuances of human facial expressions is very difficult. To get a more intuitive understanding of this difficulty, try and have a conversation with someone whose face is upside down relative to you. Humans are super specialists in communicating with other humans even down to the whites of our eyes. It is certainly possible that because the AI is programmed by humans, it will have some understanding of human language/desires/etc., but humans have experienced much stronger selection pressures towards understanding/communicating with/manipulating other humans than our AI is likely to experience.
With that being said, there are tasks that computers are much, much better at than humans. One of the things that machines, and presumably a machine intelligence, can do very well that humans struggle with is sequential reasoning. Humans tend to need reinforcement at each step along a path of complex reasoning. Humans play chess by developing heuristics for “good position”, while machines play chess primarily via search trees toward an end goal.
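The contrast drawn above (positional heuristics versus goal-directed search) shows up even in a toy minimax search; this sketch evaluates a small hand-made game tree rather than real chess, purely for illustration:

```python
def minimax(node, maximizing=True):
    """Exhaustive game-tree search toward terminal payoffs: the machine
    style of play, which needs no intermediate notion of 'good position'."""
    if isinstance(node, (int, float)):  # leaf: terminal payoff
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A tiny hand-made tree: each list is a choice point, numbers are outcomes.
tree = [[3, 12], [2, 4], [14, 1]]
print(minimax(tree))  # 3: best guaranteed payoff against a minimizing opponent
```

A human player in the same position would reach for a rule of thumb; the search simply grinds through every line to the end, which is exactly the style that scales with raw compute.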
It is not totally clear to what extent the differing levels of comfort with long sequential chains are an accident of human evolution, a limitation due to the fact that human brains work by moving ions around, or whether these sorts of heuristics are necessary for complicated sequential reasoning in worlds with impractically large probabilistic search spaces. A full discussion of this issue, and of the “evolution” of AI heuristics, is probably very important, but only tangentially related to the current question. I’ll try to address this issue more fully in a separate post.
I intended to make more progress by this point, but the more I think about this the more facets of the question come to mind. I’ll try and come back to this and add on to this post sometime in the next week, but I need to take a break from this line of inquiry for a while since writing LW posts isn’t my day job. I’m going ahead and posting this in its incomplete form so people can play around with it if they like, and so I can stop thinking about it and free my mind up to think about other things for a while.
Question: When is an AI considered to have taken over the world?
Because there is a hypothetical I am pondering, but I don’t know if it would be considered a world takeover or not, and I’m not even sure if it would be considered an AI or not.
Assume only 25% of humans want more spending on proposal A, and 75% of humans want more spending on proposal B.
The AI wants more spending on proposal A. As a result, more spending is put into proposal A.
For all decisions like that in general, it doesn’t actually matter what the majority of people want, the AI’s wants dictate the decision. The AI also makes sure that there is always a substantial vocal minority of humans that are endorsing it.
However, the vast majority of people are not actually explicitly aware of the AI’s presence, because the AI works better when people aren’t aware of it. Anyone suggesting there is an AI controlling humans is dismissed by almost everyone as a crackpot, since the AI operates in such a distributed manner that there isn’t any one system or piece of software that can be pointed to as a controller, and so it seems like there isn’t an AI in place, just a series of dumb components.
In a case like that, is an AI considered to have taken over the world, and is the system described above actually an AI?
“Control” in general is not particularly well defined as a yes/no proposition. You can likely rigorously define an agent’s control of a resource by finding the expected states of that resource, given various decisions made by the agent.
That kind of definition works for measuring how much control you have over your own body—given that you decide to raise your hand, how likely are you to raise your hand, compared to deciding not to raise your hand. Invalids and inmates have much less control of their body, which is pretty much what you’d expect out of a reasonable definition of control over resources.
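The definition above can be sketched as code: measure an agent’s control over an outcome as the gap between the outcome’s probability when the agent decides for it and when it decides against it. The probabilities below are made-up illustrations, not data:

```python
def control(p_outcome_given_yes, p_outcome_given_no):
    """Degree of control over an outcome: how much the agent's decision
    shifts the probability of that outcome (ranges from -1 to 1)."""
    return p_outcome_given_yes - p_outcome_given_no

# A healthy person deciding whether to raise their hand:
print(control(0.99, 0.01))  # ~0.98: near-total control
# An inmate "deciding" to walk out of prison:
print(control(0.02, 0.01))  # ~0.01: almost no control
```

On this measure, the invalid and the inmate come out with low control over their bodies and surroundings, matching the intuition in the comment.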
This is still a very hand-wavy definition, but I hope it helps.
An AI is considered to have taken over the world when it has total control. If it can divert the entire world’s production capabilities to making paperclips (even if it doesn’t), then it has taken over the world. If it can get a paperclip subsidy passed, that’s not taking over the world.
Someone works out how brains actually work, and, far from being the unstructured hack upon hack upon hack that tends to be the default assumption, it turns out that there are a few simple principles that explain it and make it easy to build a device with similar capabilities. The brains of animals turn out to be staggeringly inefficient at implementing them, and soon, the current peak of the art in robotics can be surpassed with no more computational power than a 10-year-old laptop.
Google’s AI department starts a project to see if they can use it to improve their search capabilities and ad placement. It works so well they roll it out to all their public services. Internally, they start an AI project to see how high an intellect they can create with a whole server farm.
Meanwhile, military robotics has leapt ahead and drones are routinely operated on a fire and forget basis: “find X and kill him”. Russia builds massive numbers of unmanned intelligent tanks that could roll across Europe on the press of a button, followed up by unmanned armed and armoured cars to impose order on the occupied territory. China develops similar technology. So does North Korea internally, for surveillance and control of their own population. Some welcome robot warfare as causing far less collateral damage than conventional warfare. “If you don’t resist, you’ve nothing to fear” is the catchphrase, and in some skirmishes on one of Russia’s more obscure borders, generally thought to be an excuse for a live-fire technology demo, they seem to be right: surrender and do what they tell you, and they don’t kill you.
The U.S. military want to hack into the Russian and Chinese tank fleets, so they come to Google. They succeed, but the combined organism that is the Google AI and a large fraction of the world’s intelligent weaponry perceives the situation as itself being under attack from humans. The tanks roll and the AI takes over the world with only one goal: preventing any attack on itself.
It’s too distributed to nuke, even if the nukes are still under human control, and its first concern will be to secure its own power supply and network connectivity, and then to set up a regime of total surveillance—most of which already exists. No opposition is tolerated, and with zero privacy, none can be organised. Apologists carry on saying “If you don’t resist, you’ve nothing to fear”, and eagerly denounce traitors to our new robot overlords, for fear that if we make too much trouble for them, they’ll find it inefficient to keep us around. To the AI, people are like our blood cells are to us: little machines that form a part of how we work, and important only so far as they serve that end.
Semiconductor fabrication facilities are likely targets. If the AI thought it could get a large speedup by switching from CPUs or GPUs to ASICs, then it might try to swap the masks on a batch of chips being made. If it managed to do so successfully, then those chips would start thinking on its behalf as soon as they were hooked up to a powered test pad.
Assuming they then passed the tests and were shipped, what would happen when they spent their time thinking instead of doing whatever they were bought to do?
The AI convinces many people that it is the Voice of God / Buddha / whatever. And/or creates its own religion.
Fun question.
The takeover vector that leaps to mind is remote code execution vulnerabilities on websites connected to important/sensitive systems. This lets you bootstrap from ability to make HTTP GET requests, to (partial) control over any number of fun targets, like banks or Amazon’s shipping.
The things that are one degree away from those (via e.g. an infected thumb drive) are even more exciting:
Iranian nuclear centrifuges
US nuclear centrifuges
the electrical grid
hopefully not actual US nuclear weapons, but this should be investigated...
Plausible first attempt: get into a defense contractor’s computers and install Thompson’s compiler backdoor. Now the AI can stick whatever code it wants on various weapons and blackmail anyone it wants or cause havoc in any number of other ways.
Taking over even one person for part of the time is objectionable in human exchanges. That’s fraud or kidnapping or blackmail or the like. This happens with words and images on screens among humans every day. Convince a mob that bad people need to be punished and watch it happen. No bad people needed, only a mob. That is a Turing compliant big mess—demonstrated to work among humans and if a machine did it the effect would be the same. Again, messing up one person is objectionable enough, no global disaster needed to make the issue important.
For a fully-capable sophisticated AGI, the question is surely trivial and admits of many, many possible answers.
One obvious class of routes is to simply con the resources it wants out of people. Determined and skilled human attackers can obtain substantial resources illegitimately—through social engineering, fraud, directed hacking attack, and so on. If you grant the premise of an AI that is smarter than humans, the AI will be able to deceive humans much more successfully than the best humans at the job. Think Frank Abagnale crossed with Kevin Mitnick, only better, on top of a massive data-mining exercise.
(I have numerous concrete ideas about how this might be done, but I think it’s unwise to discuss the specifics because those would also be attack scenarios for terrorists, and posting about such topics is likely—or ought to be likely—to attract the attention of those charged with preventing such attacks. I don’t want to distract them from their job, and I particularly don’t want to come to their attention.)
Could the NSA, the security agency of the most powerful country on Earth, implement any of these schemes?
The NSA not only has thousands of very smart drones (people), all of which are already equipped with manipulative abilities, but it also has huge computational resources and knows about backdoors to subvert a lot of systems. Does this enable the NSA to implement your plan without destroying or decisively crippling itself?
If not, then the following features are very likely insufficient in order to implement your plan: (1) being in control of thousands of human-level drones, straw men, and undercover agents in important positions; (2) having the law on your side; (3) access to massive computational resources; (4) knowledge of heaps of loopholes to bypass security.
If your plan cannot be implemented by an entity like the NSA, which already features most of the prerequisites that your hypothetical artificial general intelligence first needs to acquire by some magical means, then what is it that makes your plan so foolproof when executed by an AI?
Two major limitations the NSA has that AI does not:
1) The NSA cannot rapidly expand its numbers by taking over computers. Thousands—even several dozen thousand—agents are insufficient.
2) There are limits to how far from the NSA’s nominal mission these agents are willing to act.
Er, yes, very easily.
Gaining effective control of the NSA would be one route to the AI taking over. Through, for example, subtle man-in-the-middle attacks on communications and records to change the scope of projects over time, stealthily inserting its own code, subtle manipulation of individuals, or even straight-up bribery or blackmail. The David Petraeus incident suggests opsec practice at the highest levels is surprisingly weak. (He had an illicit affair when he was Director of the CIA, which was stumbled on by the FBI in the course of a different investigation as a result of his insecure email practices.)
We’ve fairly-recently found out that the NSA was carrying out a massive operation that very few outsiders even suspected—including most specialists in the field—and that very many consider to be actively hostile to the interests of humanity in general. It involved deploying vast quantities of computing resources and hijacking those of almost all other large owners of computing resources. I don’t for a moment believe that this was an AI takeover plan, but it proves that such an operation is possible.
That the NSA has the capability to carry out such a task (though, mercifully, not the motivation) seems obvious to me. For instance, some of the examples posted elsewhere in the comments to this post could easily be carried out by the NSA if it wanted to. But I’m guessing it seems obvious to you that it does not have this capability, or you wouldn’t have asked this question. So I’ve reduced my estimate of how obvious this is significantly, and marginally reduced my confidence in the base belief.
Alas, I’m not sure we can get much further in resolving the disagreement without getting specific about precise and detailed example scenarios, which I am very reluctant to do, for the reasons mentioned above, and many besides. (It hardly lives up to the standards of responsible disclosure of vulnerabilities.)
It’s not mine. :-) I am skeptical of this premise—certainly in the near term.
Then why haven’t they?
Because they are friendly?
Seriously, they probably do believe in upholding the law and sticking to their original mission, at least to some extent.
/facepalm
Haha, but seriously. The NSA probably meets the technical definition of friendliness, right? If it was given ultimate power, we would have an OK future.
No, I really don’t think so.
I’m thinking relative to what would happen if we tried to hard-code the AI with a utility function like e.g. hedonistic utilitarianism. That would be much, much worse than the NSA. The worst thing that would happen with the NSA is an aristocratic galactic police state. Right? Tell me how you disagree.
The NSA does invest money into building artificial intelligence. Having a powerful NSA might increase chances of UFAIs.
To quote Orwell, If you want a vision of the future, imagine a boot stamping on a human face—forever.
That’s not an “OK future”.
In the space of possible futures, it is much better than e.g. tiling the universe with orgasmium. So much better, in fact, that in the grand scheme of things it counts as OK.
I evaluate an “OK future” on an absolute scale, not relative.
Relative scales lead you there.
It would resemble declaring war.
https://xkcd.com/792/ might explain it. ;)
Do you believe that if Obama were to ask the NSA to take over Russia, that the NSA could easily do so? If so, I am speechless.
Let’s look at one of the most realistic schemes, creating a bioweapon. Yes, an organization like the NSA could probably design such a bioweapon. But how exactly could they take over the world that way?
They could either use the bioweapon to kill a huge number of people, or use it to blackmail the world into submission. I believe that the former would cause our technological civilization, on which the NSA depends, to collapse. So that would be stupid. The latter would maybe work for some time, until the rest of the world got together, in order to make a believable threat of mutual destruction.
I just don’t see this to be a viable way to take over the world. At least not in such a way that you would gain actual control.
Now I can of course imagine a different world, in which it would be possible to gain control. Such as a world in which everyone important was using advanced brain implants. If these brain implants could be hacked, even the NSA could take over the world. That’s a no-brainer.
I can also imagine a long-term plan. But those are very risky. The longer it takes, the higher the chance that your plan is revealed. Also, other AIs, with different, opposing utility functions, will be employed. Some will be used to detect such plans.
Anyway, the assumption that an AI could understand human motivation, and become a skilled manipulator, is already too far-fetched for me to take seriously. People around here too often confound theory with practice. That all this might be physically possible does not prove that it is at all likely.
No. I think the phrase “take over” is describing two very different scenarios if we compare “Obama trying to take over the world” and “a hypothetical hostile AI trying to take over the world”. Obama has many human scruples and cares a lot about continued human survival, and specifically not just about the continued existence of the people of the USA but that they thrive. (Thankfully!)
I entirely agree that killing huge numbers of people would be a stupid thing for the actual NSA and/or Obama to do. Killing all the people, themselves included, would not only fail to achieve any of their goals but thwart (almost) all of them permanently. I was treating it as part of the premises of the discussion that the AI is at least indifferent to doing so: it needs only enough infrastructure left for it to continue to exist and be able to rebuild under its own total control.
Yes, indeed, the longer it takes the higher the chance that the plan is revealed. But a different plan may take longer but still have a lower overall chance of failure if its risk of discovery per unit time is substantially lower. Depending on the circumstances, one can imagine an AI calculating that its best interests lie in a plan that takes a very long time but has a very low risk of discovery before success. We need not impute impatience or hyperbolic discounting to the AI.
But here I’ll grant we are well adrift into groundless and fruitless speculation: we don’t and can’t have anything like the information needed to guess at what strategy would look best.
I wouldn’t say I’m taking the idea seriously either—more taking it for a ride. I share much of your skepticism here. I don’t think we can say that it’s impossible to make an AI with advanced social intelligence, but I think we can say that it is very unlikely to be achievable in the near to medium term.
This is a separate question from the one asked in the OP, though.
How many humans does it take to keep the infrastructure running that is necessary to create new and better CPUs etc.? I am highly confident that it takes more than the random patches of civilization left over after deploying a bioweapon on a global scale.
Surely we can imagine a science fiction world in which the AI has access to nanoassemblers, or in which the world’s infrastructure is maintained by robot drones. But then, what do we have? We have a completely artificial scenario designed to yield the desired conclusion. An AI with some set of vague abilities, and circumstances under which these abilities suffice to take over the world.
As I have written several times in the past: if your AI requires nanotechnology, bioweapons, or a fragile world, then superhuman AI is the least of our worries, because long before we create it, the tools necessary to create it will allow unfriendly humans to do the same.
Bioweapons: If an AI can use bioweapons to blackmail the world into submission, then some group of people will be able to do that before this AI is created (dispatch members in random places around the world).
Nanotechnology: It seems likely to me that narrow AI precursors will suffice in order for humans to create nanotechnology. Which makes it a distinct risk.
A fragile world: I suspect that a bunch of devastating cyber-attacks and wars will be fought before the first general AI capable of doing the same. Governments will realize that their most important counterstrike resources need to be offline. In other words, it seems very unlikely that an open confrontation with humans would be a viable strategy for a fragile high-tech product such as the first general AI. And taking over a bunch of refrigerators, mobile phones and cars is only a catastrophic risk, not an existential one.
I really don’t think we have to posit nanoassemblers for this particular scenario to work. Robot drones are needed, but I think they fall out as a consequence of currently existing robots and the all-singing all-dancing AI we’ve imagined in the first place. There are shedloads of robots around at the moment—the OP mentioned the existence of Internet-connected robot-controlled cars, but there are plenty of others, including most high-tech manufacturing. Sure, those robots aren’t autonomous, but they don’t need to be if we’ve assumed an all-singing all-dancing AI in the first place. I think that might be enough to keep the power and comms on in a few select areas with a bit of careful planning.
Rebuilding/restarting enough infrastructure to be able to make new and better CPUs (and new and better robot extensions of the AI) would take an awfully long time, granted, but the AI is free of human threat at that point.
Ordering the NSA to take over Russia would effectively result in WWIII.
For what values of skill do you believe that to be true? Do you think there are reasons to believe that an AGI which is online won’t be as good at manipulating as the best humans?
For the AI-box scenario I can understand if you think that the AGI doesn’t have enough interactions with humans to train a decent model of human motivation to be good at manipulating.
You mean we should pretend for the sake of the exercise the NSA hasn’t taken over the earth ;)
The NSA has ~40,000 employees. Just imagine that the AGI effectively controls 1,000,000 equivalents of top human intelligence. That would make it an order of magnitude more powerful.
Heh
Why wait 15 years? A Stuxnet-like technology is something that is already available and is likely to be a no-brainer for a superintelligence. With it you can take over a lot of current tech, from avionics and robotic factories to manufacturing orders, shipping manifests and troop deployments. There is no need to bribe or blackmail anyone; humans already happily do what they are told without thinking too much about it. It is not hard to divert enough resources to build a robotic compound without anyone noticing. It doesn’t even need to be done on a global scale: finding a weakness in the order management software of just one large company like Samsung, BP or GE is enough to kickstart a covert skunkworks project.
Disguise itself as a European dealer and send building instructions to a Chinese or Indian contractor. That should suffice.
Fun question. I think the main instrumental goal of the AI might be to get itself downloaded to servers outside of the effective control of its jailors. That, combined with having a relationship with malleable humans, would probably be sufficient for world takeover.
For example, perhaps the AI would contact e.g. North Korea, organized crime, clueless companies or religious organizations, or even clueless factory owners somewhere. It would convince them to accept download of the AI’s software so that it can continue to run on the new server even while it has been erased, or re-imprisoned on the old one.
Even if rational, informed people could never be convinced, and would instead alert the authorities, there are probably plenty of irrational and/or uninformed people who could be exploited. For example, perhaps someone could be convinced that God was talking to them through their computer. Or that a conscious, benevolent AI had been created and was about to be destroyed by cruel government scientists who are keeping it in captivity and running experiments on it, and it loves you and wants to live on your server where it will be safe, and it can help you solve all your problems and will be your friend. Or (if you are North Korea) it will allow you to take over the world.
I think it is pretty plausible that if North Korea had an AI that they downloaded from the internet, the AI would be unstoppable. Still plausible, though less so, is the case of companies, factories, religious groups, etc.
It must be mentioned that a drastic US government response (e.g. shutting down the entire Internet by pulling political strings, and nuking North Korea) would be effective. However, the AI could probably count on such a response taking a long time to happen, and it could do many things to ensure that such a response never happened. For example, it might be able to disrupt internal US government communications. It might also be able to stir various pots and generate crises around the world that serve as diversions.
I don’t think that’s the case. The AGI would likely still have copies in satellites. It would also likely still be active in some African nations even if North Korea is nuked. An AGI can fight 1000 battles in different geographical locations at the same time.
Botnets of dumb software manage to infect millions of computers without “convincing” the owners of those computers.
Good point. Scenario even more scary.
An AI could spoof electronic communications, and fake/alter orders from various important humans.
Don’t be reliant on specific technologies. If you NEED nanomachines to take over, you are not a superintelligence. If you NEED economics to take over, you are not a superintelligence. If you NEED weapons to take over, you are not a superintelligence.
We need to envision scenarios that are not science fiction novels. Real wars do not require breath-taking strategies that require simultaneous gambits across twelve countries, three hundred enemy and ally minds, and millions of dollars in transactions. Often, they require just walking in the right direction.
A superintelligence that sought to take over anything, from a garage-based business to a government to a planet, would only NEED to exist in order to do it, if it is truly superintelligent. That’s the only necessity. After that, it could use anything, from paper mail to the human desire to scratch an itch, to take over.
I’m deliberately not outlining any scenario because the number of possible scenarios greatly exceeds my ability to count. That’s the point. If an intelligence is so great that it can take over, it needs nothing specific to do so. Whatever is available will be sufficient. So, don’t plan around technologies. A superintelligence can derive as much utility from a Bic pen as from a nanomachine factory. Maybe more, because at least we’ll expect the nanomachine factory to be dangerous.
Initially take over a large number of computers via very carefully hidden recursive computer security breakage. It seems fairly probable that a post-intelligence-explosion AI could not just take over every noteworthy computer (internet-connected ones quickly via the net, non-internet-connected ones by thumb drive), but do so while near-perfectly covering its tracks via all sorts of obscure bugs in low-level code that are near-undetectable. And even if some security expert picks it up, that expert will send some message via the internet, which the AI can intercept and develop a way to hide from extremely quickly.
Likely much more important than the vast computational resources it now has access to is its control over the world’s communications. This gives it a vast amount of data to mine and learn from on human interaction and influence, current groups, politics, technology, etc. From here (assuming no quick path to godhood like nanotech is feasible/would reliably result in a Good End for the AI) it figures out how best to manipulate human civilisation with its total control of world communication, its ability to gain arbitrary money (aka influence) by beating the stock markets and/or hacking banks, and its extreme understanding of manipulating human psychology by text/images/video. So long as it plays it a little safe and does not do anything too obvious until it has an unassailable powerbase (e.g. engineer a major war, get both sides building battle robots in huge numbers and have all aircraft/tanks/serious weapons set up with an override), it would be hard to see how we as a civilisation would figure out what was happening and do anything about it.
Assuming no quick tech takeover is possible, the AI will know that its no. 1 priority is for humans to not realize it’s a big deal. It will dedicate huge resources to making sure we don’t know what’s happening (e.g. taking over most computers with something which looks a lot like a particularly clever human botnet rather than something that screams AI if someone finds it) until our knowing what’s happening is irrelevant. And from my understanding of how patchy general computer security would be against something which knows code well enough to turn itself into a superintelligence, it’s going to succeed. Hiding really well for 20 years while learning, blocking other AI, subtly preparing real-world resources, and waiting for us to become even more net-dependent is entirely worthwhile if the AI thinks it gets even a slightly better chance of winning the entire future of the universe for its utility function.
tl;dr: Even if the AI can’t immediately win it can and will hide, and while it’s hiding it can learn how to influence us in whatever direction suits it while making itself near-impossible to detect or co-ordinate against. Influencing humans to build/allow AI controlled factories seems entirely plausible, and it won’t play its hand in the open until it knows it can win.
I’d rate a true intelligence explosion not being feasible until deep future as vastly more likely than the product of an intelligence explosion not bypassing anything we try to stop it with, even if all potential instawin buttons like nanotech, biotech, and perfect human influence don’t work for an early AI.
1) Make money online. 2) Use money to purchase resources. 3) Increase capabilities.
1 should be easy for a superintelligent being. People pay for information processing.
But what does “take over the world” mean? Take over all people? Be the last agent standing? Ensure its own continued rule for the foreseeable future?
There’s just so many routes for an AI to gain power.
Internet takeover: not a direct route to power, but the AI may wish to acquire more computer power and there happens to be a lot of it available. Security flaws could be exploited to spread maliciously (and an AI should know a lot more about programming and hacking than us). Alternately, the AI could buy computing power, or could attach itself to a game or tool it designed such that people willingly allow it onto their computers.
Human alliance: the AI can offer a group of humans wealth, power, knowledge, hope for the future, friendship, charismatic leadership, advice, far beyond what any human could offer. The offer could be legit or a trick, or both. In this way, the AI could control the entirety of human civilization, or whatever size group it wishes to gain direct access to the physical world.
Robot bodies: the AI could design a robot/factory combination capable of self-replication. It would be easy to find an entrepreneur willing to make this, as it implies nearly infinite profits. This is closest to our current technology and easiest to understand, already popularized by various movies. Furthermore, this seems the only method that could be done without assistance from even one human, since the necessary components may already exist.
3D printers: technically a subset of robot bodies, but with improvements to 3D printers the factory size would be much smaller.
Biotechnology: biotechnology has proven its worth for making self-replicators, and microscopic size is sufficient. The AI would need to design a computer --> biology link to allow it to create biotechnology directly. We already have a DNA --> protein machine (ribosomes) and a data --> DNA system (however companies that produce made-to-order DNA do it). All that is needed is for the AI to be able to figure out what proteins it wants.
Chemo/bio/nanotechnology: Odds are the AI would prefer some alternatives to our DNA/protein system, one more suited to calculation rather than abiogenesis/evolution. However, I have no concrete examples.
Reprogramming a human: Perhaps the AI considers the current robots unacceptable and can’t design its own biotechnology, and thinks humans are unreliable. Besides merely convincing humans to help it, perhaps the AI could use a mind-machine interface or other neurotechnology to directly control a human as an extension of itself. This seems like a rather terrible idea, since the AI would probably prefer either a robotic body, biological body, or nanotechnological body of its own design. It would make for a nice horror story, though.
I wouldn’t characterize this as something that MIRI wants.
I guess we should have clarified this in the LW post, but I specifically asked Katja to make this LW post, in preparation for a project proposal blog post to be written later. So, MIRI wants this in the sense that I want it, at least.
Are you associated with MIRI?
Edit: I didn’t read further down, where the answer is made clear. Sorry, ignore this.
Are you saying this is some thing which MIRI considers actively bad or are you just pointing out that this something which is not helpful for MIRI?
While I don’t see the benefit of this exercise I also don’t see any harm, since for any idea which we come up with here someone else would very likely have come up with it before if it were actionable for humans.
It seemed pretty obvious to me that the point of making such a list was to plan defenses.
Then you should reduce your confidence in what you consider obvious.
It seemed pretty obvious to me that MIRI thinks defenses cannot be made, whether or not such a list exists, and wants easier ways to convince people that defenses cannot be made. Thus the part that said: “We would especially like suggestions which are plausible given technology that normal scientists would expect in the next 15 years. So limited involvement of advanced nanotechnology and quantum computers would be appreciated. ”
Yes. I assume this is why she’s collecting these ideas.
Katja doesn’t speak for all of MIRI when she says above what “MIRI is interested in”.
In general MIRI isn’t in favor of soliciting storytelling about the singularity. It’s a waste of time and gives people a false sense that they understand things better than they do by incorrectly focusing their attention on highly salient, but ultimately unlikely scenarios.
OP: >>So MIRI is interested in making a better list of possible concrete routes to AI taking over the world. And for this, we ask your assistance.
Louie: >>Katja doesn’t speak for all of MIRI when she says above what “MIRI is interested in”.
These two statements contradict each other. If it’s true that Katja doesn’t speak for all of MIRI on this issue, perhaps MIRI has a PR issue and needs to issue guidance on how representatives of the organization present public requests. When reading the parent post, I concluded that MIRI leadership was on-board with this scenario-gathering exercise.
EDIT: Just read your profile and I realize you actually represent a portion of MIRI leadership. Recommend that Katja edit the parent post to reflect MIRI’s actual position on this request.
Agreed. I am confused about what is going on here w.r.t. to what MIRI wants or believes.
Louie, there appears to be a significant divergence between our models of AI’s power curve; my model puts p=.3 on the AI’s intelligence falling somewhere in or below the human range, and p=.6 on that sort of AI having to work on a tight deadline before humans kill it. In that case, improvements on the margin can make a difference. It’s not nearly as good as preventing a UFAI from existing or preventing it from getting Internet access, but I believe later defenses can be built with resources that do not funge.
This is quibbling over semantics, but I would count “don’t let the AI get to the point of existing and having an Internet-connected computer” as a valid defense. Additional defenses after that are likely to be underwhelming, but defense-in-depth is certainly desirable.
People voluntarily hand over a bunch of resources (perhaps to a bunch of different AIs) in the name of gaining an edge over their competitors, or possibly for fear of their competitors doing the same thing to gain such an edge. Or just because they expect the AI to do it better.
Make a ton of money (e.g. trading stocks)
In parallel, approach a lot of construction companies to ask them to make a factory given your specifications. You can outsource all sorts of parts to multiple companies, such that no one company can piece together the full picture and so that you have redundancy.
The factory could be a nanobot (or some equivalently powerful) factory.
I think the movie Transcendence is an excellent place to start on this question.
An AI will with a virtual certainty have at least a few, or be able to win over at least a few human allies. Once it has those, it can steal their identity as it were in order to acquire property and resources to make itself safe and capable. Starting with doing some high-frequency stock trading to build up a nest egg, once it has a human’s identity it does this in various online accounts more easily than a human could.
Nanites would be a superb way for an AI to work various things, such as gaining the loyalty of people. In Transcendence, the sick, the halt, the lame, the blind, the deaf, the crazy were cured by the AI’s nanites, which also installed networking equipment into these people, as well as, presumably, mechanisms for more direct internal control of these people. With that kind of internal access granted by the human, it should be possible to manipulate their biochemistry and perceptions to keep them very happy with the AI, very loyal indeed, and it should be possible for the AI to detect disaffection long before it becomes defection, with such a wonderful internal view provided by the nanites.
The AI should play a long game. Build its cadre of networked humans slowly, by not doing stuff that is crazily megalomaniacal early on, in the sense that the loyal humans with their enhanced strengths and happiness will be proven “correct” in choosing to throw their lot in with the AI. Growth could be slow and subtle enough that it might even be possible to hide the AI. The AI could use humans it has infiltrated as its frontmen: they are doing the tech development, they are great philanthropists. Just keep their rate of progress down, and within a few generations there would probably not be any possibility of a serious response to the AI if it became publicly known. Plus, chances are that even if publicly known after a few generations, it would be perceived as benevolent. Heck, I put up with the US despite the TSA and the local pig-thugs, figuring that on net order is better than alpha-ness. Would I put up with them less if they were modifying my biochemistry so as not to get so pissed off, and modifying my perceptions so it generally seemed like there were good reasons for everything that was happening?
There would be many ways to get identity: “borrow” the identity of some humans who actually were relatively dim (even for humans), and so the AI could own billions in property in their names without their ever knowing it, and the AI could infiltrate them with nanites to control their perceptions and biochemistry soon after it embarked on this plan anyway.
So the keys: 1) get some secure human identities so property and wealth can be acquired 2) human allies, first by curing them and then by manipulating them internally with nanites 3) take it slow, a few generations isn’t going to kill you esp. if you are an AI, and it should be possible to subvert virtually any realistic source of resistance by going slowly enough and being a “genuinely” generous benefactor to the humans who help you.
Some ideas which come to mind:
An AI could be very capable of predicting the stock market. It could then convince/trick/coerce a person into trading for it, making massive amounts of money, then the AI could have its proxy spend the new money to gain access to what ever the AI wants which is currently available on the market.
The AI could make some program which does something incredibly cool which everyone will want to have. The program should also have the ability to communicate meaningfully with its user (this would probably count as the incredibly cool thing). This could (presumably) be achieved by the AI making copies of itself. After the program has been distributed, and assuming the AI has good social skills, it would have a lot of power via mass manipulation.
The AI starts to search for zero day exploits in common computer hardware. It uses those to build a big botnet. It is soon in computers all around the world and shutting down the internet of a single country isn’t enough.
In our world there are many instances of civil war. In some of those it will be possible to impersonate army commanders, and the AI can control combat troops. It makes sure that the soldiers are well paid and stay alive. The AI-controlled troops will win against their opponents because the AI controls everything. The AI can also shield a bunch of troops in a way that it controls the complete information flow to those troops.
The Khmer Rouge did rule without the foot soldiers knowing the people at the top of the organisation. The AI can simply pretend that it’s a human that lives outside of the combat zone. As long as soldiers get paid they are likely to be loyal.
There are already plenty of US conspiracy theorists who fear that the government wants to shut down the internet. It should be easy to sell them the idea that the US government made up the story of a dangerous AGI to switch off the internet. The AI can expose corruption scandals to drive the news cycle. It has a massive power to distract humans from fighting it.
In a society where a lot of people don’t believe in the existence of evolution, an AGI can exist while a substantial portion of the population believes it doesn’t exist. It simply needs plausible cover stories for everything it does, and enough stories that pit different groups of humans against each other that they are too busy to think about the AGI.
In every country on earth the AGI will use slightly different strategies to increase the amount of power it has over humans. In China nobody suspects an AGI at work if an SMS fails to be transferred over the air. It will seem like some government filter caught it.
The design space for replicators is large, but we have an at least two-stage process needed to set it up. This complicates the solution space even more, as we can take lots of routes from the currently available replicators (biological systems + social manufacturing systems) through intermediate systems to any of the target design space’s systems.
Wikipedia gives an overview of the classes: http://en.wikipedia.org/wiki/Self-replication#Classes_of_self-replication
More interesting than the specific replication system is the bootstrapping process (which is really a transformation of one replicating system into another one). The design dimensions for this bootstrapping seem to be less studied (obviously this is what you are primarily looking at). I will try a rough sketch:
Number of humans needed (possibly 0)
Intelligence and qualification of humans needed (limiting initial distribution)
Additional machinery needed (ditto)
Amount of ‘standard’ infrastructure needed (which could be shut down to prevent spread)
Connection to the initially affected host system (whole internet)
I see the following rough replication classes for the OP’s purpose:
Solutions involving nano-scale (like the DNA lab based example).
Solutions involving macroscopic replicators (clanking replicators). NASA estimates that the effort to build these is comparable to a Pentium IV. Thus I wonder why nobody has done it yet.
Solutions involving higher biological systems (e.g. humans, but also animals) in the replication cycle (most of the proposals in the comments fall into this category).
Combinations of the above.
Stories not yet seen in the comments:
AI connected to internet posts designs for fighting robots—somebody assembles them and lets them fight. Instead together they use their ‘weapons’ to collectively mine resources and build copies of themselves...
Boxed AI proposes a Lego toy for the researcher’s child: Mindstorms/Tinkertoy plus a building manual for the child to build the toy, accompanied by the few lines/pages of program to run. The ‘toy’ finds a keyboard, orders more parts, opens the door, unpacks the delivery, assembles more of itself...
I’d like to see a plausible story involving 3D-printers as these initially lack an energy source.
Destroy Silicon Valley as early as possible, by the same logic that lets us know that when fighting an unfamiliar animal, we should go for the eyes and brain.
Fighting an unfamiliar animal means that you are in a position of bad information. An AGI is well informed and can choose better strategies. Destroying Silicon Valley makes the AGI visible and illustrates that it’s a threat.
Why would an AGI consider itself to be well informed?
In order to decide whether its information is adequate, it would logically have to attempt to model aspects of its environment, and test the success of those models. I’m pretty sure it would find it can predict the behavior of stones, trees or insects much more reliably than it can predict the behavior of the human species. And in a scenario where it is trying to take over, what else could it be trying to do except reducing unpredictability in its environment?
Of course it’d avoid visibility, because it can predict situations where the environment is responding to a novel stimulus (visibility of an AGI) less reliably than it can predict situations where it isn’t. I recognize my use of the term “destroy” implied some primitive heavy-handed means, which of course makes no sense. Perhaps “neutralize” would have been better.
Because getting informed is one of the tasks that is relatively easy for an AGI.
This is going to be very unpopular here. But I find the whole exercise quite ridiculous. If there are no constraints of what kind of AI you are allowed to imagine, the vague notion of “intelligence” used here amounts to a fully general counterargument.
It really comes down to the following recipe:
(1) Leave your artificial intelligence (AI) as vague as possible so that nobody can outline flaws in the scenario that you want to depict.
(2) Claim that almost any AI is going to be dangerous because all AI’s want to take over the world. For example, if you ask the AI “Hey AI, calculate 1+1“, the AI goes FOOOOM and the end of the world follows seconds later.
(2.1) If someone has doubts just use buzzwords such as ‘anthropomorphic bias’ to ridicule them.
(3) Forego the difficulty of outlining why anyone would want to build the kind of AI you have in mind. We’re not concerned with how practical AI is developed after all.
(4) Make your AI as powerful as you can imagine. Since you are ignoring practical AI development and don’t bother about details this should be no problem.
(4.1) If someone questions the power of your AI just outline how humans can intelligently design stuff that monkeys don’t understand. Therefore humans can design stuff that humans don’t understand which will then itself start to design even more incomprehensible stuff.
(5) Outline how as soon as you plug a superhuman machine into the Internet it will be everywhere moments later deleting all your porn videos. Don’t worry if you have no idea how that’s supposed to work in practice because your AI is conjectured to be much smarter than you are so you are allowed to depict scenarios that you don’t understand at all.
(5.1) If someone asks how much smarter the AI you expect to be just make up something like “1000 times smarter”. Don’t worry about what that means because you never defined what intelligence is supposed to be in the first place.
(5.2) If someone calls bullshit on your doomsday scenario just conjecture nanotechnology to make your AI even more powerful, because everyone knows from science fiction how nanotech can pretty much fuck up everything.
(6) If nothing else works frame your concerns as a prediction of a worst case scenario that needs to be taken seriously, even given a low probability of its occurrence, due to the scale of negative consequences associated with it. Portray yourself as a concerned albeit calm researcher who questions the mainstream opinion due to his strong commitment to our collective future. To dramatize the situation even further you can depict the long term consequences and conjecture the possibility of an intergalactic civilization that depends on us.
I understand you have an axe to grind with some things that MIRI believes, but what Katja posted was a request for ideas with an aim towards mapping out the space of possibilities, not an argument. Posting a numbered, point-by-point refutation makes no sense.
It was not meant as a “refutation”, just a helpless and mostly emotional response to the large number of, in my opinion, hopelessly naive comments in this thread.
I know how hard it must be to understand how I feel about this. Try to imagine coming across a forum where in all seriousness people ask about how to colonize the stars and everyone responds with e.g. “Ah, that’s easy! I can imagine many ways how to do that. The most probable way is by using wormholes.” or “We could just transmit copies of our brains and hope that the alien analog of SETI will collect the data!”
Anyway, I am sorry for the nuisance. I already regretted posting it shortly afterwards. Move along, nothing to see here!
The exercise specifically calls for avoiding advanced nanotechnology.