An example of deadly non-general AI
In a previous post, I mused that we might be focusing too much on general intelligences, and that the route to powerful and dangerous intelligences might go through much more specialised intelligences instead. Since it’s easier to reason with an example, here is a potentially deadly narrow AI (partially due to Toby Ord). Feel free to comment and improve on it, or suggest you own example.
It’s the standard “pathological goal AI” but only a narrow intelligence. Imagine a medicine designing super-AI with the goal of reducing human mortality in 50 years—i.e. massively reducing human population in the next 49 years. It’s a narrow intelligence, so it has access only to a huge amount of human biological and epidemiological research. It must gets its drugs past FDA approval; this requirement is encoded as certain physical reactions (no death, some health improvements) to people taking the drugs over the course of a few years.
Then it seems trivial for it to design a drug that would have no negative impact for the first few years, and then causes sterility or death. Since it wants to spread this to as many humans as possible, it would probably design something that interacted with common human pathogens—colds, flues—in order to spread the impact, rather than affecting only those that took the disease.
Now, this narrow intelligence is less threatening than if it had general intelligence—where it could also plan for possible human countermeasures and such—but it seems sufficiently dangerous on its own that we can’t afford to worry only about general intelligences. Some of the “AI superpowers” that Nick mentions in his book (intelligence amplification, strategizing, social manipulation, hacking, technology research, economic productivity) could be enough to cause devastation on their own, even if the AI never developed other abilities.
We still could be destroyed by a machine that we outmatch in almost every area.
- Another type of intelligence explosion by 21 Aug 2014 14:49 UTC; 28 points) (
- 22 Aug 2014 10:12 UTC; 4 points) 's comment on Another type of intelligence explosion by (
Thing is: a narrow AI that doesn’t model human minds and attempts to disrupt it’s strategies isn’t going to hide how it plans to do it.
So you build your narrow super-medicine-bot and ask it to plan out how it will achieve the goal you’ve given it and to provide a full walkthrough and description.
it’s not a general AI, it doesn’t have any programming for understanding lying or misleading anyone so it lays out the plan in full for the human operator. (why would it not?)
who promptly changes the criteria for success and tries again.
I could well be confused about this, but: if the AI “doesn’t model human minds” at all, how could it interpret the command to “provide a full walkthrough and description”?
Until they stumble upon an AI that lies, possibly inadvertently, and then we’re dead...
But I do agree that general intelligence is more dangerous, it’s just that narrow intelligence isn’t harmless.
How do you convincingly lie without having the capability to think up a convincing lie?
Think you’re telling the truth.
Or be telling the truth, but be misinterpreted.
Every statement an AI tells us will be a lie to some extent, simply in terms of being a simplification so that we can understand it. If we end up selecting against simplifications that reveal nefarious plans...
But the narrow AI I had above might not even be capable of lying—it might just simply spit out the drug design, with a list of estimated improvements according to the criteria it’s been given, without anyone ever realising that “reduced mortality” was code for “everyone’s dead already”.
Not so. You can definitely ask questions about complicated things that have simple answers.
Yes, that was an exaggeration—I was thinking of most real-world questions.
I was thinking of most real-world questions that aren’t of the form ‘Why X?’ or ‘How do I X?’.
“How much/many X?” → number
“When will X?” → number
“Is X?” → boolean
“What are the chances of X if I Y?” → number
Also, any answer that simplifies isn’t a lie if its simplified status is made clear.
I think this sums it up well. To my understanding, I think it would only require someone “looking over its shoulder”, asking its specific objective for each drug and the expected results of the drug. I doubt a “limited intelligence” would be able to lie. That is, unless it somehow mutated/accidentally became a more general AI, but then we’ve jumped rails into a different problem.
It’s possible that I’m paying too much attention to your example, and not enough attention to your general point. I guess the moral of the story is, though, “limited AI can still be dangerous if you don’t take proper precautions”, or “incautiously coded objectives can be just as dangerous in limited AI as in general AI”. Which I agree with, and is a good point.
Greg Egan’s story The Moral Virologist is a bit similar :-)
In general, yeah, narrow AIs specializing in stuff like biotech, nanotech or weapons research can be really dangerous.
Narrow AI can be dangerous too is an interesting idea, but I don’t think this is very convincing. I think you’ve accidentally snuck in some things not inside its narrow domain. In this scenario the AI has to model the actual population, including the quantity of the population, which doesn’t seem too relevant. Also, it seems unlikely that people would use reducing absolute number of deaths as the goal function as opposed to chance of death for those already alive.
I think there can be a specialized AI that predicts human behaviour using just statistical methods, without modelling individual minds at all.
First of all, i admire narrow AI more than AGI. I can’t prove that it won’t affect us but your logic fails because of the phrase “then causes sterility or death.” You just assumed that the AI will cause death or sterility out of the blue and that’s totally false. The beauty of narrow AI is that there will be a human in the middle that can control what the AI outputs than an AGI which is totally uncontrollable.
Regarding the “reducing mortality” example, in biostats, mortality is “death due to X, divided by population”. So “reducing cardiovascular mortality” would be dangerous, because it might kill its patients with a nerve poison. Reducing general mortality, though, shouldn’t cause it to kill people, as long as it agrees with your definition of “death.” (Presumably you would also have it list all side effects, which SHOULD catch the nerve-poison &etc.)
It can also prevent births. And if it has a “reduce mortality rate in 50 years time” kind of goal, it can try and kill everyone before those 50 years are up.
Doesn’t a concept such as “mortality” require some general intelligence to understand in the first place?
There’s no XML tag that comes with living beings that says whether they are “alive” or “dead” at one moment or another. We think this is a simple binary thing to see whether something is alive or dead, mainly because our brains have modules (modules whose functioning we don’t understand and can’t yet replicate in computer code) for distinguishing “animate” objects from “inaminate” objects. But doctors know that the dividing line between “alive” and “dead” is a much murkier one than deciding whether a beam of laser light is closer to 500 nm or 450 nm in wavelength (which would be a task that a narrow-intelligence AI could probably figure out). Already the concept of “mortality” is a bit too advanced for any “narrow AI.”
It’s a bit like, if you wanted to design a narrow intelligence to tackle the problem of mercury pollution in freshwater streams, and you came up with the most simple way of phrasing the command, like: “Computer: reduce the number of mercury atoms in the following 3-dimensional GPS domain (the state of Ohio, from 100 feet below ground to 100 feet up in the air, for example), while leaving all other factors untouched.
The computer might respond with something to the effect of, “I cannot accomplish that because any method of reducing the number of mercury atoms in that domain will require re-arranging some other atoms upstream (such as the atoms comprising the coal power plant that is belching out tons of mercury pollution).”
So then you tell the narrow AI, “Okay, try to figure out how to reduce the number of mercury atoms in the domain, and you can modify SOME other atoms upstream, but nothing IMPORTANT.” Well, then we are back to the problem of needing a general intelligence to interpret things like the word “important.”
This is why we can’t just build an Oracle AI and command it to, “Tell us a cure for X disease, leaving all other things in the world unchanged.” And the computer might say, “I can’t keep everything else in the world the same and change just that one thing. To make the medical devices that you will need to cure this disease, you are going to have to build a factory to make the medical devices, and you are going to have to employ workers to work in that factory, and you are going to have to produce the food to feed those workers, and you are going to have to transport that food, and you are going to have to divert some gasoline supplies to the transportation of that food, and that is going to change the worldwide price of gasoline by an average of 0.005 cents, which will produce a 0.000006% chance of a revolution in France....” and so on.
So you tell the computer, “Okay, just come up with a plan for curing X disease, and change as little as possible, and if you do need to change other things, try not to change things that are IMPORTANT, that WE HUMANS CARE ABOUT.”
And we are right back to the problem of having to be sure that we have successfully encoded all of human morality and values into this Oracle AI.
Concepts are generally easier to encode for narrow intelligences. For an AI that designs drugs, “the heart stopped/didn’t stop” is probably sufficient, as it’s very tricky to get cunning with that definition within the limitations of drug-design.
Do you mean increasing human mortality?
Reducing, through reducing the population beforehand.
I am sorely tempted to adjust its goal. Since this is a narrow AI, it shouldn’t be smart enough to be friendly; we can’t encode the real utility function into it, even if we knew what it is. I wonder if that means it can’t be made safe, or just that we need to be careful?
Best I can tell, the lesson is to be very careful with how you code the objective.
No, reducing.