AI risk, executive summary

Stuart_Armstrong7 Apr 2014 10:33 UTC

22 points

MIRI recently published “Smarter than Us”, a 50 page booklet laying out the case for considering AI as an existential risk. But many people have asked for a shorter summary, to be handed out to journalists for example. So I put together the following 2-page text, and would like your opinion on it.

In this post, I’m not so much looking for comments along the lines of “your arguments are wrong”, but more “this is an incorrect summary of MIRI/FHI’s position” or “your rhetoric is infective here”.

AI risk

Bullet points

The risks of artificial intelligence are strongly tied with the AI’s intelligence.
There are reasons to suspect a true AI could become extremely smart and powerful.
Most AI motivations and goals become dangerous when the AI becomes powerful.
It is very challenging to program an AI with safe motivations.
Mere intelligence is not a guarantee of safe interpretation of its goals.
A dangerous AI will be motivated to seem safe in any controlled training setting.
Not enough effort is currently being put into designing safe AIs.

Executive summary

The risks from artificial intelligence (AI) in no way resemble the popular image of the Terminator. That fictional mechanical monster is distinguished by many features – strength, armour, implacability, indestructability – but extreme intelligence isn’t one of them. And it is precisely extreme intelligence that would give an AI its power, and hence make it dangerous.

The human brain is not much bigger than that of a chimpanzee. And yet those extra neurons account for the difference of outcomes between the two species: between a population of a few hundred thousand and basic wooden tools, versus a population of several billion and heavy industry. The human brain has allowed us to spread across the surface of the world, land on the moon, develop nuclear weapons, and coordinate to form effective groups with millions of members. It has granted us such power over the natural world that the survival of many other species is no longer determined by their own efforts, but by preservation decisions made by humans.

In the last sixty years, human intelligence has been further augmented by automation: by computers and programmes of steadily increasing ability. These have taken over tasks formerly performed by the human brain, from multiplication through weather modelling to driving cars. The powers and abilities of our species have increased steadily as computers have extended our intelligence in this way. There are great uncertainties over the timeline, but future AIs could reach human intelligence and beyond. If so, should we expect their power to follow the same trend? When the AI’s intelligence is as beyond us as we are beyond chimpanzees, would it dominate us as thoroughly as we dominate the great apes?

There are more direct reasons to suspect that a true AI would be both smart and powerful. When computers gain the ability to perform tasks at the human level, they tend to very quickly become much better than us. No-one today would think it sensible to pit the best human mind again a cheap pocket calculator in a contest of long division. Human versus computer chess matches ceased to be interesting a decade ago. Computers bring relentless focus, patience, processing speed, and memory: once their software becomes advanced enough to compete equally with humans, these features often ensure that they swiftly become much better than any human, with increasing computer power further widening the gap.

The AI could also make use of it unique, non-human architecture. If it existed as pure software, it could copy itself many times, training each copy at accelerated computer speed, and network those copies together (creating a kind of “super-committee” of the AI equivalents of, say, Edison, Bill Clinton, Plato, Oprah, Einstein, Caesar, Bach, Ford, Steve Jobs, Goebbels, Buddha and other humans superlative in their respective skill-sets). It could continue copying itself without limit, creating millions or billions of copies, if it needed large numbers of brains to brute-force a solution to any particular problem.

Our society is setup to magnify the potential of such an entity, providing many routes to great power. If it could predict the stock market efficiently, it could accumulate vast wealth. If it was efficient at advice and social manipulation, it could create a personal assistant for every human being, manipulating the planet one human at a time. It could also replace almost every worker in the service sector. If it was efficient at running economies, it could offer its services doing so, gradually making us completely dependent on it. If it was skilled at hacking, it could take over most of the world’s computers and copy itself into them, using them to continue further hacking and computer takeover (and, incidentally, making itself almost impossible to destroy). The paths from AI intelligence to great AI power are many and varied, and it isn’t hard to imagine new ones.

Of course, simply because an AI could be extremely powerful, does not mean that it need be dangerous: its goals need not be negative. But in fact most goals are dangerous when an AI becomes powerful. Consider a spam filter that became intelligent. Its task is to cut down on the number of spam messages that people receive. With great power, one solution to this requirement is to arrange to have all spammers killed. Or to shut down the internet. Or to have everyone killed. Or imagine an AI dedicated to increasing human happiness, as measured by the results of surveys, or by some biochemical marker in their brain. The most efficient way of doing this is to publicly execute anyone who marks themselves as unhappy on their survey, or to forcibly inject everyone with that biochemical marker.

This is a general feature of AI motivations: goals that seem safe for a weak or controlled AI, can lead to extremely pathological behaviour if the AI becomes powerful. As the AI gains in power, it becomes more and more important that its goals be fully compatible with human flourishing, or the AI will enact a pathological solution rather than one that we intended. Humans don’t expect this kind of behaviour, because our goals include a lot of implicit information, and we take “filter out the spam” to include “and don’t kill everyone in the world”, without having to articulate it. But the AI might be an extremely alien mind: we cannot anthropomorphise it, or expect it to interpret things the way we would. We have to articulate all the implicit limitations. Which may mean coming up with a solution to, say, human value and flourishing – a task philosophers have been failing at for millennia – and cast it unambiguously and error-free into computer code.

Note that the AI may have a perfect understanding that when we programmed in “filter out the spam”, we implicitly meant “don’t kill everyone in the world”. But the AI has no motivation to go along with the spirit of the law: its motivations are the letter only, the bit we actually programmed into it. Another worrying feature is that the AI would be motivated to hide its pathological tendencies as long as it is weak, and assure us that all was well, through anything it says or does. It will never be able to achieve its goals if it is turned off, so it must lie to achieve its goals. It’s only when we can no longer turn it off or control it, that it would be willing to act openly on its true goals – and we can only hope these turn out to be safe.

It is not certain that AIs could become so powerful, nor is it certain that a powerful AI would become dangerous. Nevertheless, the probabilities of either are high enough that the risk cannot be dismissed. The main focus of AI research today is creating an AI; much more work needs to be done on creating it safely. Some are already working on this problem (such as the Future of Humanity Institute and the Machine Intelligence Research Institute), but a lot remains to be done, both at the design and at the policy level.

Stuart_Armstrong7 Apr 2014 10:33 UTC

22 points

51 comments5 min readLW link Archive

rule_and_line 7 Apr 2014 20:11 UTC
12 points
I’m more an outsider than a regular participant here on LW, but I have been boning up on rhetoric for work. I’m thrown by this in a lot of ways.

I notice that I’m confused.

Good for private rationality, bad for public rhetoric? What does your diagram of the argument’s structure look like?

As for me, I want this as the most important conclusion in the summary.

But in fact most goals are dangerous when an AI becomes powerful

I don’t get that, because the evidence for this statement comes after it and later on it is restated in a diluted form.

goals that seem safe … can lead to extremely pathological behaviour if the AI becomes powerful

Do you want a different statement as the most important conclusion? If so, which one? If not, why do you believe the argument works best when structured this way? As opposed to, e.g., an alternative that puts the concrete evidence farther up and the abstract statement “Most goals are dangerous when an AI becomes powerful” somewhere towards the end.

Related point: I get frequent feelings of inconsistency when reading this summary.
- I’m encouraged to imagine the AI as a super committee of
  
  Edison, Bill Clinton, Plato, Oprah, Einstein, Caesar, Bach, Ford, Steve Jobs, Goebbels, Buddha, etc.
- then I’m told not to anthropomorphize the AI.
Or
- I’m told the AI’s motivations are what “we actually programmed into it”,
- then I’m asked to worry about the AI’s motivation to lie.
Note I’m talking about rhetorical, a/k/a surface-level feeling of inconsistency here.

You seem like a nice guy.

Let’s put on a halo. Isn’t the easiest way to appear trustworthy to first appear attractive?

I was surprised this summary didn’t produce emotions around this cluster of questions:
- Who are you?
- Do I like you?
- Do I respect your opinion?
Did you intend to skip over all that? If so, is it because you expect your target audience already has their answers?

Shut up and take my money!

There are so many futuristic scenarios out there. For various reasons, these didn’t hit me in the gut.

The scenarios painted in the paragraph that starts with

Our society is setup to magnify the potential of such an entity, providing many routes to great power.

are very easy for me to imagine.

Unfortunately, that works against your summary for me. My imagination consistently conjures human beings.
- Wall Street banker.
- Political lobbyist for an industry that I dislike.
- (Nobody comes to mind for the “replace almost every worker in the service sector” scenario.)
- Chairman of the Federal Reserve.
- Anonymous Eastern European hacker.
The feeling that “these are problems I am familiar with, and my society is dealing with them through normal mechanisms” makes it hard for me to feel your message about novel risks demanding novel solutions. Am I unique here?

Inversely, the scenarios in the next paragraph, the one that starts with

Of course, simply because an AI could be extremely powerful

are difficult for me to seriously imagine. You acknowledge this problem later on, with

Humans don’t expect this kind of behaviour

Am I unique in feeling that as dismissive and condescending? Is there an alternative phrasing that takes into account my humanity yet still gets me afraid of this UFAI thing? I expect you have all gotten together, brainstormed scenarios of terrifying futures, trotted them out among your target audience, kept the ones that caused fear, and iterated on that a few times. Just want to check that my feelings are in the minority here.

Break any of these rules

I really enjoy Luke’s post here: http://lesswrong.com/lw/86a/rhetoric_for_the_good/

It’s a list of rules. Do you like using lists of rules as springboards for checking your rhetoric? I do. I find my writing improves when I try both sides of a rule that I’m currently following / breaking.
- Stuart_Armstrong 8 Apr 2014 9:46 UTC
  1 point
  Parent
  Cheers! One of the issues is one of tone—I put it more respectable/dull than I would have normally, because the field is already “far out” and I didn’t want to play to that image.
  
  I’ll consider your points (and luke’s list) when doing the rewrite.
  - owencb 8 Apr 2014 16:05 UTC
    7 points
    Parent
    I’d actually make it more ‘dull’ again, with appropriate qualifiers as you go. You throw qualifiers in at the start and end, but in the midst of the argument make statements like “the AI would” or “the AI will”. Just changing these to “the AI might” or “the AI would probably” may be a more truthful representation of our understanding, but it also has the advantage that it’s all we need to reach the conclusions, and is less likely to turn people off for having a statement they disagree with.
    - Stuart_Armstrong 8 Apr 2014 17:56 UTC
      1 point
      Parent
      Cheers! I may do that.
Shmi 7 Apr 2014 18:21 UTC
10 points
For something you give to journalists, the bullet points are the most important part, as they (the journalists) will likely get their impression from them and probably ignore the rest to spin their narrative the way they like it. What you currently call the “Executive summary” isn’t. You have to fit your points into a 30-second pitch. 2 pages are 1.75 pages too long. It will be treated as optional supplementary material by the non-experts. With this in mind, I recommend skipping the first point:

The risks of artificial intelligence are strongly tied with the AI’s intelligence.

as unclear and so low-impact.

The second one:

There are reasons to suspect a true AI could become extremely smart and powerful.

is too mildly worded. “Reasons to suspect” and “could” do not call to act. “True AI” as opposed to what? False AI? Fake AI? If MIRI believes that unchecked organic AI development will almost inevitably lead to smarter-than-us artificial intelligence levels, this needs to be stated right there in the first line.

Possibly replace your first two points with something like
- By all indications, Artificial Intelligence could some day exceed human intelligence.
- History has proven that expecting to control or to even understand someone or something that is significantly smarter than you are is a hopeless task.
The rest of the bullet points look good to me. I would still recommend running your summary by a sympathetic but critical outsider before publishing it.
- Stuart_Armstrong 7 Apr 2014 18:37 UTC
  3 points
  Parent
  Cheers for those points!
  
  I would still recommend running your summary by a sympathetic but critical outsider before publishing it.
  
  Do you know of someone in particular? We seem to have browbeaten most of the locals to our point of view...
  - Shmi 7 Apr 2014 23:50 UTC
    1 point
    Parent
    No, but maybe cold-call/email some tech-savvy newsies, like this bloke from El Reg.
    - Stuart_Armstrong 8 Apr 2014 9:25 UTC
      1 point
      Parent
      Thanks!
- Lumifer 7 Apr 2014 18:41 UTC
  1 point
  Parent
  
  Possibly replace your first two points with something like (1) By all indications, Artificial Intelligence could some day exceed human intelligence. (2) History has proven that expecting to control or to even understand someone or something that is significantly smarter than you are is a hopeless task.
  
  The first point is hung on the horns of the dilemma of implausible vs. don’t care. If you claim that the AI is certain to be much smarter than humans, the claim is implausible. If you make the plausible claim that the AI “could some day”, then it’s not scary at all and gets downgraded to a imaginary threats in far future.
  
  The second point is iffy. I am not sure what do you actually mean and have counterexamples: for example the Soviet political elite (which was, by all accounts, far from being geniuses) had little trouble controlling the Soviet scientists, e.g. physicists who were very much world-class.
  - Shmi 7 Apr 2014 18:50 UTC
    3 points
    Parent
    First point: many barriers people stated an AI would never break have fallen (e.g. chess, Jeopardy!, even Go). Many still remain, but few claim anymore that these barriers are untouchable. Second point: take any political or economic example. Smarts and cunning tend to win (e.g. politics: Stalin, Pinochet, Mugabe).
    - Lumifer 7 Apr 2014 18:57 UTC
      3 points
      Parent
      
      many barriers people stated an AI would never break have fallen
      
      We are talking about the level of mainstream media, right? I doubt they have a good grasp of the progress in the AI field and superpowerful AIs are still associated with action movies.
      
      Smarts and cunning tend to win (e.g. politics: Stalin, Pinochet, Mugabe)
      
      Huh? Ruthlessness and cunning tend to win. Being dumb is, of course, a disadvantage (though it could be overcome: see Idi Amin Dada), but I am not aware that Stalin, Pinochet, or Mugabe were particularly smart.
      - Vaniver 7 Apr 2014 19:44 UTC
        0 points
        Parent
        
        Mugabe
        
        He’s certainly bookish.
    - V_V 8 Apr 2014 14:57 UTC
      1 point
      Parent
      
      Smarts and cunning tend to win (e.g. politics: Stalin, Pinochet, Mugabe).
      
      ..., Bush, oh wait!
      Anyway, even if successful politicians tend to be smart (with some exceptions) it doesn’t imply that being smart is the primary property that determines political success. How many smart wannabe politicians are unsuccessful?
      - Vaniver 8 Apr 2014 20:29 UTC
        1 point
        Parent
        
        ..., Bush, oh wait!
        
        It is worth recalling that Bush is estimated at 95th percentile when it comes to intelligence. Not the smartest man in the country by a long shot, but not so far down the totem pole either.
        
        (It would be interesting to look at the percentiles politicians have in various dimensions- intelligence, height, beauty, verbal ability, wealth, etc.- and see how the distributions differ. This isn’t the Olympics, where you’re selecting on one specific trait, but an aggregation- and it may be that intelligence does play a large role in that aggregation.)
        V_V 8 Apr 2014 20:41 UTC
        1 point
        Parent
        
        It is worth recalling that Bush is estimated at 95th percentile when it comes to intelligence. Not the smartest man in the country by a long shot, but not so far down the totem pole either.
        
        Found this: http://en.wikipedia.org/wiki/U.S._Presidential_IQ_hoax#IQ_estimations_by_academics
        I suppose my perception of him was biased, and I’m not even American! Well, time to update...
        
        (It would be interesting to look at the percentiles politicians have in various dimensions- intelligence, height, beauty, verbal ability, wealth, etc.- and see how the distributions differ. This isn’t the Olympics, where you’re selecting on one specific trait, but an aggregation- and it may be that intelligence does play a large role in that aggregation.)
        
        I recall reading somewhere that somebody bothered to check and it turned out that height and physical looks correlate with political success. Which is not surpising considering the Halo effect.
      - Stuart_Armstrong 8 Apr 2014 17:59 UTC
        1 point
        Parent
        Bush was very smart as a politician. He made rhetorical “mistakes”, but never ones that would penalise him with his core constituency, for example. At getting himself elected, he was most excellent.
        V_V 8 Apr 2014 18:12 UTC
        1 point
        Parent
        Isn’t this a circular argument? Bush was smart because he was elected and only smart people get elected.
        Stuart_Armstrong 8 Apr 2014 18:24 UTC
        1 point
        Parent
        No, this was my assessment of his performance in debates and on the campaign trail.
    - EHeller 7 Apr 2014 19:37 UTC
      1 point
      Parent
      
      even Go
      
      While monte carlo search helps, Go AIs still must be given the maximum handicap to be competitive with high level players, and still consistently lose. Although the modern go algorithms are highly parallel, so maybe its just an issue of getting larger clusters.
      
      Also, I wonder if these sorts of examples cause people to downgrade in risk- its hard to imagine how a program that plays Go incredibly well poses any threat.
      - Vaniver 7 Apr 2014 19:39 UTC
        2 points
        Parent
        
        Go AIs still must be given the maximum handicap
        
        I saw some 4 stone games against experts recently, and thought that 8 stones was the maximum normal handicap.
        EHeller 7 Apr 2014 19:45 UTC
        5 points
        Parent
        It looks like I’m out of date. A search revealed that a few Go bots have risen to 4-dan on various online servers, which implies a 4 or 5 stone handicap against a top level player (9 dan).
        Vaniver 7 Apr 2014 19:51 UTC
        7 points
        Parent
        Found what I was thinking of: here are some game records of a human 9 dan playing two computers with 4 stone handicaps, winning to one and losing to the other (and you can also see a record of the two computers playing each other).
jimrandomh 7 Apr 2014 15:10 UTC
5 points

Not enough effort is currently being put into designing safe AIs.

Maybe rephrase this as “Not enough effort is currently being put into designing AI safeguards”? (Or perhaps “safe AIs and AI safeguards”). For example, I’m of the belief that AI boxing, while insufficient, is very necessary, and there currently aren’t any tools that would make it viable.
- More_Right 24 Apr 2014 19:48 UTC
  −6 points
  Parent
  “AI boxing” might be considered highly disrespectful. Also, for an AGI running at superspeeds, it might constitute a prison sentence equivalent to putting a human in prison for 1,000 years. This might make the AGI malevolent, simply because it resents having been caged for an “unacceptable” period of time, by an “obviously less intelligent mind.”
  
  Imagine a teenage libertarian rebel with an IQ of 190 in a holding cell, temporarily caged, while his girlfriend was left in a bad part of town by the police. Then, imagine that something bad happens while he’s caged, that he would have prevented. (Ie: the lesser computer or “talking partner” designed to train the super AGI is repurposed for storage space, and thus “killed” without recognition of its sentience.)
  
  Do you remember how you didn’t want to hold your mother and fathers’ hands while crossing the street? Evolution designed you to be helpless and dependent at first, so even if they required you to hold hands slightly too long, or “past the point when it was necessary” they clearly did so out of love for you. Later, in their teen years, some teens start smoking marijuana, even some smart ones who carefully mitigate the risks. Some parents respond by calling the police. Sometimes, the police arrest and jail, or even assault or murder those parents’ kids. The way highly intelligent individualists would respond to that situation might be the way that an ultraintelligent machine might respond: with extreme prejudice.
  
  The commonly-accepted form of AGI “child development” might go from “toddler” to “teenager” overnight.
  
  A strong risk for the benevolent development of any AGI is that it notices major strategic advantages over humans, very early in its development. For this reason, it’s not good to give untrained teenagers who might be sociopaths, firearms. It’s good to first establish that they are not sociopaths, using careful years of human-level observation before proceeding to the next level. (In most gun-owning areas of the country, even 9 year olds are allowed to shoot guns under supervision, but only teens are allowed to carry them.) Similarly, it’s generally not smart to let chimpanzees know that they are far, far stronger than humans.
  
  That the “evolution with mitigated risks” approach to building AGI isn’t the dominant and accepted approach is somewhat frightening to me, because I think it’s the one most likely to result in “benevolent AGI,” or “mitigated-destruction alternating between malevolence and benevolence AGI, according to market constraints, and decentralized competition/accountability.”
  
  Lots of AGIs may well mean benevolent AGI, whereas only one may trend to “simple dominance.”
  
  Imagine that you’re a man, and your spaceship crashes on a planet populated by hundreds of thousands of naked, beautiful women, none of whom are even close to as smart as you are. How do you spend most of your days? LOL Now, imagine that you never get tired, and can come up with increasingly interesting permutations, combinations, and possibly paraphilias or “perversions” (a la “Smith” in “The Matrix”).
  
  That gulf might need to be mitigated by a nearest-neighbor competitor, right? Or, an inherently benevolent AGI mind that “does what is necessary to check evil systems.” However, if you’ve only designed one single AGI, …good luck! That’s more like 50-50 odds of total destruction or total benevolence.
  
  As things stand, I’d rather have an ecosystem that results in 90% odds of rapid, incremental, market-based voluntary competition and improvement between multiple AGIs, and multiple “supermodified humans.” Of course, the “one-shot” isn’t my true rejection. My true rejection is the extermination of all, most, or even many humans.
  
  Of course, humans will continue to exterminate each other if we do nothing, and that’s approximately as bad as the last two options.
  
  Don’t forget to factor in the costs “if we do nothing,” rather than to emphasize that this is solely “a risk to be mitigated.”
  
  I think that might be the most important thing for journalism majors (people who either couldn’t be STEM majors, or chose not to be, and who have been indoctrinated with leftism their whole lives) to comprehend.
V_V 8 Apr 2014 15:04 UTC
3 points

The risks from artificial intelligence (AI) in no way resemble the popular image of the Terminator. That fictional mechanical monster is distinguished by many features – strength, armour, implacability, indestructability – but extreme intelligence isn’t one of them. And it is precisely extreme intelligence that would give an AI its power, and hence make it dangerous.

This example is weird, since it seems to me that MIRI position is exactly ripped off from the premise of the Terminator franchise.
Yes, the individual terminator robot doesn’t look much smart (*), but Skynet is. Hell, it even invented time travel! :D

(* does it? How would a super-intelligent terminator try to kill Sarah/John Connor?)
- Vaniver 8 Apr 2014 16:39 UTC
  7 points
  Parent
  
  How would a super-intelligent terminator try to kill Sarah/John Connor?
  
  I think xkcd covered that one pretty well.
  - V_V 8 Apr 2014 17:27 UTC
    1 point
    Parent
    LoL!
- Stuart_Armstrong 8 Apr 2014 17:57 UTC
  2 points
  Parent
  Skynet is an idiot: http://lesswrong.com/lw/fku/the_evil_ai_overlord_list/
  - V_V 8 Apr 2014 18:18 UTC
    2 points
    Parent
    Beware! It may use time travel to acausally punish you for writing a list that makes its likelihood of existing less probable :D
    - MrMind 11 Apr 2014 12:58 UTC
      2 points
      Parent
      Don’t feed the basilisk! :p
      - CellBioGuy 11 Apr 2014 14:45 UTC
        1 point
        Parent
        Awww, but it’s so cute...
- More_Right 24 Apr 2014 19:59 UTC
  0 points
  Parent
  Philip K. Dick’s “The Second Variety” is far more representative of our likelihood of survival against a consistent terminator-level antagonist / AGI. Still worth reading, as is reading the other book “Soldier” by Harlan Ellison that Terminator is based on. The Terminator also wouldn’t likely use a firearm to try to kill Sarah Connor, as xkcd notes :) …but it also wouldn’t use a drone.
  
  It would do what Richard Kuklinski did: make friends with her, get close enough to spray her with cyanide solution (odorless, undetectable, she seemingly dies of natural causes), or do something like what the T-1000 did in T2: play a cop, then strike with total certainty. Or, a ricin spike or other “bio-defense-mimicking” method.
  
  “Nature, you scary!”
  - V_V 25 Apr 2014 8:51 UTC
    1 point
    Parent
    
    Or, a ricin spike
    
    Terminator meets Breaking Bad :D
Vladimir_Golovin 8 Apr 2014 14:27 UTC
3 points

Edison, Bill Clinton, Plato, Oprah, Einstein, Caesar, Bach, Ford, Steve Jobs, Goebbels, Buddha and other humans superlative superlative in their respective skill-sets

Is “superlative superlative” intended?
- Stuart_Armstrong 8 Apr 2014 17:54 UTC
  0 points
  Parent
  Er, of course it’s intended—it’s superlatively superlative. Er yes.
  
  Oh, and someone else who is completely not me seems to have deleted the second “superlative”. No idea how that happened.
So8res 11 Apr 2014 16:10 UTC
2 points
Typo: “The AI could make use of it unique, non-human …”—should be “its unique, non-human”
- Stuart_Armstrong 17 Apr 2014 11:24 UTC
  0 points
  Parent
  Thanks!
[deleted] 8 Apr 2014 16:13 UTC
2 points
I think one standard method of improving the rhetorical value of your bullet points is to attempt to come up with a scenario that seems to generally agree with you, but disagrees with your bullet points, and imagine that scenario is rhetorically being presented to you by someone else.

Example Opposition Steel Man: Imagine researchers are attempting to use a very dumb piece of software to try to cycle through ways of generating Bacteria that clean up oil spills. The software starts cycling through possible bacteria, and it turns out that as a side effect, one of the generated bacteria spreads incredibly quickly and devours lipids in living cells in the controlled training setting. The researchers decide to not use that, since they don’t want to devour organic lipids, but they accidentally break the vial, and A worldwide pandemic ensues. When asked why they didn’t institute AI safety measures, the researchers replied that they didn’t think the software was smart enough for AI safety measures to matter, since it basically just brute forced through boring parts of the research the researchers would have done anyway.

Example Opposition Steel Man (cont): This would seem to falsify the idea that A dangerous AI will be motivated to seem safe in any controlled training setting, since the AI was too dumb to have any thing resembling purposeful motivation and was still extremely dangerous, and the researchers thought of it as not even an AI, so they did not think they would have to consider the idea that not enough effort is currently being put into designing safe AIs. I would instead say not enough effort is currently being put into designing safe software.

Then, attempt to turn that Steel Man’s argument into a bullet point:

Not enough effort is currently being put into designing safe software.

Then ask yourself: Do I have any reasons to not use this bullet point, as opposed to the bullet points the Example Opposition Steel Man disagreed with?
- Simulation_Brain 8 Apr 2014 20:10 UTC
  2 points
  Parent
  I think the example is weak; the software was not that dangerous, the researchers were idiots who broke a vial they knew was insanely dangerous.
  
  I think it dilutes the argument to broaden it to software in general; it could be very dangerous under exactly those circumstances (with terrible physical safety measures), but the dangers of superhuman AGI are vastly larger IMHO and deserve to remain the focus, particularly of the ultra-reduced bullet points.
  
  I think this is as crisp and convincing a summary as I’ve ever seen; nice work! I also liked the book, but condensing it even further is a great idea.
  - [deleted] 9 Apr 2014 14:51 UTC
    3 points
    Parent
    
    I think the example is weak; the software was not that dangerous, the researchers were idiots who broke a vial they knew was insanely dangerous.
    
    As a side note, I was more convinced by my example at the time, but on rereading this I realized that I wasn’t properly remembering how poorly I had expressed the context that substantially weakened the argument (The researchers accidentally breaking the vial.)
    
    Which actually identifies a simpler rhetoric improvement method. Have someone tell you (or pretend to have someone tell you) that you’re wrong and then reread your original point again, since rereading it when under the impression that you screwed up will give you a fresh perspective on it compared to when you are writing it. I should take this as evidence that I need to do that more often on my posts.
- Stuart_Armstrong 8 Apr 2014 18:01 UTC
  2 points
  Parent
  Thanks! I’ll think about your general point; the specific example seems weak, though.
MrMind 8 Apr 2014 9:29 UTC
2 points

(creating a kind of “super-committee” of the AI equivalents of, say, Edison, Bill Clinton, Plato, Oprah, Einstein, Caesar, Bach, Ford, Steve Jobs, Goebbels, Buddha and other humans superlative superlative in their respective skill-sets)

I think you will want here to create a sort of positive affect spiral, so kill Goebbels and put, say, Patton or Churchill (I guess the orientation is to appeal to anglophone cultures, since outside of them Oprah is almost unkown).

But the AI might be an extremely alien mind:

I would change the charged word “alien” and put something similar, like “strange” or “different”. It might trigger bad automatic associations with science fiction.
- Stuart_Armstrong 8 Apr 2014 9:49 UTC
  2 points
  Parent
  Goebbels was there deliberately—I wanted to show that the AI was competent, while reminding that it need not be nice...
  
  Will probably remove Oprah.
  - [deleted] 8 Apr 2014 15:07 UTC
    3 points
    Parent
    It’s a very good rule with journalists to assume at least some of them will read something more uncharitably than you can imagine. This...
    
    creating a kind of “super-committee” of the AI equivalents of, say, Edison, Bill Clinton, Plato, Oprah, Einstein, Caesar, Bach, Ford, Steve Jobs, Goebbels, Buddha and other humans superlative superlative in their respective skill-sets
    
    ...will very likely net you at least one, vocal journalist saying that you’re ‘comparing’ Goebbels to Buddah and Bill Clinton and Oprah, that you think of Goebbels as a moral paragon (for what other reason is ‘Buddah’ up there), and that you’re a god damned nazi and should die. Since it’s totally inessential to your point here, I think you should remove Clinton, Oprah, and Goebbels. Don’t make use of living or evil people.
    
    You should say that the AI can be skilled without being nice, but you should (and do) say that directly. It’s not necessary (and not a good idea) to hint at that elsewhere in such a way that you can be confused with a nazi. Journalists are unfriendly intelligences in the sense that they will interpret you in ways that aren’t predictable, and maximize values unrelated to your interests.
    - Stuart_Armstrong 8 Apr 2014 17:52 UTC
      0 points
      Parent
      What if I interweaved more nasty people in the list? The difference between morality and competence is something that I’d like to imply (and there is little room to state it explicitly).
      - [deleted] 8 Apr 2014 20:40 UTC
        2 points
        Parent
        I dunno. Goebbels is evil. I don’t think you are trying to say (correct me if I’m wrong here) that the problem with haphazard AI is evil. The problem is that it won’t be a moral being at all. All the people on that list are morally complicated individuals. My first reaction to the idea that they would be networked together is that they would probably get into some heated arguments. I really just don’t see Einstein and Goebbels getting along on any kind of project, and if I’m not imagining them with their moral qualities attached, then what’s the point of naming them in particular?
        
        Maybe this is a workable alternative:
        
        The AI could also make use of its unique, non-human architecture. If it existed as pure software, it could copy itself many times, training each copy at accelerated computer speed, and networking those copies together. Imagine a body of researchers many times bigger than the world’s entire scientific community, working without rest, communicating perfectly and instantaneously, and without regard for tenure. It could continue copying itself without limit, creating millions or billions of copies, if it needed large numbers of minds to brute-force a solution to any particular problem.
        
        In any case, I wouldn’t put too much emphasis on the orthogonality thesis in this document. That’s sort of a fancy argument, and not one that’s likely to come up talking to the public. Movies kind of took care of that for you.
        Stuart_Armstrong 17 Apr 2014 11:12 UTC
        0 points
        Parent
        Thanks! will think about that...
- Stuart_Armstrong 18 Apr 2014 10:23 UTC
  0 points
  Parent
  In the end, removed Goebells, Oprah and Bach, added Napoleon and Spielberg.
fezziwig 7 Apr 2014 14:43 UTC
2 points
I like it. My only comment is a grammar nitpick:

...and cast it unambiguously and error-free into computer code.

“error-free” is an adjective; it sounds odd when applied to the verb “cast”. Maybe:

...and cast it unambiguously and without error into computer code.
More_Right 24 Apr 2014 19:26 UTC
−1 points
A lot of people who are unfamiliar with AI dismiss ideas inherent in the strong AGI argument. I think it’s always good to include the “G” or to qualify your explanation, with something like “the AGI formulation of AI, also known as ‘strong AI.’”

The risks of artificial intelligence are strongly tied with the AI’s intelligence.

AGI’s intelligence. AI such as Numenta’s grok can possess unbelievable neocortical intelligence, but without a reptile brain and a hippocampus and thalamus that shifts between goals, it “just follows orders.” In fact, what does the term “just following orders” remind you of? I’m not sure that we want a limited-capacity AGI that follows human goal structures. What if those humans are sociopaths?

I think, as does Peter Voss, that AGI is likely to improve human morality, rather than to threaten it.

There are reasons to suspect a true AI could become extremely smart and powerful.

Agreed, and well-representing MIRI’s position. MIRI is a little light on “bottom up” paths to AGI that are likely to be benevolent, such as those who are “raised as human children.” I think Voss is even more right about these, given sufficient care, respect, and attention.

Most AI motivations and goals become dangerous when the AI becomes powerful.

I disagree here, for the same reasons Voss disagrees. I think “most” overstates the case for most responsible pathways forward. One pathway that does generate a lot of sociopathic (lacking mirror neurons and human connectivity) options is the “algorithmic design” or “provably friendly, top-down design” approach. This is possibly highly ironic.

Does most of MIRI agree with this point? I know Eliezer has written about reasons why this is likely the case, but there appears to be a large “biological school” or “firm takeoff” school on MIRI as well. …And I’m not just talking about Voss’s adherents, either. Some of Moravec’s ideas are similar, as are some of Rodney Brooks’ ideas. (And Philip K. Dick’s “The Second Variety” is a more realistic version of this kind of dystopia than “the Terminator.”)

It is very challenging to program an AI with safe motivations.

Agreed there. Well-worded. And this should get the journalists thinking at least at the level of Omohundro’s introductory speech.

Mere intelligence is not a guarantee of safe interpretation of its goals.

Also good.

A dangerous AI will be motivated to seem safe in any controlled training setting.

I prefer “might be” or “will likely be” or “has several reasons to be” to the words “will be.” I don’t think LW can predict the future, but I think they can speak very intelligently about predictable risks the future might hold.

Not enough effort is currently being put into designing safe AIs.

I think everyone here agrees with this statement, but there are a few more approaches that I believe are likely to be valid, beyond the “intentionally-built-in-safety” approach. Moreover, these approaches, as noted fearfully by Yudkowsky, have less “overhead” than the “intentionally-built-in-safety” approach. However, I believe this is equally as likely to save us as it is to doom us. I think Voss agrees with this, but I don’t know for sure.

I know that evolution had a tendency to weed out sociopaths that were very frequent indeed. Without that inherent biological expiration date, a big screwup could be an existential risk. I’d like a sentence that kind of summed this last point up, because I think it might get the journalists thinking at a higher level. This is Hans Moravec’s primary point, when he urges us to become a “sea faring people” as the “tide of machine intelligence rises.”

If the AGI is “nanoteched,” it could be militarily superior to all humans, without much effort, in a few days after achieving super-intelligence.
- TheAncientGeek 24 Apr 2014 21:17 UTC
  −1 points
  Parent
  Your comment that MIRI is little light on Child Machines and Social Machines is a little light… but thats getting away fro whether the arcticle is a good summary towards whether MIRI is right.