Students asked to defend AGI danger update in favor of AGI riskiness
From Geoff Anders of Leverage Research:
In the Spring semester of 2011, I decided to see how effectively I could communicate the idea of a threat from AGI to my undergraduate classes. I spent three sessions on this for each of my two classes. My goal was to convince my students that all of us are going to be killed by an artificial intelligence. My strategy was to induce the students to come up with the ideas themselves. I gave out a survey before and after. An analysis of the survey responses indicates that the students underwent a statistically significant shift in their reported attitudes. After the three sessions, students reported believing that AGI would have a larger impact[1] and also a worse impact[2] than they originally reported believing.
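For readers wondering what a "statistically significant shift" in before/after survey responses looks like mechanically, here is a minimal sketch of a paired comparison. The ratings below are invented for illustration; the study's actual items, scale, and analysis are in the linked PDF.

```python
# Sketch of a paired before/after test on invented survey ratings
# (e.g., "expected impact of AGI" on a 1-10 scale). Not real data.
from statistics import mean, stdev
from math import sqrt

before = [4, 5, 3, 6, 5, 4, 7, 5]   # invented pre-session ratings
after  = [6, 7, 5, 7, 6, 6, 8, 7]   # invented post-session ratings

# Each student is their own control: test whether the per-student
# change differs from zero.
diffs = [a - b for a, b in zip(after, before)]
t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))  # paired t statistic

print("mean change:", mean(diffs))
print("t statistic:", round(t, 2))
```

A large t on a one-sample test of the differences is what "statistically significant shift" cashes out to here; with real data one would also report the p-value and degrees of freedom.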
Not a surprising result, perhaps, but the details of how Geoff taught AGI danger and the reactions of his students are quite interesting.
This is less than surprising. I can’t think of any threat already existing in the minds of some undergraduates whose perceived seriousness a competent, believing professor requiring attendance couldn’t, on average, increase. Control groups are needed.
What would you want to do with the control groups? Teach them that AGI won’t destroy the world? Not teach them anything in particular about AI? Teach them that invading aliens will destroy the world, or that the biblical End Times are near? Any of these would yield useful information. Which one(s) do you favor?
I was specifically thinking of these exact two conditions, which is why I said “groups”, for they are different in kind. The aliens example is even better than the supernatural end times one.
I thought of but rejected “Teach them that AGI won’t destroy the world?” when I couldn’t think of how to implement that neutrally. How would one do that?
True. Most arguments against the AGI-apocalypse scenario are responses to arguments for it; it would be difficult to present only one side of the question.
This seems like an incredibly biased presentation, with the author never realizing the depth of the bias. Then again, he writes “My goal was to convince my students that all of us are going to be killed by an artificial intelligence,” not to probe the validity of the point, so his bottom line was already written.
He says “I presented a neutral summary” after judiciously guiding the students through one-sided claims and refutations about AGI (it can never play chess -- it plays chess), not any of the claims that have not (yet) been refuted, then spicing it up with the Terminator quotes.
He says “At all points in the discussion I did my best to appear neutral and to not reveal my views.” right after scaring them with a bomb in a trashcan.
He assigned no homework and gave no time outside the class for the students to come up with counter-arguments.
He writes:
The idea that an argument can sometimes be tested experimentally seems utterly foreign to him (even when it is in his favor, like the “AI can never be better at chess” one). Must be something about the philosophers in general, I suppose.
He primed his students in advance:
He did not attempt to provide a balanced context by inviting (or at least quoting) an expert in the area who does not share his views.
So his conclusion, that it is possible to convince a person who never thought about a topic before of the dangers of an AGI, was a foregone one. He could probably have convinced them that AGI is the second coming of Christ, if he bothered (it is a Catholic college, so the leap is not that large).
Sort of. Assuming he was basically convinced that “...all of us are going to be killed by an artificial intelligence,” he knew he was trying to convince his students of that, but he did not know whether he would succeed with this method. He wasn’t testing the dangers of AI; he was testing a method of persuasion.
The paper gives what it describes as the “AGI Apocalypse Argument”—which ends with the following steps:
It is hard to tell whether anyone took this seriously—but it seems that an isomorphic argument ‘proves’ that computer programs will crash—since “almost any” computer program crashes. The “AGI Apocalypse Argument” as stated thus appears to be rather silly.
If the stated aim was: “to convince my students that all of us are going to be killed by an artificial intelligence”—why start with such a flawed argument?
More obviously, an isomorphic argument ‘proves’ that books will be gibberish—since “almost any” string of characters is gibberish. An additional argument that non-gibberish books are very difficult to write and that naively attempting to write a non-gibberish book will almost certainly fail on the first try, is required. The analogous argument exists for AGI, of course, but is not given there.
Right—so we have already had 50+ years of trying and failing. A theoretical argument that we won’t succeed the first time does not tell us very much that we didn’t already know.
What is more interesting is the track record of engineers of not screwing up or killing people the first time.
We have records about engineers killing people for cars, trains, ships, aeroplanes and rockets. We have failure records from bridges, tunnels and skyscrapers.
Engineers do kill people—but often it is deliberate—e.g. nuclear bombs—or with society’s approval—e.g. car accidents. There are some accidents which are not obviously attributable to calculated risks—e.g. the Titanic, or the Tacoma Narrows bridge—but they typically represent a small fraction of the overall risks involved.
I don’t see why this makes the argument seem silly. It seems to me that the isomorphic argument is correct, and that computer programs do crash.
Some computer programs crash—just as some possible superintelligences would kill all humans.
However, the behavior of a computer program chosen at random tells you very little about how an actual real-world computer program will behave—since computer programs are typically produced by selection processes performed by intelligent agents.
The “for almost any goals” argument is bunk.
No, most computer programs do crash; it’s just that most people never see them, because said programs are repeatedly tested and modified until they no longer crash before being shown to people (this process is called “debugging”). With a self-modifying AI this is a lot harder to do.
By “no”, you apparently mean “yes”.
Well, that is a completely different argument—and one that would appear to be in need of supporting evidence—since automated testing, linting and the ability to program in high-level languages are all improving simultaneously.
I am not aware of any evidence that real computer programs are getting more crash-prone with the passage of time.
The point is that the first time you run the seed AI it will attempt to take over the world, so you don’t have the luxury of debugging it.
That is not a very impressive argument, IMHO.
We will have better test harnesses by then—allowing such machines to be debugged.
Almost certainly, the first time you run the seed AI, it’ll crash quickly. I think it’s very unlikely that you construct a successful-enough-to-be-dangerous AI without a lot of mentally crippled ones first.
If so then we are all going to die. That is, if you have that level of buggy code then it is absurdly unlikely that the first time the “intelligence” part works at all it works well enough to be friendly. (And that scenario seems likely.)
The first machine intelligences we build will be stupid ones.
By the time smarter ones are under development we will have other trustworthy smart machines on hand to help keep the newcomers in check.
I am not entirely sure I disagree with you. However, I am having difficulty modeling you.
“Achieving a goal” seems to mean, for our purposes, something along the lines of “Bringing about a world-state.” Most possible world-states do not involve human existence. Thus, it seems that for most possible goals, achieving a goal entails human extinction.
However, your mention of computer programs being produced by intelligent agents is interesting. Are you implying that most AGIs (assuming these intelligences can go FOOM) would not result in human extinction?
If this is not what you were implying, I apologize for modeling you poorly. If this is what you were implying, I would like to indicate that this post was non-hostile.
Questions about fractions of infinite sets require an enumeration strategy to be specified—or they don’t make much sense. Assuming lexicographic ordering of their source code—and only considering the set of superintelligent programs—no: I don’t mean to imply that.
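The point that "fractions of infinite sets" depend on the enumeration can be made concrete with a toy example (mine, not from the thread): the limiting fraction of even numbers among the naturals comes out differently under different orderings.

```python
# Toy illustration: the "fraction" of an infinite set depends on how
# you enumerate it. Here, the share of evens among the naturals under
# two different enumerations.
from itertools import count, islice

def natural_order():
    return count(0)                      # 0, 1, 2, 3, ...

def odd_heavy_order():
    # Interleave two odds for every even: 1, 3, 0, 5, 7, 2, ...
    odds, evens = count(1, 2), count(0, 2)
    while True:
        yield next(odds)
        yield next(odds)
        yield next(evens)

def even_fraction(enum, n=30000):
    # Fraction of evens in the first n terms of the enumeration.
    prefix = list(islice(enum, n))
    return sum(1 for x in prefix if x % 2 == 0) / n

print(round(even_fraction(natural_order()), 2))    # ~0.5
print(round(even_fraction(odd_heavy_order()), 2))  # ~0.33
```

Both enumerations cover exactly the same set, yet the limiting density differs, which is why statements like "most possible goals" need a stated measure before they carry weight.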
A statement which we can derive from the simple fact that the mere existence of general intelligence (apes) does not result automatically in catastrophe.
I wonder how long it’ll take before people catch onto the notion that artificial “dumbness” is in many ways a more interesting field than artificial “intelligence”? (As in, how much could an AGI no smarter than a dog, but hooked into expert systems similar to Watson, do?)
It was pretty well accepted at MIT’s Media Lab back when my orbit took me around there periodically, a decade or so ago, that there was a huge amount of low-hanging fruit in this area… not necessarily of academic interest, but damned useful (and commercial).
That’s interesting since my impression if anything is the exact opposite. There seem to be a lot of people trying to apply Bayesian learning systems and expert learning systems to all sorts of different practical problems. I wonder if this is a new thing or whether I simply don’t have a good view of the field.
For what it’s worth, I consider Bayesian learning systems and expert learning systems to be “narrow” AI—hence the example I gave of Watson.
I think Ben Goertzel’s Novamente project is the closest extant project to a ‘general’ AI of any form that I’ve heard of.
I can see that for expert systems, but Bayesian learning systems seem to be a distinct category. The primary limits seem to be scalability, not architecture.
Bayesian learning systems are essentially another form of trainable neural network. That makes them very good in a narrow range of categories but also makes them insufficient to the cause of achieving general intelligence.
I do not see that scaling Bayesian learning networks would ever achieve general intelligence. No matter how big the hammer, it’ll never be a wrench. That being said, I do believe that some form of pattern recognition and ‘selective forgetting’ is important to cognition and as such Bayesian learning architecture is a good tool towards that end.
Actually, I’m curious why that isn’t seen as an area of significant academic interest—designing artificial systems around being efficient parsers of extraneous data. I recall that one of the major differences between Deep Blue and Deep Fritz in the Kasparov chess matches was precisely that Fritz was designed around not probing every last possible set of playable moves; that is, Deep Fritz was “learning to forget the right things”.
It seems to me that understanding this mechanism and how it behaves in humans could have huge potential for opening up the understanding of general intelligence and cognition. And that’s a very academic concern.
Yes, 15 does not follow unless we resolve this question from 14:
Does anyone else see a problem with the data table on page 22 of that PDF file?
The paper mentions this criterion on page 22:
Yet the data points at 15 8:30am, 16 8:30am, and 17 8:30am on page 22 all appear to have a blank Before value, a present After value, and a Change identical to the After value.
This also appears to affect the Change-column statistics on page 24 (8:30 a.m.) and on page 26 (Both Classes Combined). On page 28, the people who only have one survey are dropped entirely, and I no longer see this problem.
Since he uses the statistics on page 28 for his conclusion, this may not change the conclusion, but I did want to point it out.
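The error described above (a blank Before with Change equal to After) is the kind of thing a mechanical consistency check catches before the statistics are run. A minimal sketch with invented rows; the real table's row labels and values are in the PDF.

```python
# Sketch of a consistency check for a Before/After/Change table.
# Rows are invented; a blank Before should make Change undefined,
# not a copy of After.
rows = [
    {"id": "row A", "before": 5,    "after": 7},
    {"id": "row B", "before": None, "after": 6},  # blank Before
    {"id": "row C", "before": None, "after": 8},  # blank Before
]

def change(row):
    # Treating a blank Before as 0 would silently turn Change into
    # a copy of After, the error noted in the thread.
    if row["before"] is None:
        return None
    return row["after"] - row["before"]

suspect = [r["id"] for r in rows if r["before"] is None]
print([change(r) for r in rows], suspect)
```

Dropping the `suspect` rows before computing summary statistics is effectively what the study's page-28 tables do, which is why the problem disappears there.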
Thanks for pointing this out. There was in fact an error. I’ve fixed the error and updated the study. Some of the conclusions embedded in tables change; the final conclusions reported stay the same.
I’ve credited you on p.3 of the new version. If you want me to credit you by name, please let me know.
Thanks again!
Hi everyone. Thanks for taking an interest. I’m especially interested in (a) errors committed in the study, (b) what sorts of follow-up studies would be the most useful, (c) how the written presentation of the study could be clarified.
On errors, Michaelos already found one—I forgot to delete some numbers from one of the tables. That error has been fixed and Michaelos has been credited. Can anyone see any other errors?
On follow-up studies, lessdazed has suggested some. I don’t know if we need to see what happens when nothing is presented on AGI; I think our “before” surveys are sufficient here. But trying to teach some alternative threat is an interesting idea. I’m interested in other ideas as well.
On clarity of presentation, it will be worth clarifying a few things. For instance, the point of the study was to test a method of persuasion, not to see what students would do with an unbiased presentation of evidence. I’ll try to make that more obvious in the next version of the document. It would be good to know what other things might be misunderstood.
I appreciate the credit and have sent you a message with my name, but I have to let you know that while Version 1.2 contains the fix, version 1.3 appears to have reverted back to the unfixed version as if it was made off of version 1.0 instead of 1.2.
Fixed.
This post seems to serve no purpose except to promote the dark arts.
A quote from the PDF:
So, yes, dark arts. But the way he kept asking “And how would the AI do that?” was excellent.
I found this an extremely surprising result. Geoff Anders claims immediate effects from essentially only two interventions:
and
There were more interventions between these and the surveys of average belief, but these interventions caused at least a few students to generate the idea that AGIs are much more creative and powerful than in Terminator 2. The effect on the tail seems to me more important and surprising than the effect on the mean.
There were a lot of interventions before these two, including whatever idiosyncrasies Anders’s philosophy course had, but the outcome before these two interventions seemed pretty standard. The first AI day seemed pretty standard. The chess exercise is probably not common and the two quotes above require its context, but the initial reaction to the chess exercise did not surprise me.
I don’t know of any studies backing this up, but I’d expect that if a position doesn’t go against deeply held beliefs, then having to argue for it will make you update in that direction.
Sounds like a classic case of priming to me.