Stepping in as an interlocutor: while I agree that “all-powerful” is poor terminology, I think an AGI is likely to have the power described here. One feature an AGI is nearly certain to have is superhuman processing power, which would allow it to run large numbers of Monte Carlo simulations to predict human responses, especially if there is a Bayesian calibrating mechanism. An above-human ability to predict human responses is an essential component of near-perfect social engineering. I don’t see this as an outrageous, magic-seeming power. Such an AGI could, in theory, have the power to convince humans to adopt any desired response. I believe your paper maintains that an AGI wouldn’t use this power, not that such a power is outrageous.
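To make the Monte-Carlo-plus-Bayesian-calibration idea concrete, here is a minimal sketch; the two response models, the prior, and the likelihood value are made-up placeholders for illustration, not a claim about how any real AGI would work.

```python
import random

# Toy sketch: estimate how a human will respond by sampling from a prior over
# response models, then recalibrate that prior with Bayes' rule as actual
# responses are observed.  All numbers are illustrative placeholders.

PRIOR = {"complies": 0.5, "refuses": 0.5}

def monte_carlo_predict(prior, n_samples=10_000):
    """Estimate the probability of each response by repeated sampling."""
    counts = dict.fromkeys(prior, 0)
    outcomes, weights = list(prior), list(prior.values())
    for _ in range(n_samples):
        counts[random.choices(outcomes, weights)[0]] += 1
    return {k: v / n_samples for k, v in counts.items()}

def bayes_update(prior, observed, likelihood=0.8):
    """Shift belief toward the model that predicted the observed response."""
    unnormalised = {
        k: p * (likelihood if k == observed else 1 - likelihood)
        for k, p in prior.items()
    }
    total = sum(unnormalised.values())
    return {k: p / total for k, p in unnormalised.items()}

print(monte_carlo_predict(PRIOR))        # roughly {'complies': 0.5, 'refuses': 0.5}
print(bayes_update(PRIOR, "complies"))   # belief recalibrated after one observed response
```

The only point of the sketch is that brute sampling plus calibration is a mundane mechanism rather than a magical one.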
My personal feeling toward this article is that it sounds suspiciously close to a “No True Scotsman” argument: “No true (designed with friendly intentions) AI would submit to these catastrophic tendencies.” While your arguments are persuasive, I wonder: if a catastrophe did occur, would you dismiss it as the work of “not a true AI”? By way of disclaimer, my strengths are in philosophy and mathematics, and decidedly not computer science. I hope you have time to reply anyway.
The only problem with this kind of “high-level” attack on the paper (by which I mean trying to shoot it down by pigeonholing it as a “No True Scotsman” argument) is that I hear nothing about the actual, meticulous sequence of arguments given in the paper.
Attacks of that sort are commonplace. They show no understanding of what was actually said.
It is almost as if Einstein wrote his first relativity paper, and it got attacked with comments like “The author seems to think that there is some kind of maximum speed in the universe—an idea so obviously incorrect that it is not worth taking the time to read his convoluted reasoning.”
I don’t mean to compare myself to Albert; I just find it a bit, well, pointless when people either (a) completely misunderstand what was said in the paper, or (b) show no sign that they took the time to read and think about the very detailed argument presented in it.
You have my apologies if you thought I was attacking or pigeonholing your argument. While I lack the technical expertise to critique the technical portion of your argument, I think it could benefit from a more explicit avoidance of the fallacy mentioned above. I thought the article was very interesting, and I will certainly come back to it if I ever get to the point where I can understand your distinctions between swarm intelligence and CFAI. I understand you have been facing attacks for your position in this article, but that is not my intention. Your meticulous arguments are certainly impressive, but you do them a disservice by dismissing well-intentioned critique, especially when it applies to the structure of your argument rather than its substance.
Einstein made predictions about what the universe would look like if there were a maximum speed. Your prediction seems to be that a well-built AI will not misunderstand its goals (please assume that I read your article thoroughly and that any misunderstandings are benign). What does the universe look like if this is false?
I probably fall under category (a) in your disjunction. Is it truly pointless to help me overcome my misunderstanding? From the large volume of comments, it seems likely that this misunderstanding is partially caused by a gap between what you are trying to say and what was said. Please help me bridge this gap instead of denying its existence or calling such an exercise pointless.
Hey, no problem. I was really just raising an issue with certain types of critique, which involve supposed fallacies that actually don’t apply.
I am actually pressed for time right now, so I have to break off and come back to this when I can. Just wanted to clarify if I could.
Feel free to disengage; TheAncientGeek helped me shift my paradigm correctly.
Let me see if I can deal with the “No True Scotsman” line of attack.
The way that fallacy might apply to what I wrote would be, I think, something like this:
MIRI says that a superintelligence might unpack a goal statement like “maximize human happiness” by perpetrating a Maverick Nanny attack on humankind, but Loosemore says that no TRUE superintelligence would do such a thing, because it would be superintelligent enough to realize that this was a ‘mistake’ (in some sense).
This would be a No True Scotsman fallacy, because the term “superintelligence” has been, in effect, redefined by me to mean “something smart enough not to do that”.
Now, my take on the NTS idea is that the accusation cannot be applied when there are substantive grounds for saying that two genuinely different categories are involved, rather than a real category and a fake category that is (for some unexplained reason) exceptional.
Example: Person A claims that a sea-slug caused the swimmer’s leg to be bitten off, but Person B argues that no “true” sea-slug would have done this. In this example, Person B is not using a No True Scotsman argument, because there are darned good reasons for supposing that sea-slugs cannot bite off the legs of swimmers.
So it all comes down to whether someone accused of NTS is inventing a fictitious category distinction (“true” versus “non-true” Scotsman) solely for the purpose of supporting their argument.
In my case, what I have argued is right up there with the sea-slug argument. What I have said, in effect, is that if we sit down and carefully think about the type of “superintelligence” that MIRI et al. put into their scenarios, and if we explore all the implications of what that hypothetical AI would have to be like, we quickly discover some glaring inconsistencies in their scenarios. The sea-slug, in effect, is supposed to have bitten through bone with a mouth made of mucus, and it is so small that it could not have wrapped itself around the swimmer’s leg. Thinking through the whole sea-slug scenario leads us to a mass of evidence indicating that the proposed scenario is nuts. Similarly, thinking through the implications of an AI that is so completely unable to handle context that it can live with Grade A contradictions at the heart of its reasoning leads us to a mass of unbelievable inconsistencies in the ‘intelligence’ of this supposed superintelligence.
So, where the discussion needs to be, in respect of the paper, is in the exact details of why the proposed SI might not be a meaningful hypothetical. It all comes down to a meticulous dissection of the mechanisms involved.
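As a toy illustration of the kind of mechanism-level dissection meant here, the sketch below shows an agent checking a candidate interpretation of its goal against its own knowledge base before acting. The knowledge base, the plan, and the keyword-matching “check” are hypothetical stand-ins invented for this example, not anything taken from the paper or from MIRI’s scenarios.

```python
# Toy stand-in for the consistency check a context-aware agent would run
# before acting on an interpretation of its goal.  A real system would need
# genuine inference; simple string matching is used here only for illustration.

KNOWLEDGE_BASE = [
    "humans report extreme distress when their wishes are overridden by force",
    "the phrase 'maximize human happiness' was written by humans in ordinary language",
]

def contradicts_knowledge(plan: str, knowledge: list[str]) -> bool:
    """Return True if the plan relies on something the knowledge base denies."""
    plan_uses_force = "by force" in plan
    force_known_to_distress = any("overridden by force" in fact for fact in knowledge)
    return plan_uses_force and force_known_to_distress

plan = "maximize human happiness by rewiring every human's reward system by force"
if contradicts_knowledge(plan, KNOWLEDGE_BASE):
    print("Interpretation contradicts the knowledge base; flag it rather than act on it.")
```

The question at issue is whether a system incapable of even this much context-sensitivity deserves the label “superintelligent” at all.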
To conclude: sorry if I seemed to come down a little heavy on you in my first response. I wasn’t upset; it was just that the NTS critique had come up before. In some of those previous cases the NTS attack was accompanied by language strongly implying not just that I had committed an NTS fallacy, but that I was such an idiot that my idiocy was grounds for recommending that no one even read the paper. ;-)
“… thinking through the implications of an AI that is so completely unable to handle context that it can live with Grade A contradictions at the heart of its reasoning leads us to a mass of unbelievable inconsistencies in the ‘intelligence’ of this supposed superintelligence.”
This is all at once concise, understandable, and reassuring. Thank you. I still wonder whether we are broadening the scope of what counts as “intelligence” too far, but that wondering comes from gaps in my specific knowledge, not from gaps in your argument.
“Your prediction seems to be that a well-built AI will not misunderstand its goals …”
Or a (likely-to-be-built) AI won’t even have the ability to compartmentalise its goals from its knowledge base.
It’s not No True Scotsman to say that no competent researcher would do it that way.
Thank you for responding and attempting to help me clear up my misunderstanding. I will need to do another deep reading, but a quick skim of the article from this point of view “clicks” a lot better for me.
Loosemore’s claim could be steelmanned into the claim that the Maverick Nanny isn’t likely: it requires an AI with goals, with hardcoded goals, with hardcoded goals that include a full explicit definition of happiness, and with a buggy full explicit definition of happiness. That’s a chain of premises.
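To make the chain-of-premises point concrete, here is a toy calculation; the probability assigned to each premise is entirely hypothetical and chosen only to show how a conjunction of premises shrinks.

```python
# Toy arithmetic for the chain of premises above.  Every probability is made
# up purely for illustration.

premises = {
    "the AI has goals at all":                              0.9,
    "those goals are hardcoded":                            0.5,
    "they include a full explicit definition of happiness": 0.3,
    "that definition is buggy in the relevant way":         0.3,
}

p_all = 1.0
for p in premises.values():
    p_all *= p

print(f"P(all premises hold) = {p_all:.3f}")  # about 0.04 with these toy numbers
```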
That isn’t even remotely what the paper said. It’s a parody.
Since it is a steelman, it isn’t supposed to be what the paper is saying.
Are you maintaining, in contrast, that the Maverick Nanny is flatly impossible?
Sorry, I may have been confused about what you were trying to say because you were responding to someone else, and I hadn’t come across the ‘steelman’ term before.
I withdraw ‘parody’ (sorry!) but … it isn’t quite what the logical structure of the paper was supposed to be.
It feels like you steelmanned it onto some other railroad track, so to speak.