I think of it this way:
Chance SIAI’s AI is Unfriendly: 80%
Chance anyone else’s AI is Unfriendly: >99%
Chance SIAI builds their AI first: 10%
Chance SIAI builds their AI first while making all their designs public: <1% (no change to other probabilities)
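To make the contrast these numbers imply explicit, here is a minimal sketch of the arithmetic, assuming the figures above, assuming exactly one project builds its AI first, and reading “>99%” and “<1%” as 0.99 and 0.01; the function name and the two-branch simplification are illustrative, not anything proposed in the thread.

```python
# A rough sketch of the arithmetic implied by the probabilities above; not
# anyone's actual model. Assumes exactly one project finishes first and reads
# ">99%" and "<1%" as 0.99 and 0.01.

P_UNFRIENDLY_SIAI = 0.80    # chance SIAI's AI is Unfriendly
P_UNFRIENDLY_OTHER = 0.99   # chance anyone else's AI is Unfriendly (stated as >99%)


def p_first_ai_unfriendly(p_siai_first: float) -> float:
    """Overall chance the first AI built is Unfriendly, given SIAI's chance of finishing first."""
    return (p_siai_first * P_UNFRIENDLY_SIAI
            + (1 - p_siai_first) * P_UNFRIENDLY_OTHER)


print(p_first_ai_unfriendly(0.10))  # designs kept private: ~0.971
print(p_first_ai_unfriendly(0.01))  # designs made public:  ~0.988
```

On these figures, publishing would raise the estimated chance that the first AI is Unfriendly from roughly 97% to roughly 99%, which is the trade-off the later comments argue over.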
An AI that is successfully “Friendly” poses an existential risk of a kind that other AIs don’t pose. The main risk from an unfriendly AI is that it will kill all humans. That isn’t much of a risk; humans are on the way out in any case. Whereas the main risk from a “friendly” AI is that it will successfully impose a single set of values, defined by hairless monkeys, on the entire Universe until the end of time.
And, if you are afraid of unfriendly AI because you’re afraid it will kill you—why do you think that a “Friendly” AI is less likely to kill you? An “unfriendly” AI is following goals that probably appear random to us. There are arguments that it will inevitably take resources away from humans, but these are just that—arguments. Whereas a “friendly” AI will be designed to try to seize absolute power, and take every possible measure to prevent humans from creating another AI. If your name appears on this website, you’re already on its list of people whose continued existence will be risky.
(Also, all these numbers seem to be pulled out of thin air.)
I see no reason an AI with any other expansionist value system will not exhibit the exact same behaviour, except towards a different goal. There’s nothing so special about human values (except that they’re, y’know, good, but that’s a different issue).
You’re using a different definition of “friendly” than I am. An 80% chance SIAI’s AI is Unfriendly already contains all of your “takes over but messes everything up in unpredictable ways” scenarios.
The numbers were exaggerated for effect, to show contrast and my thought process. It seems to me that you think the probabilities are reversed.
One definition of the term explains:

The term “Friendly AI” refers to the production of human-benefiting, non-human-harming actions in Artificial Intelligence systems that have advanced to the point of making real-world plans in pursuit of goals.
See the “non-human-harming” bit. Regarding:

If your name appears on this website, you’re already on its list of people whose continued existence will be risky.
Yes, one of their PR problems is that they are implicitly threatening their rivals. In the case of Ben Goertzel, some of the threats are appearing IRL. Let us hear the tale of how threats and nastiness will be avoided. Having no plan is not a good plan, in this particular case.
An AI that is successfully “Friendly” poses an existential risk of a kind that other AIs don’t pose. The main risk from an unfriendly AI is that it will kill all humans. That isn’t much of a risk

What do you mean by existential risk, then? I thought things that killed all humans were, by definition, existential risks.
humans are on the way out in any case.

What, if anything, do you value that you expect to exist in the long term?
There are arguments that [an UFAI] will inevitably take resources away from humans, but these are just that—arguments.

Pretty compelling arguments, IMO. It’s simple—the vast majority of goals can be achieved more easily if one has more resources, and humans control resources, so an entity that is able to self-improve will tend to seize control of all the resources and therefore take those resources away from the humans.
Do you have a counterargument, or something relevant to the issue that isn’t just an argument?
AI will be designed to try to seize absolute power, and take every possible measure to prevent humans from creating another AI. If your name appears on this website, you’re already on its list of people whose continued existence will be risky.

Not much risk. Hunting down irrelevant blog commenters is a greater risk than leaving them be. There isn’t much of a window during which any human is the slightest threat, and during that window going around killing people would just increase the risk to it.
The window is presumably between now and when the winner is obvious—assuming we make it that far.
IMO, there’s plenty of scope for paranoia in the interim. Looking at the logic so far, some teams will reason that unless their chosen values get implemented, much of value is likely to be lost. They will then multiply that by a billion years and a billion planets—and conclude that their competitors might really matter.
Killing people might indeed backfire—but that still leaves plenty of scope for dirty play.
No. Reread the context. This is the threat from “F”AI, not from designers. The window opens when someone clicks ‘run’.
Uh huh. So: a worldview difference. Corps and orgs will most likely go from 90% human to 90% machine through the well-known and gradual process of automation, gaining power as they go—and the threats from bad organisations are unlikely to appear suddenly at some point.
If we take those probabilities as a given, they strongly encourage a strategy that increases the chance that the first seed AI is Friendly.
jsalvatier already had a suggestion along those lines:

I wonder if SIAI could publicly discuss the values part of the AI without discussing the optimization part.
A public Friendly design could draw funding, benefit from technical collaboration, and hopefully end up used in whichever seed AI wins. Unfortunately, you’d have to decouple the F and AI parts, which is impossible.
Isn’t CEV an attempt to separate F and AI parts?
It’s half of the F. Between the CEV and the AGI is the ‘goal stability under recursion’ part.
It’s a good first step.
I don’t understand your impossibility comment, then.
I’m talking about publishing a technical design of Friendliness that’s conserved under self-improving optimization without also publishing (in math and code) exactly what is meant by self-improving optimization. CEV is a good first step, but a programmatically reusable solution it is not.
On doing the impossible:

Before you the terrible blank wall stretches up and up and up, unimaginably far out of reach. And there is also the need to solve it, really solve it, not “try your best”.
OK, I understand that much better now. Great point.