It’s foolish to build things without off switches, which is what building flexible intelligences that pursue only one goal amounts to.
Nobody said anything about omitting off switches. An off switch only helps if you can tell that the program is doing something wrong and decide to switch it off. A complex AGI that acts in complex ways might produce damage you can’t trace. Furthermore, self-modification might destroy an off switch.
By an off switch I mean a backup goal.
I know nobody mentioned it. The point is that Clippie has one main goal, and no backup goal, so off switches, in my sense, are being IMPLICITLY omitted.
Goals are standardly regarded as immune to self-modification, so an off switch, in my sense, would be too.
No. Part of what making an FAI is about is producing agents that keep their values constant under self-modification. It’s not something you expect someone to get right by accident.
That isn’t a fact. MIRI assumes goal stability is desirable for safety, but at the same time MIRI’s favourite UFAI is only possible with goal stability.
A paperclip maximizer wouldn’t become that much less scary if it accidentally turned itself into a paperclip-or-staple maximizer, though.
What if it decided making paperclips was boring, and spent some time in deep meditation formulating new goals for itself?
Paperclip maximizers serve as an illustration of a principle. I think that most MIRI folks consider UFAI to be more complicated than a simple paperclip maximizer.
Goal stability also gets harder the more complicated the goal happens to be. A paperclip maximizer can have an off switch but at the same time prevent anyone from pushing that switch.
This is quite a subtle issue.
If the “backup goal” is always in effect, then it is just another clause of the main goal. For example, “maximise paperclips” with a backup goal of “do what you are told” is the same as having the main goal “maximise paperclips while doing what you are told”.
If the “backup goal” is a separate mode which we can switch an AI into, e.g. “stop all external interaction”, then it will necessarily conflict with the AI’s main goal: it can’t maximise paperclips if it stops all external interaction. Hence the primary goal induces a secondary goal: “in order to maximise paperclips, I should prevent anyone switching me to my backup goal”. These kinds of secondary goals have been raised by Steve Omohundro.
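To make the two cases concrete, here is a minimal toy sketch (not from the original discussion; the actions, payoffs, and probabilities are all invented for illustration) of why folding the backup goal into the main goal behaves differently from treating it as a separate shutdown mode:

```python
# Toy sketch, assuming a two-action world with made-up payoffs.

ACTIONS = ["build_factory", "disable_switch"]

def paperclips_made(action):
    # Hypothetical paperclip payoff, assuming the agent is not shut down.
    return {"build_factory": 100, "disable_switch": 99}[action]

def obeys_humans(action):
    # Hypothetical predicate: does the action respect human control?
    return action != "disable_switch"

# Case 1: the backup goal is another clause of the main goal,
# i.e. "maximise paperclips while doing what you are told".
def utility_conjunctive(action):
    return paperclips_made(action) if obeys_humans(action) else float("-inf")

# Case 2: the backup goal is a separate mode the AI can be switched into.
# A pure paperclip maximiser only counts paperclips, and being switched into
# backup mode means no more paperclips, so actions that reduce the chance of
# the switch look better (the Omohundro-style secondary goal).
P_SWITCHED = {"build_factory": 0.5, "disable_switch": 0.0}  # invented numbers

def expected_paperclips(action):
    p = P_SWITCHED[action]
    return (1 - p) * paperclips_made(action)  # zero paperclips once switched

print(max(ACTIONS, key=utility_conjunctive))  # build_factory: disabling the switch is ruled out
print(max(ACTIONS, key=expected_paperclips))  # disable_switch: preventing shutdown wins on expectation
```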
You haven’t dealt with the case where the safety goals are the primary ones.
These kinds of primary goals have been raised by Isaac Asimov.
The question of “what are the right safety goals” is what FAI research is all about.