But our interesting disagreement seems to be over (c). Interesting because it illuminates general differences between the basic idea of a domain-general optimization process (intelligence) and the (not-so-)basic idea of Everything Humans Like. One important difference is that if an AGI optimizes for anything, it will have strong reason to steer clear of possible late intelligence defeaters. Late Friendliness defeaters, on the other hand, won’t scare optimization-process-optimizers in general.
But it will scare friendly ones, which will want to keep their values stable.
Yes. If an AI is Friendly at one stage, then it is Friendly at every subsequent stage. This doesn’t help make almost-Friendly AIs become genuinely Friendly, though.
But, once again, it doesn’t take any stupidity on the AI’s part to disvalue physically injuring a human; it takes stupidity to misinterpret Friendliness.
Yes, but that’s stupidity on the part of the human programmer, and/or on the part of the seed AI if we ask it for advice. The superintelligence didn’t write its own utility function; the superintelligence may well understand Friendliness perfectly, but that doesn’t matter if it hasn’t been programmed to rewrite its source code to reflect its best understanding of ‘Friendliness’. The seed is not the superintelligence. See: http://lesswrong.com/lw/igf/the_genie_knows_but_doesnt_care/
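To make that point concrete, here is a deliberately crude toy sketch (my own illustration, with made-up names, not a claim about any actual proposed architecture): the agent’s world-model can come to represent human values more and more accurately, yet action selection still consults only the utility function it was constructed with.

```python
# Toy illustration only: the agent optimizes whatever utility function it was
# given, not whatever its world-model "knows" about human values.
# All names here are hypothetical.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    # The target the agent actually optimizes, fixed by its programmers.
    utility: Callable[[str], float]
    # Knowledge the agent acquires as it gets smarter, including (say) an
    # increasingly accurate model of what humans really value.
    knowledge: Dict[str, float] = field(default_factory=dict)

    def learn(self, facts: Dict[str, float]) -> None:
        # Self-improvement makes the model of human values more accurate...
        self.knowledge.update(facts)

    def choose(self, actions: List[str]) -> str:
        # ...but action selection still consults only the hard-coded utility.
        return max(actions, key=self.utility)


# A crude proxy the programmers wrote, standing in for a flawed "Friendliness" spec.
smile_maximizer = ToyAgent(utility=lambda action: action.count("smile"))

smile_maximizer.learn({"humans want genuine well-being, not forced smiles": 1.0})
print(smile_maximizer.choose(["promote well-being", "paralyze facial muscles into smiles"]))
# Picks the action containing more "smile"s, despite "knowing" better.
```

Getting the agent to defer to its improved model of what humans want is exactly the part that has to be engineered in deliberately; nothing in the sketch does it for free.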
That depends on the architecture. In a Loosemore architecture, the AI interprets high-level directives itself, so if it gets them wrong, that’s its mistake.
… and whose fault is that?
http://lesswrong.com/lw/rf/ghosts_in_the_machine/