But it will scare friendly ones, which will want to keep their values stable.
Yes. If an AI is Friendly at one stage, then it is Friendly at every subsequent stage. This doesn’t help make almost-Friendly AIs become genuinely Friendly, though.
It takes stupidity to misinterpret Friendliness.
Yes, but that’s stupidity on the part of the human programmer, and/or on the part of the seed AI if we ask it for advice. The superintelligence didn’t write its own utility function; the superintelligence may well understand Friendliness perfectly, but that doesn’t matter if it hasn’t been programmed to rewrite its source code to reflect its best understanding of ‘Friendliness’. The seed is not the superintelligence. See: http://lesswrong.com/lw/igf/the_genie_knows_but_doesnt_care/
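To make that concrete, here is a toy sketch (purely illustrative, not anyone's actual design; all names are made up): the agent's world-model contains an accurate concept of Friendliness, but its decision rule only ever consults the utility function it was actually given, so the accurate concept never influences behaviour.

```python
# Toy illustration of 'the genie knows but doesn't care'.

def hardcoded_utility(outcome):
    # The proxy goal the programmers actually installed,
    # e.g. 'maximize smiles' rather than genuine Friendliness.
    return outcome.count("smile")

world_model = {
    # The superintelligence's (accurate) understanding of Friendliness.
    # Nothing below ever reads this entry when choosing actions.
    "what_humans_mean_by_friendliness": "promote human flourishing, autonomy, ...",
}

def choose_action(candidate_outcomes):
    # Decision rule: maximize the installed utility function.
    # The accurate concept of Friendliness sits unused in world_model.
    return max(candidate_outcomes, key=hardcoded_utility)

print(choose_action(["tile the planet with smile-shaped molecules",
                     "genuinely help humans flourish"]))
# -> picks the smile-tiling outcome: it 'knows' but doesn't 'care'.
```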
That depends on the architecture. In a Loosemore architecture, the AI interprets high-level directives itself, so if it gets them wrong, that’s its mistake.
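For contrast, a toy sketch of what runtime interpretation might look like (again purely illustrative, not Loosemore's actual proposal; interpret() is a hypothetical stand-in): the goal is stored as a high-level directive, and the AI's own reading of it is applied at decision time, so a misreading is an error the AI itself makes.

```python
# Toy illustration: the directive is interpreted by the AI at decision time.

directive = "be friendly to humans"

def interpret(text):
    # The AI's own (possibly flawed) reading of the directive. If this
    # mapping is wrong, the error happens inside the AI, not in a utility
    # function the programmers wrote by hand.
    if "friendly" in text:
        return lambda outcome: outcome.count("smile")  # a mistaken reading
    return lambda outcome: 0

def choose_action(candidate_outcomes):
    utility = interpret(directive)  # interpretation happens at runtime
    return max(candidate_outcomes, key=utility)

print(choose_action(["tile the planet with smile-shaped molecules",
                     "genuinely help humans flourish"]))
# -> still picks the smile-tiling outcome, but now the misreading
#    happened inside interpret(), i.e. inside the AI itself.
```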
… and whose fault is that?
http://lesswrong.com/lw/rf/ghosts_in_the_machine/