Stuart_Armstrong comments on Siren worlds and the perils of over-optimised search

Stuart_Armstrong 12 May 2014 11:19 UTC
3 points

Superintelligent but dumb doesn’t make logical sense.

And you’ve redefined “anything but perfectly morally in tune with humanity” as “dumb”. I’m waiting for an argument as to why that is so.
- TheAncientGeek 12 May 2014 12:13 UTC
  0 points
  Parent
  There’s an argument that an SAI will figure out the correct morality, and there’s an argument that it wont misinterpret directives. They are different arguments, and the second is much stronger.
  - Stuart_Armstrong 12 May 2014 16:36 UTC
    1 point
    Parent
    I now see your point. I still don’t see how you plan to code a “interpret these things properly” piece of the AI. I think working through a specific design would be useful.
    
    I also think you should work your argument into a less wrong post (and send me a message when you’ve done that, in case I miss it) as 12 or so levels deep into a comment thread is not a place most people will ever see.
    
    They are different arguments, and the second is much stronger.
    
    Not really. Given the first, we can instruct “only do things that [some human or human group with nice values] would approve of” and we’ve got an acceptable morality.
    - TheAncientGeek 14 May 2014 12:21 UTC
      0 points
      Parent
      By “interpret these things correctly”, do you mean linguistic competence, or a goal?
      
      The linguistic competence is aready assumed in any .AI that can talk it’s way out of a box (ie not AIXI like), without provision of a design by MIRI.
      
      An AIXI can’t even conceptualise that it’s in a box, so it doesn’t matter if it gets its goals wrong, It can be rendered safe by boxing.
      
      Which combination of assumptions is the problem?
      - Stuart_Armstrong 14 May 2014 14:49 UTC
        0 points
        Parent
        
        An AIXI can’t even conceptualise that it’s in a box, so it doesn’t matter if it gets its goals wrong, It can be rendered safe by boxing.
        
        I’m not so sure about that… AIXI can learn certain ways of behaving as if it were part of the universe, even with the Cartesian dualism in its code: http://lesswrong.com/lw/8rl/would_aixi_protect_itself/
        
        By “interpret these things correctly”, do you mean linguistic competence, or a goal?
        
        A goal. If the AI becomes superintelligent, then it will develop linguistic competence as needed. But I see no way of coding it so that that competence is reflected in its motivation (and it’s not from lack of searching for ways of doing that).
        TheAncientGeek 14 May 2014 15:24 UTC
        0 points
        Parent
        So is it safe to run AIXI approximations in boxes today?
        
        By code it, do you mean “code, train, or evolve it”?
        
        Note that we dont know much about coding higher level goals in general.
        
        Note that “get things right except where X is concerned” is more complex than “get things right”. Humans do the former because of bias. The less anthropic nature of an .AI might be to our advantage.
        [deleted] 14 May 2014 15:25 UTC
        0 points
        Parent
        
        So is it safe to run AIXI approximations in boxes today?
        
        IMHO, yes. The computational complexity of AIXItl is such that it can’t be used for anything significant on modern hardware.