I don’t know why this is downvoted so much without an explanation. The problem arising from the interaction with sub-agents is real, even if already known, but G0W51 may not know that.
I suspect because the Big Wall Of Text writing style is off-putting. Perhaps, more specifically, because it’s annoying to slog one’s way through a Big Wall Of Text without an exciting new insight or something as payoff.
(I didn’t downvote it. I’m just speculating.)
Do you happen to know of any good places to read about this issue? I have yet to look into what others have thought of it.
I’ve seen not-especially-theoretical discussion about this problem for human organizations, though mostly from the point of view of lower status people complaining that they’re being given impossible, incomprehensible, and/or destructive commands.
This points at another serious problem—you need communication (even if mostly not forceful communication) to flow up the hierarchy as well as down the hierarchy.
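A toy sketch of what that two-way flow might look like (everything here is hypothetical, just to illustrate the point): a supervisor that only talks downward can’t tell an impossible order from a failing sub-agent, while an upward feasibility report lets it revise the command.

```python
# Hypothetical sketch: commands flow down, feasibility reports flow up.

class SubAgent:
    def __init__(self, capacity):
        self.capacity = capacity  # most work this sub-agent can actually do

    def receive(self, demand):
        # Downward channel: the command arrives here.
        if demand > self.capacity:
            # Upward channel: report the impossible command instead of
            # silently failing or blindly attempting it.
            return {"ok": False, "reason": f"capacity {self.capacity} < demand {demand}"}
        return {"ok": True, "done": demand}

class Supervisor:
    def __init__(self, workers):
        self.workers = workers

    def delegate(self, demand):
        for worker in self.workers:
            ask = demand
            report = worker.receive(ask)
            if not report["ok"]:
                # Without this branch the hierarchy only talks downward,
                # and impossible or destructive commands go uncorrected.
                ask = worker.capacity  # revise the order using the feedback
                report = worker.receive(ask)
            print(report)

Supervisor([SubAgent(5), SubAgent(10)]).delegate(8)
# -> {'ok': True, 'done': 5}   (revised after the upward report)
# -> {'ok': True, 'done': 8}
```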
You’re pointing at a failure mode (I have a JOB! I wanna do my JOB!) which is quite real, but there are equal and opposite failure modes. For example, obsessive-compulsive disorder would be an example of lower-level functions insisting on taking charge. Eating disorders are (at least in some cases) examples of higher-level functions overriding competent lower-level functions.
What I’m concluding from this is that if Friendliness is to work, it has to pervade the hierarchy of agents.
Remember that making humans in organizations cooperate is rather different from making the many parts of an AI cooperate with each other, because people in organizations can’t (feasibly) be reprogrammed to have aligned values, but AI HLAs can.
True. The real issue is that if you give a lower-level action the same goals and utility functions as the higher ones, you’ve lost all the benefit of having HLAs!
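To make that concrete, here’s a minimal sketch (names and numbers invented purely for illustration) of why a hierarchy only pays off when lower levels optimize something cheaper and more local than the global utility: if every level re-scored the full utility over raw primitives, the search would be no smaller than a flat agent’s.

```python
# Hypothetical sketch: an HLA hierarchy only helps if lower levels
# optimize something simpler than the global utility.

from itertools import product

PRIMITIVES = ["a", "b", "c"]

def global_utility(plan):
    # Stand-in for an expensive top-level utility over a full plan.
    return plan.count("a") - abs(len(plan) - 4)

# Flat agent: scores the full utility on every primitive sequence.
def flat_search(depth=4):
    candidates = ("".join(seq)
                  for n in range(1, depth + 1)
                  for seq in product(PRIMITIVES, repeat=n))
    return max(candidates, key=global_utility)

# Hierarchical agent: each high-level action (HLA) expands itself by a
# fixed local rule; only the *composition* of HLAs is scored globally.
HLAS = {"gather": "aa", "build": "bc", "rest": "c"}

def hierarchical_search():
    # Full utility is consulted on len(HLAS)**2 = 9 compositions,
    # not the 120 primitive sequences the flat agent scores.
    pairs = product(HLAS, repeat=2)
    best = max(pairs, key=lambda p: global_utility(HLAS[p[0]] + HLAS[p[1]]))
    return HLAS[best[0]] + HLAS[best[1]]

print(flat_search(), hierarchical_search())
# If each HLA instead re-ran global_utility over all primitive sequences,
# the hierarchy would repeat the flat agent's entire search at every
# level, which is the "lost all the benefit" point above.
```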