I am at least claiming that in the context of designing a good AI, “utility function” should be taken to be a function of some external world, yes.
Otherwise you may run into problems. For example, you could offer to change a robot’s sensory contents and internal state to something with higher utility than its current state—and if the agent refuses, you will reset it. If we were using a “utility wrapper” model, all modeled agents would say yes. But the trivial example of an agent that always says “I would prefer not to” (BartlebeyBot) demonstrates that not all agents make choices that maximize some function of their internal state.
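(For concreteness, a minimal Python sketch of such a BartlebeyBot; the class name and interface here are my own illustrative choices, not anything specified above:)

class BartlebeyBot:
    """Toy agent that declines every offer, whatever it is shown."""
    def choose(self, internal_state, sensory_input):
        # Both arguments are ignored: the reply is the same constant
        # regardless of the agent's state or the evidence presented.
        return "I would prefer not to"

# The reset offer from the thought experiment above gets refused like anything else.
bot = BartlebeyBot()
print(bot.choose(internal_state=None,
                 sensory_input="accept the higher-utility state or be reset?"))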
So: the only information available to any agent is in the form of its internal state and its sensory channels. Any function it computes must have that domain (or some subset of it). Confining the agent to that domain isn’t any kind of restriction. All utility functions calculated over the state of the world necessarily correspond to other utility functions calculated over the domain of internal state and sensory input.
Your example seems wrong to me. The problem is with:
For example, you could offer to change a robot’s sensory contents and internal state to something with higher utility than its current state—and if the agent refuses, you will reset it. If we were using a “utility wrapper” model, all modeled agents would say yes.
That’s not correct. For one thing, the agent may not believe what you say.
the only information available to any agent is in the form of its internal state and its sensory channels. Any function it computes must have that domain (or some subset of it).
Good point. So any function it computes has to be some function of its internal state. However, not all choices correspond to maximizing such a function—any time choices go in a circle, for instance, you’re not maximizing a function. We could imagine a very simple machine with a 3-state memory. It wants to go from A to B, and from B to C, and from C to A. Its choices are always a function of its internal state. But its choices don’t maximize a function of its internal state.
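(A minimal Python sketch of that machine, plus a brute-force check of the claim; the representation is my own, and checking the six strict orderings of {A, B, C} is enough because only the ordering of utility values matters:)

from itertools import permutations

# The agent's "wants" as a successor map: from each state it chooses the next.
PREFERS = {"A": "B", "B": "C", "C": "A"}

def choose(current_state):
    # The choice is a function of the internal state...
    return PREFERS[current_state]

print([choose(s) for s in "ABC"])   # ['B', 'C', 'A']

# ...but no state-independent function U over {A, B, C} is increased by every
# move: that would need U(B) > U(A), U(C) > U(B) and U(A) > U(C) all at once.
orderings = (dict(zip("ABC", p)) for p in permutations(range(3)))
assert not any(all(u[PREFERS[s]] > u[s] for s in "ABC") for u in orderings)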
That’s not correct. For one thing, the agent may not believe what you say.
Okay. Replace “offer it a choice” with “offer it a choice, and provide sufficient Bayesian evidence that this is the choice being faced.” This doesn’t lead anywhere anyhow.
not all choices correspond to maximizing such a function—any time choices go in a circle, for instance, you’re not maximizing a function. We could imagine a very simple machine with a 3-state memory. It wants to go from A to B, and from B to C, and from C to A. Its choices are always a function of its internal state. But its choices don’t maximize a function of its internal state.
Here’s the corresponding utility function—assuming that state transitions are tied to actions.
If IAM(A) { U(A) = 0, U(B) = 1, U(C) = 0; }
If IAM(B) { U(A) = 0, U(B) = 0, U(C) = 1; }
If IAM(C) { U(A) = 1, U(B) = 0, U(C) = 0; }
Using simple maximisation algorithms (e.g. gradient descent) on that utility landscape will produce the behaviour in question. More sophisticated algorithms will do no better.
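(A minimal Python sketch of a greedy maximiser running on that table; the dict encoding and names are my own:)

# The state-conditional utility table above: one row per current state.
U = {
    "A": {"A": 0, "B": 1, "C": 0},
    "B": {"A": 0, "B": 0, "C": 1},
    "C": {"A": 1, "B": 0, "C": 0},
}

def step(current_state):
    # Greedy maximisation: move to whichever state the current row rates highest.
    return max(U[current_state], key=U[current_state].get)

state = "A"
for _ in range(6):
    state = step(state)
    print(state)   # B, C, A, B, C, A -- the cycle described above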
For one thing, the agent may not believe what you say.
Okay. Replace “offer it a choice” with “offer it a choice, and provide sufficient Bayesian evidence that this is the choice being faced.” This doesn’t lead anywhere anyhow.
Your “BartlebeyBot” agent totally ignored Bayesian evidence. By what rule does “my” example agent have to listen and respond to such evidence, while “yours” does not? Again, I don’t think your proposed counter-example is remotely convincing.
Any function of the internal state can be written out as a table with one entry per possible internal state.
You’ve given me something that’s still interesting, which is all the expected utilities.
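(To illustrate the counting point, a short Python sketch with my own variable names: a function of the internal state alone needs one entry per state, whereas the table above carries one entry per pair of current and candidate state, which is why it reads as a table of expected utilities for actions rather than a single utility function of the state:)

# A function of the 3-state internal memory alone: exactly three entries.
U_of_state = {"A": 0.0, "B": 1.0, "C": 2.0}   # illustrative values only

# The table proposed above: one entry per (current state, candidate state)
# pair, i.e. 3 x 3 = 9 entries, re-indexed by the state the agent is in.
U_conditional = {
    "A": {"A": 0, "B": 1, "C": 0},
    "B": {"A": 0, "B": 0, "C": 1},
    "C": {"A": 1, "B": 0, "C": 0},
}

print(len(U_of_state))                                   # 3
print(sum(len(row) for row in U_conditional.values()))   # 9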
By what rule does “my” example agent have to listen and respond to such evidence, while “yours” does not? Again, I don’t think your proposed counter-example is remotely convincing.
Because one maximizes a utility function, and the other just says “no” all the time.
Why do you think there’s a counter-example? Did you read the referenced Dewey paper about O-Maximisers?
Thank you for linking that again. Hm, I guess I did assume that agents could have different utilities at different timesteps. Just putting “1” for everything resolves how an O-maximizer can refuse the offer to raise its utility. But then, they assume that the tape of a Turing machine is infinite, so the cycle above is still a problem.