This definition has the important feature of restricting “Friendly AI” to designs that have a utility function.
That doesn’t seem important—for the reason described here—where it says:
Utility maximisation is a general framework which is powerful enough to model the actions of any computable agent. The actions of any computable agent—including humans—can be expressed using a utility function.
The actions of any computable agent—including humans—can be expressed using a utility function.
This is a highly questionable statement concerning humans, and the paper linked from that page doesn’t appear to prove it.
Edit: ah, this includes “functions” that anyone else would call a “stupidly complicated state machine” and which may not actually be feasible to calculate.
The term “function”—as used on the page—is a technical term with a clearly-established meaning.
Yes indeed, and the only way to fit that function to the human state machine is to include a “t” term, over the life of the human in question. Which is pretty much infeasible to calculate unless you invoke “and then a miracle occurs”.
Utility-based models are no more “infeasible to calculate” than any other model. Indeed you can convert any model of an agent into a utility-based model by an I/O-based “wrapper” of it—as described here. The idea that utility-based models of humans are more computationally intractable than other models is just wrong.
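For concreteness, here is a minimal sketch of the kind of I/O wrapper being described, in Python. All of the names are illustrative assumptions of mine, not anything from the linked page: the wrapper assigns utility 1 to whatever action the wrapped model would output and 0 to everything else, and then chooses by maximising that function.

# Sketch of the claimed I/O wrapper (illustrative names, not from the linked page).
def make_utility_based(agent_model):
    # agent_model: any function from an observation history to an action.
    def utility(history, action):
        # Utility 1 for the action the wrapped model would take, 0 otherwise.
        return 1 if action == agent_model(history) else 0
    def choose(history, possible_actions):
        # Maximise the induced utility; this reproduces agent_model exactly.
        return max(possible_actions, key=lambda a: utility(history, a))
    return utility, choose

# Example: a non-utility-based model that just echoes its last observation.
echo_model = lambda history: history[-1]
utility, choose = make_utility_based(echo_model)
assert choose(["a", "b"], ["a", "b", "c"]) == "b"

Whether this construction yields a genuinely utility-based model, or merely relabels an existing one, is exactly what the rest of this exchange disputes.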
Indeed you can convert any model of an agent into a utility-based model by an I/O-based “wrapper” of it—as described here.
You keep repeating this Texas Sharpshooter Utility Function fallacy (earlier appearances in the link you gave, and here and here) of observing what the agent does, and retrospectively labelling that with utility 1 and everything else with utility 0. And as often as you do that, I will point out it’s a fallacy. Something that can only be computed after the action is known cannot be used before the fact to choose the action.
I was talking about wrapping a model of a human—thus converting a non-utility-based model into a utility-based one. That operation is, of course, not circular. If you think the argument is circular, you haven’t grasped the intended purpose of it.
It doesn’t give you a utility-based model. A model is a structure whose parts correspond to parts of the thing modelled, and which interact in the same way as in the thing modelled. This post-hoc utility function does not correspond to anything.
What next? Label with 1 everything that happens and 0 everything that doesn’t and call that a utility-based model of the universe?
Here, I made it pretty clear from the beginning that I was starting with an existing model—and then modifying it. A model with a few bits strapped onto it is still a model.
If I stick a hamburger on my car, the car is still a car—but the hamburger plays no part in what makes it a car.
AFAICS, I never made the corresponding claim—that the utility function was part of what made the model a model.
How else can I understand your words “utility-based models”? This is no more a utility-based model than a hamburger on a car is a hamburger-based car.
Well, I would say “utilitarian”, but that word seems to be taken. I mean that the model calculates utilities associated with its possible actions—and then picks the action with the highest utility.
But that is exactly what this wrapping in a post-hoc utility function doesn’t do. The model first picks an action in whatever way it does, then labels that with utility 1.
The trouble, as usual, being that most of these descriptive utility functions are very complicated relative to the storage space we have available—they start out in the format of “one number for every possible history of the universe,” and don’t get compressed much from there.
That is not a problem. A compact utility-based description of an agent’s behaviour is only ever slightly longer than the shortest description of it available. It’s easy to show that by considering a utility-based “wrapper” around the shortest description.
That’s a good way to get effective expected utilities. But expected utilities aren’t utility functions. Hm, there may be a way to fix that that I haven’t noticed, though. But maybe not.
Your comment doesn’t seem very clear to me. Are you thinking that a “utility function” needs to have a specific domain which is not simply sensory contents and internal state? If so, do you have a reference for that notion?
I am at least claiming that in the context of designing a good AI, “utility function” should be taken to be a function of some external world, yes.
Otherwise you may run into problems. For example, you could offer to change a robot’s sensory contents and internal state to something with higher utility than its current state—and if the agent refuses, you will reset it. If we were using a “utility wrapper” model, all modeled agents would say yes. But the trivial example of an agent that always says “I would prefer not to” (BartlebyBot) demonstrates that not all agents make choices that maximize some function of their internal state.
So: the only information available to any agent is in the form of its internal state and its sensory channels. Any function it computes must have that domain (or some subset of it). Confining the agent to that domain isn’t any kind of restriction. All utility functions calculated over the state of the world necessarily correspond to other utility functions calculated over the domain of internal state and sensory input.
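One way to illustrate that last correspondence (this is my own toy rendering, not something taken from the thread's links) is to fold the agent's beliefs about the world into the function, so that a world-level utility induces a function over internal state and sensory input:

# Toy illustration (hypothetical names): a utility over world states induces a
# function over (internal state, percept) by averaging over the agent's beliefs.
def induced_utility(internal_state, percept, world_utility, belief):
    # belief(internal_state, percept) returns a dict: world state -> probability.
    posterior = belief(internal_state, percept)
    return sum(p * world_utility(w) for w, p in posterior.items())

As noted elsewhere in the thread, what this construction yields is an expected utility over the agent's percepts and memory, which is a related but not identical object to a utility function over world states.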
Your example seems wrong to me. The problem is with:
For example, you could offer to change a robot’s sensory contents and internal state to something with higher utility than its current state—and if the agent refuses, you will reset it. If we were using a “utility wrapper” model, all modeled agents would say yes.
That’s not correct. For one thing, the agent may not believe what you say.
the only information available to any agent is in the form of its internal state and its sensory channels. Any function it computes must have that domain (or some subset of it).
Good point. So any function it computes has to be some function of its internal state. However, not all choices correspond to maximizing such a function—any time choices go in a circle, for instance, you’re not maximizing a function. We could imagine a very simple machine with a 3-state memory. It wants to go from A to B, and from B to C, and from C to A. Its choices are always a function of its internal state. But its choices don’t maximize a function of its internal state.
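The cycle example can be made concrete with a short exhaustive check (the check is mine, the example is from the comment above): the machine's choice is always a function of its state, yet no single state-independent utility assignment over A, B and C makes every one of those choices a maximisation.

from itertools import permutations

states = ["A", "B", "C"]
wants = {"A": "B", "B": "C", "C": "A"}  # the cyclic choices from the comment

def reproduces_cycle(utility):
    # A fixed utility function always has the same argmax, so it cannot
    # recommend B from A, C from B and A from C all at once.
    return all(max(states, key=utility.get) == wants[s] for s in states)

assert not any(reproduces_cycle(dict(zip(states, ranks)))
               for ranks in permutations([0, 1, 2]))

One reply below sidesteps this by letting the utilities depend on the current state.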
That’s not correct. For one thing, the agent may not believe what you say.
Okay. Replace “offer it a choice” with “offer it a choice, and provide sufficient Bayesian evidence that this is the choice faced.” This doesn’t lead anywhere anyhow.
not all choices correspond to maximizing such a function—any time choices go in a circle, for instance, you’re not maximizing a function. We could imagine a very simple machine with a 3-state memory. It wants to go from A to B, and from B to C, and from C to A. Its choices are always a function of its internal state. But its choices don’t maximize a function of its internal state.
Here’s the corresponding utility function—assuming that state transitions are tied to actions.
If IAM(A) { U(A) = 0; U(B) = 1; U(C) = 0; }
If IAM(B) { U(A) = 0; U(B) = 0; U(C) = 1; }
If IAM(C) { U(A) = 1; U(B) = 0; U(C) = 0; }
Using simple maximisation algorithms (e.g. gradient descent) on that utility landscape will produce the behaviour in question. More sophisticated algorithms will do no better.
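For readers who want to run it, here is one rendering of that table in Python (the dictionary encoding is mine, and a plain argmax stands in for the maximisation algorithm):

# State-conditional utility table from the comment above: U[current][candidate].
U = {
    "A": {"A": 0, "B": 1, "C": 0},
    "B": {"A": 0, "B": 0, "C": 1},
    "C": {"A": 1, "B": 0, "C": 0},
}

def step(state):
    # Maximise the utility landscape that applies in the current state.
    return max(U[state], key=U[state].get)

trace = ["A"]
for _ in range(3):
    trace.append(step(trace[-1]))
assert trace == ["A", "B", "C", "A"]  # greedy maximisation walks the cycle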
For one thing, the agent may not believe what you say.
Okay. Replace “offer it a choice” with “offer it a choice, and provide sufficient Bayesian evidence that this is the choice faced.” This doesn’t lead anywhere anyhow.
Your “BartlebyBot” agent totally ignored Bayesian evidence. By what rule does “my” example agent have to listen and respond to such evidence, while “yours” does not? Again, I don’t think your proposed counter-example is remotely convincing.
Any function of the internal state can be expressed with a number of entries equal to the number of possible internal states.
You’ve given me something that’s still interesting, which is all the expected utilities.
By what rule does “my” example agent have to listen and respond to such evidence, while “yours” does not? Again, I don’t think your proposed counter-example is remotely convincing.
Because one maximizes a utility function, and the other just says “no” all the time.
Why do you think there’s a counter-example? Did you read the referenced Dewey paper about O-Maximisers?
Thank you for linking that again. Hm, I guess I did assume that agents could have different utilities at different timesteps. Just putting “1” for everything resolves how an O-maximizer can refuse the offer to raise its utility. But then, they assume that the tape of a Turing machine is infinite, so the cycle above is still a problem.
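As a footnote on the “different utilities at different timesteps” point, a time-indexed utility handles the cycle too. This is only my own toy encoding, not the construction from the Dewey paper:

# Toy illustration (mine, not Dewey's O-maximiser): index the utility by time
# and the A -> B -> C -> A cycle becomes maximising behaviour again.
cycle = ["A", "B", "C"]

def U(t, state):
    # At step t, utility 1 for the state the cycle visits next, 0 otherwise.
    return 1 if state == cycle[(t + 1) % 3] else 0

trace = ["A"]
for t in range(3):
    trace.append(max(cycle, key=lambda s: U(t, s)))
assert trace == ["A", "B", "C", "A"]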
Following the links, at first glance it looks like there’s an argument there that anything with computable behavior will have behavior expressible as a utility function. Is that correct?
Yes.