You can construct a set of values and a utility function to fit your observed behavior, no matter how your brain produces that behavior.
I’m deeply hesitant to jump into a debate that I don’t know the history of, but...
Isn’t it pretty generally understood that this is not true? The Utility Theory folks showed that behavior of an agent can be captured by a numerical utility function iff the agent’s preferences conform to certain axioms, and Allais and others have shown that human behavior emphatically does not.
Seems to me that if human behavior were in general able to be captured by a utility function, we wouldn’t need this website. We’d be making the best choices we could, given the information we had, to maximize our utility, by definition. In other words, “instrumental rationality” would be easy and automatic for everyone. It’s not, and it seems to me a big part of what we can do to become more rational is try and wrestle our decision-making algorithms around until the choices they make are captured by some utility function. In the meantime, the fact that we’re puzzled by things like moral dilemmas looks like a symptom of irrationality.
The Utility Theory folks showed that behavior of an agent can be captured by a numerical utility function iff the agent’s preferences conform to certain axioms, and Allais and others have shown that human behavior emphatically does not.
A person’s behavior can always be understood as optimizing a utility function; it’s just that if they are irrational (as in the Allais paradox) the utility functions start to look ridiculously complex. If all else fails, a utility function can be used that has a strong dependency on time in whatever way is required to match the observed behavior of the subject. “The subject had a strong preference for sneezing at 3:15:03pm October 8, 2011.”
From the point of view of someone who wants to get FAI to work, the important question is, if the FAI does obey the axioms required by utility theory, and you don’t obey those axioms for any simple utility function, are you better off if:
the FAI ascribes to you some mixture of possible complex utility functions and helps you to achieve that, or
the FAI uses a better explanation of your behavior, perhaps one of those alternative theories listed in the Wikipedia article, and helps you to achieve some component of that explanation?
I don’t understand the alternative theories well enough to know if the latter option even makes sense.
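A minimal sketch of this trivial counterexample might look like the following; the behavior log, timestamps, and function name are all invented for illustration:

```python
# Toy "Texas Sharpshooter" utility function: given a log of what the subject
# actually did at each timestamp, define a utility function that assigns 1 to
# exactly those (time, action) pairs and 0 to everything else. By construction,
# whatever the subject did is the utility-maximizing choice.

observed_log = {
    "2011-10-08 15:15:03": "sneeze",
    "2011-10-08 15:15:04": "reach for tissue",
}

def tsuf(time: str, action: str) -> float:
    """Post-hoc utility: 1 if the subject was observed doing `action` at `time`."""
    return 1.0 if observed_log.get(time) == action else 0.0

assert tsuf("2011-10-08 15:15:03", "sneeze") == 1.0       # what the subject did
assert tsuf("2011-10-08 15:15:03", "read a book") == 0.0  # anything else
```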
A person’s behavior can always be understood as optimizing a utility function; it’s just that if they are irrational (as in the Allais paradox) the utility functions start to look ridiculously complex. If all else fails, a utility function can be used that has a strong dependency on time in whatever way is required to match the observed behavior of the subject. “The subject had a strong preference for sneezing at 3:15:03pm October 8, 2011.”
This is the Texas Sharpshooter fallacy again. Labelling what a system does with 1 and what it does not with 0 tells you nothing about the system. It makes no predictions. It does not constrain expectation in any way. It is woo.
Woo need not look like talk of chakras and crystals and angels. It can just as easily be dressed in the clothes of science and mathematics.
This is the Texas Sharpshooter fallacy again. Labelling what a system does with 1 and what it does not with 0 tells you nothing about the system.
You say “again”, but in the cited link it’s called the “Texas Sharpshooter Utility Function”. The word “fallacy” does not appear. If you’re going to claim there’s a fallacy here, you should support that statement. Where’s the fallacy?
It makes no predictions. It does not constrain expectation in any way. It is woo.
The original claim was that human behavior does not conform to optimizing a utility function, and I offered the trivial counterexample. You’re talking like you disagree with me, but you aren’t actually doing so.
If the only goal is to predict human behavior, you can probably do it better without using a utility function. If the goal is to help someone get what they want, so far as I can tell you have to model them as though they want something, and unless there’s something relevant in that Wikipedia article about the Allais paradox that I don’t understand yet, that requires modeling them as though they have a utility function.
You’ll surely want a prior distribution over utility functions. Since they are computable functions, the usual Universal Prior works fine here, so far as I can tell. With this prior, TSUF-like utility functions aren’t going to dominate the set of utility functions consistent with the person’s behavior, but mentioning them makes it obvious that the set is not empty.
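A toy sketch of the kind of inference gestured at here, with invented candidate utility functions and made-up description lengths standing in for a real Universal Prior, might look like:

```python
# Toy sketch: weight candidate utility functions by a simplicity prior and keep
# the ones consistent with observed choices. The candidates, the "description
# length" numbers, and the choice data are all invented; a real Universal Prior
# would weight every computable function by roughly 2^(-length of its shortest program).

observed_choices = [("apple", "rock"), ("cake", "apple")]   # (chosen, rejected) pairs

candidates = [
    # (name, utility function over options, crude description-length proxy in bits)
    ("likes_food",  lambda x: {"apple": 2, "cake": 3, "rock": 0}[x], 10),
    ("likes_rocks", lambda x: {"apple": 0, "cake": 0, "rock": 5}[x], 10),
    ("tsuf_like",   lambda x: {"apple": 1, "cake": 1, "rock": 0}[x], 40),  # contrived, post hoc
]

def consistent(utility, choices):
    """A candidate survives if every observed choice (weakly) maximized it."""
    return all(utility(chosen) >= utility(rejected) for chosen, rejected in choices)

weights = {name: 2.0 ** -bits
           for name, utility, bits in candidates
           if consistent(utility, observed_choices)}
total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}
print(posterior)   # the simple consistent hypothesis dominates the TSUF-like one
```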
You’ll surely want a prior distribution over utility functions. Since they are computable functions, the usual Universal Prior works fine here, so far as I can tell. With this prior, TSUF-like utility functions aren’t going to dominate the set of utility functions consistent with the person’s behavior
How do you know this? If that’s true, it can only be true by being a mathematical theorem, which will require defining mathematically what makes a UF a TSUF. I expect this is possible, but I’ll have to think about it.
With [the universal] prior, TSUF-like utility functions aren’t going to dominate the set of utility functions consistent with the person’s behavior
How do you know this? If that’s true, it can only be true by being a mathematical theorem...
No, it’s true in the same sense that the statement “I have hands” is true. That is, it’s an informal empirical statement about the world. People can be vaguely understood as having purposeful behavior. When you put them in strange situations, this breaks down a bit and if you wish to understand them as having purposeful behavior you have to contrive the utility function a bit, but for the most part people do things for a comprehensible purpose. If TSUFs were the simplest utility functions that described humans, then human behavior would be random, which it isn’t. Thus the simplest utility functions that describe humans aren’t going to be TSUF-like.
You say “again”, but in the cited link it’s called the “Texas Sharpshooter Utility Function”. The word “fallacy” does not appear. If you’re going to claim there’s a fallacy here, you should support that statement. Where’s the fallacy?
I was referring to the same fallacy in both cases. Perhaps I should have written out TSUF in full this time. The fallacy is the one I just described: attaching a utility function post hoc to what the system does and does not do.
The original claim was that human behavior does not conform to optimizing a utility function, and I offered the trivial counterexample. You’re talking like you disagree with me, but you aren’t actually doing so.
I am disagreeing, by saying that the triviality of the counterexample is so great as to vitiate it entirely. The TSUF is not a utility function. One might as well say that a rock has a utility of 1 for just lying there and 0 for leaping into the air.
If the goal is to help someone get what they want, so far as I can tell you have to model them as though they want something
You have to model them as if they want many things, some of them being from time to time in conflict with each other. The reason for this is that they do want many things, some of them being from time to time in conflict with each other. Members of LessWrong regularly make personal posts on such matters, generally under the heading of “akrasia”, so it’s not as if I was proposing here some strange new idea of human nature. The problem of dealing with such conflicts is a regular topic here. And yet there is still a (not universal but pervasive) assumption that acting according to a utility function is the pinnacle of rational behaviour. Responding to that conundrum with TSUFs is pretty much isomorphic to the parable of the Heartstone.
I know the von Neumann-Morgenstern theorem on utility functions, but since they begin by assuming a total preference ordering on states of the world, it would be begging the question to cite it in support of human utility functions.
The fallacy is the one I just described: attaching a utility function post hoc to what the system does and does not do.
A fallacy is a false statement. (Not all false statements are fallacies; a fallacy must also be plausible enough that someone is at risk of being deceived by it, but that doesn’t matter here.) “Attaching a utility function post hoc to what the system does and does not do” is an activity. It is not a statement, so it cannot be false, and it cannot be a fallacy. You’ll have to try again if you want to make sense here.
The TSUF is not a utility function.
It is a function that maps world-states to utilities, so it is a utility function. You’ll have to try again if you want to make sense here too.
We’re nearly at the point where it’s not worth my while to listen to you because you don’t speak carefully enough. Can you do something to improve, please? Perhaps get a friend to review your posts, or write things one day and reread them the next before posting, or simply make an effort not to say things that are obviously false.
A fallacy is a false statement.

Not a pattern of an invalid argument?

Tim, lessdazed has just spoken for me. Perhaps you are not reading carefully enough.

It is a function that maps world-states to utilities, so it is a utility function.
As lessdazed has said, that is simply not what the word “fallacy” means. Neither is a utility function, in the sense of VNM, merely a function from world states to numbers; it is a function from lotteries over outcomes to numbers that satisfies their axioms. The TSUF does not satisfy those axioms. No function whose range includes 0, 1, and nothing in between can satisfy the VNM axioms. The range of a VNM utility function must be an interval of real numbers.
We’re nearly at the point where it’s not worth my while to listen to you because you

Ignored.
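The interval claim above can be seen directly from the expected-utility form; a one-line sketch, assuming the standard VNM setup with two outcomes A and B of utility 1 and 0:

```latex
% Sketch: standard VNM expected utility over lotteries, with u(A) = 1 and u(B) = 0.
% For any p in [0,1], the mixed lottery pA + (1-p)B has utility
\[
  U\bigl(p A + (1-p) B\bigr) \;=\; p\,u(A) + (1-p)\,u(B) \;=\; p ,
\]
% so every value in [0,1] is attained; the range cannot contain 0 and 1
% while skipping the values in between.
```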
A person’s behavior can always be understood as optimizing a utility function
Models relying on expected utility make extremely strong assumptions about the treatment of probabilities, with utility being strictly linear in probability, and these assumptions can very easily be demonstrated to be wrong.

They also assume that many situations are equivalent (pay $50 for a 50% chance to win $100 vs. accept $50 for a 50% chance of losing $100) where all experiments show otherwise.
Utility theory without these assumptions predicts nothing whatsoever.
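To make “very easily demonstrated to be wrong” concrete: a small check using the textbook Allais gambles (the payoffs below are the standard ones from the literature, not from this thread) shows that no assignment of utilities combined linearly with the probabilities reproduces the commonly observed pattern of preferring 1A to 1B while also preferring 2B to 2A.

```python
# The textbook Allais gambles (standard payoffs from the literature, not this thread):
#   1A: $1M for sure                    1B: 10% $5M, 89% $1M, 1% nothing
#   2A: 11% $1M, 89% nothing            2B: 10% $5M, 90% nothing
# Most subjects prefer 1A to 1B and 2B to 2A. Expected utility is linear in the
# probabilities, so 1A > 1B reduces to 0.11*u(1M) > 0.10*u(5M) + 0.01*u(0), while
# 2B > 2A reduces to the reverse strict inequality -- no utility assignment can
# satisfy both. The brute-force sweep below just confirms this numerically.

def expected_utility(lottery, u):
    """Expected utility of a lottery: sum of probability * utility of each prize."""
    return sum(p * u[prize] for p, prize in lottery)

L1A = [(1.00, "1M")]
L1B = [(0.10, "5M"), (0.89, "1M"), (0.01, "0")]
L2A = [(0.11, "1M"), (0.89, "0")]
L2B = [(0.10, "5M"), (0.90, "0")]

found = False
for v in range(0, 2001):                      # sweep u(5M) from 0.0 to 20.0 in steps of 0.01
    u = {"0": 0.0, "1M": 1.0, "5M": v / 100}  # fix u(0)=0, u(1M)=1 for the sweep;
                                              # the inequality argument above covers the general case
    if expected_utility(L1A, u) > expected_utility(L1B, u) and \
       expected_utility(L2B, u) > expected_utility(L2A, u):
        found = True
print("some linear-in-probability utility fits the common pattern:", found)  # prints False
```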
Seems to me we’ve got a gen-u-ine semantic misunderstanding on our hands here, Tim :)
My understanding of these ideas is mostly taken from reinforcement learning theory in AI (a la Sutton & Barto 1998). In general, an agent is determined by a policy pi that gives the probability that the agent will make a particular action in a particular state, P = pi(s,a). In the most general case, pi can also depend on time, and is typically quite complicated, though usually not complex ;). Any computable agent operating over any possible state and action space can be represented by some function pi, though typically folks in this field deal in Markov Decision Processes since they’re computationally tractable. More on that in the book, or in a longer post if folks are interested. It seems to me that when you say “utility function”, you’re thinking of something a lot like pi. If I’m wrong about that, please let me know.
When folks in the RL field talk about “utility functions”, generally they’ve got something a little different in mind. Some agents, but not all of them, determine their actions entirely using a time-invariant scalar function U(s) over the state space. U takes in future states of the world and outputs the reward that the agent can expect to receive upon reaching that state (loosely “how much the agent likes s”). Since each action in general leads to a range of different future states with different probabilities, you can use U(s) to get an expected utility U’(a,s):
U’(a,s) = sum_{s’} p(s,a,s’) * U(s’),
where s is the state you’re in, a is the action you take, s’ are the possible future states, and p is the probability that action a taken in state s will lead to state s’. Once your agent has a U’, some simple decision rule over that is enough to determine the agent’s policy. There are a bunch of cool things about agents that do this, one of which (not the most important) is that their behavior is much easier to predict. This is because behavior is determined entirely by U, a function over just the state space, whereas pi is over the conjunction of state and action spaces. From a limited sample of behavior, you can get a good estimate of U(s), and use this to predict future behavior, including in regions of state and action space that you’ve never actually observed. If your agent doesn’t use this cool U(s) scheme, the only general way to learn pi is to actually watch the thing behave in every possible region of action and state space. This, I think, is why von Neumann was so interested in specifying exactly when an agent could and could not be treated as a utility-maximizer.
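A heavily simplified sketch of that pipeline, with an invented three-state toy problem (the states, actions, transition probabilities, and utilities are all made up for illustration):

```python
# Tiny illustration of the U(s) -> U'(s,a) -> policy pipeline described above.

U = {"home": 0.0, "cafe": 1.0, "office": 0.5}          # utility over states, U(s)

# p[(s, a)] is a distribution over successor states s'
p = {
    ("home", "walk"):  {"cafe": 0.8, "home": 0.2},
    ("home", "drive"): {"office": 0.9, "home": 0.1},
}

def expected_utility(s, a):
    """U'(a,s) = sum over s' of p(s,a,s') * U(s')."""
    return sum(prob * U[s_next] for s_next, prob in p[(s, a)].items())

def greedy_policy(s, actions):
    """A simple decision rule over U': pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(s, a))

print(expected_utility("home", "walk"))          # 0.8*1.0 + 0.2*0.0 = 0.8
print(expected_utility("home", "drive"))         # 0.9*0.5 + 0.1*0.0 = 0.45
print(greedy_policy("home", ["walk", "drive"]))  # "walk"
```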
Hopefully that makes some sense, and doesn’t just look like an incomprehensible jargon-filled snow job. If folks are interested in this stuff I can write a longer article about it that’ll (hopefully) be a lot more clear.
Some agents, but not all of them, determine their actions entirely using a time-invariant scalar function U(s) over the state space.
If we’re talking about ascribing utility functions to humans, then the state space is the universe, right? (That is, the same universe the astronomers talk about.) In that case, the state space contains clocks, so there’s no problem with having a time-dependent utility function, since the time is already present in the domain of the utility function.
Thus, I don’t see the semantic misunderstanding—human behavior is consistent with at least one utility function even in the formalism you have in mind.
(Maybe the state space is the part of the universe outside of the decision-making apparatus of the subject. No matter, that state space contains clocks too.)
The interesting question here for me is whether any of those alternatives to having a utility function mentioned in the Allais paradox Wikipedia article are actually useful if you’re trying to help the subject get what they want. Can someone give me a clue how to raise the level of discourse enough so it’s possible to talk about that, instead of wading through trivialities? PM’ing me would be fine if you have a suggestion here but don’t want it to generate responses that will be more trivialities to wade through.
Allais did more than point out that human behavior disobeys utility theory, specifically the “Sure Thing Principle” or “Independence Axiom”. He also argued—to my mind, successfully—that there needn’t be anything irrational about violating the axiom.