I’m responding here rather than deeper in the thread. It’s not the whole response I wanted to do, which probably deserves an entire sequence, but it gets the ball rolling at least.
My whole quote was:
Preferences cannot be inconsistent, only stated preferences which are not the same thing.
Let’s break that down a bit. First of all, I’m taking “preferences” to be shorthand for “preferences over outcomes,” which is to say a qualitative ranking of possible future worlds: e.g. getting no ice cream is bad, me getting ice cream is good, everyone getting ice cream is better, etc.
To quantify this, you can assign numeric values to outcomes in order of increasing preference. Now we have a utility function that scores outcomes. But despite adding numbers, we are really just specifying a ranked ordering of outcomes based on preference: U(no ice cream) < U(I have ice cream) < U(everyone has ice cream).
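To make that concrete, here is a minimal sketch (Python, with arbitrary made-up numbers) of what scoring and ranking outcomes with a utility function looks like; the specific values don’t matter, only the ordering they induce:

```python
# Toy utility function: each outcome gets a number, and "better" just
# means "higher number". The values are arbitrary; only the ordering matters.
utility = {
    "no ice cream": 0,
    "I have ice cream": 1,
    "everyone has ice cream": 2,
}

# Ranking outcomes by preference is just sorting by utility.
ranked = sorted(utility, key=utility.get)
print(ranked)  # ['no ice cream', 'I have ice cream', 'everyone has ice cream']
```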
Now what does it mean for preferences to be inconsistent? The simplest example would be the following:
U(no ice cream) < U(I have ice cream)
U(I have ice cream) < U(everyone has ice cream)
U(everyone has ice cream) < U(no ice cream)
If you’ve followed closely so far, alarm bells should be going off. This supposed ordering is nonsensical. No ice cream is worse than just me having ice cream, which is worse than everyone having ice cream… which is worse than no ice cream at all? We’ve looped around! There’s no way to maximize this ordering in order to find the “best” outcome. All outcomes are “better” than all other outcomes, and at the same time “worse” as well.
Inconsistent utility functions (“preferences”) are nonsensical. You can’t reason about them. You can’t use them to solve problems. And when you look at them in detail it doesn’t really make sense as a preference either: how can one outcome be strictly better than another, and also strictly worse at the same time? Make up your mind already!
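One way to see that no scoring can rescue such an ordering: ask whether any assignment of numbers satisfies all the stated strict inequalities at once. That is exactly asking whether the “worse than” graph has a cycle. A minimal sketch (Python, illustrative only, with the ice cream example hard-coded):

```python
# Each pair (a, b) means "a is strictly worse than b", i.e. we need U(a) < U(b).
stated = [
    ("no ice cream", "I have ice cream"),
    ("I have ice cream", "everyone has ice cream"),
    ("everyone has ice cream", "no ice cream"),
]

def consistent_assignment_exists(pairs):
    """True iff some assignment of numbers satisfies every U(a) < U(b).
    Equivalent to the 'worse than' graph being acyclic, so we look for a
    cycle with a depth-first search."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])

    visiting, done = set(), set()

    def has_cycle(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and has_cycle(nxt)):
                return True
        visiting.remove(node)
        done.add(node)
        return False

    return not any(has_cycle(n) for n in graph if n not in done)

print(consistent_assignment_exists(stated))  # False: the loop makes it impossible
```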
But of course when we consider a real problem like, say, abortion rights, our preferences feel conflicted. I would argue this is just what it feels like on the inside to have a weighted utility function, with some terms positive and some terms negative. If we consider the case of abortion, we may feel bad for the terminated potential human represented by the fetus, but feel good about the woman’s control over her own body or the prevention of an unhappy childhood stemming from an unwanted pregnancy. We also know that every real abortion circumstance is a complex jumble of a bunch of other factors, far too many to write here or even consider in one sitting in real life. So what we end up feeling is:
(positive term) + (negative term) + (...ugh...)
Most people seek reasons to ignore either the positive or negative terms so they can feel good or righteous about themselves for voting Blue or Green. But I would argue that in an issue which feels unresolved and conflicted, the term which dominates in magnitude is really the …ugh… term. We can feel preferences for and against, but we are unable to weigh the balance because of the gigantic unresolved pool of …ugh… we have yet to wade into.
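To spell out why the …ugh… term dominates: if the magnitude the unexamined term could take is larger than the gap between the terms you have resolved, the sign of the total is simply undetermined. A toy sketch (Python, invented numbers purely for illustration):

```python
# Resolved terms of a toy weighted utility, plus bounds on the part we have
# not examined yet (the "...ugh..." term).
positive_term = +3.0                 # e.g. bodily autonomy, avoiding an unwanted childhood
negative_term = -2.0                 # e.g. loss of the potential person
ugh_lower, ugh_upper = -10.0, +10.0  # everything we haven't waded into yet

resolved = positive_term + negative_term
best_case = resolved + ugh_upper
worst_case = resolved + ugh_lower

# The total can land on either side of zero, so no net preference is felt
# until the ...ugh... pool is explored enough to narrow those bounds.
print(worst_case, best_case)  # -9.0 11.0 -> genuinely unresolved
```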
What this feels like from the inside is that your preferences are inconsistent: you feel a simultaneous pull towards and against multiple outcomes, without a balance of weight to one side or the other.
And yes, as you mentioned in another comment, what I think you should do is delve deeper and explore that …ugh… field. You don’t have to uncover literally every piece of evidence, but you do have to keep at it until it is balanced by the revealed positive and negative terms, at which point you will no longer feel so conflicted and your preferences will be clear.
Now inconsistent preferences often show up when considering things like virtue ethics: “Stealing is wrong. We want to be good people, so we should never steal.” But what if I need to steal bread to feed my starving family? Providing for my family is also virtuous, is it not? “Inconsistency!” you shout.
No. I defy that notion. Nobody is born with a strong moral instinct against such a complex social virtue as a prohibition of theft, which requires a concept of other minds, a notion of property, and parent values of fairness and equality. At best you are perhaps born with a spectrum of hedonic likes and dislikes, a few social instincts, and, based on child development trajectories, maybe also some instinctual values relating to fairness and social status. As you grow up you develop and/or have imprinted upon you some culturally transmitted terminal values, as well as the non-linear weights on these terminal values which make up your own personal utility function. Since you have to make choices day-to-day in which you evaluate outcomes, you also develop a set of heuristics for maximizing your utility function: these are the actual reasons you have for the things you do, and include both terminal and instrumental values as well as functional heuristics. To an even smaller extent you also have a meta-level awareness of these goals and heuristics, which upon request you might translate into words. These are your stated preferences.
In circumstances that you are likely to encounter as a rationalist (e.g., no cult reprogramming), seeking reflective equilibrium should cause your instrumental goals and functional heuristics to more closely match your underlying utility function without changing the weights, thereby not altering your actual preferences, even if you do occasionally change your stated preferences as a result.
This comment has already gotten super long, and I’ve run out of time. There’s more I wanted to say, on a more mathematical and AI-oriented basis, about how seeking reflective equilibrium must always create instrumental values which reflect terminal values that are stable under reflection. But that would be a post equally long and more math-heavy...
You are talking, here, about preferences that are intransitive.
The von Neumann–Morgenstern utility theorem specifies four axioms which an agent’s preferences must conform to, in order for said preferences to be formalizable as a utility function. Transitivity of preferences is one of these axioms.
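For a finite set of outcomes, the ordering axioms can be checked mechanically. A minimal sketch (Python, illustrative only; it covers just completeness and transitivity, since independence and continuity are statements about lotteries and need more machinery):

```python
from itertools import permutations, product

def is_complete(outcomes, prefers):
    """Completeness: for every pair, at least one direction of weak preference holds."""
    return all(prefers(a, b) or prefers(b, a) for a, b in permutations(outcomes, 2))

def is_transitive(outcomes, prefers):
    """Transitivity: if a is weakly preferred to b and b to c, then a to c."""
    return all(prefers(a, c) or not (prefers(a, b) and prefers(b, c))
               for a, b, c in product(outcomes, repeat=3))

# The weak preference induced by any utility function passes both checks.
utility = {"no ice cream": 0, "I have ice cream": 1, "everyone has ice cream": 2}
weakly_prefers = lambda a, b: utility[a] >= utility[b]
outcomes = list(utility)
print(is_complete(outcomes, weakly_prefers), is_transitive(outcomes, weakly_prefers))  # True True
```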
However, the VNM theorem is just a formal mathematical result: it says that if, and only if, an agent’s preferences comply with these four axioms, then there exists (up to positive affine transformation) a utility function which describes these preferences.
The axioms are often described as rules that a “rational agent” must comply with, or as being axioms of “rationality”, etc., but this is a tendentious phrasing—one which is in no way implicit in the theorem (which, again, is only a formally proved result in mathematics), nor presupposed by the theorem. Whether compliance with the VNM axioms is normative (or, equivalently, whether it constitutes, or is required by, “rationality”) is thus an open question.
(Note that whether the actual preferences of existing agents (i.e., humans) comply with the VNM axioms is not an open question—we know that they do not.)
It may interest you to know that, of the four VNM axioms, transitivity is one which I (like you) find intuitively and obviously normative. I cannot see any good reason to have preferences that are intransitive upon reflection; this would be clearly irrational.
But there are three other axioms: independence, continuity, and completeness. I do not find any of those three to be obviously normative. In fact, there are good reasons to reject each of the three. And my actual preferences do indeed violate at least the independence and continuity axioms.
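For concreteness, the textbook illustration of an independence violation is the Allais paradox; the sketch below (Python, with the standard numbers from the literature, not a claim about this commenter’s own preferences) lays out the lotteries involved. The common pattern of choosing A over B while also choosing D over C cannot be produced by any expected-utility maximizer, no matter what utilities it assigns to the three payoffs:

```python
# Allais paradox lotteries, written as lists of (probability, payoff) pairs.
A = [(1.00, 1_000_000)]                               # a sure $1M
B = [(0.10, 5_000_000), (0.89, 1_000_000), (0.01, 0)]
C = [(0.11, 1_000_000), (0.89, 0)]
D = [(0.10, 5_000_000), (0.90, 0)]

def expected_value(lottery):
    return sum(p * x for p, x in lottery)

# B and D are each built from A and C by mixing in the *same* third lottery,
# so the independence axiom says A vs. B and C vs. D must be ranked the same way.
# Choosing A over B forces 0.11*u($1M) > 0.10*u($5M) + 0.01*u($0),
# while choosing D over C forces the reverse inequality -- a contradiction.
for name, lottery in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, expected_value(lottery))
```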
If you search through my comment history, you will find discussions of this topic dating back many years (the earliest, I think, would have been around 2011; the most recent, only a few months ago). My opinion has not materially shifted, over this period; in other words, my views on this have been stable under reflection.
Thus we have the situation I have been describing: my preferences are “inconsistent” in a certain formal sense (namely, they are not VNM-compliant), and thus cannot be represented with a utility function. This property of my preferences is stable under reflection, and furthermore, I endorse it as normative.
P.S.: There are certain other things in your comment which I disagree with, but, as far as I can tell, all are immaterial to the central point, so I am ignoring them.
Note that whether the actual preferences of existing agents (i.e., humans) comply with the VNM axioms is not an open question—we know that they do not.
I defy the data. Give me a hard example please, or I don’t think there’s much benefit to continuing this.
Certainly I can do this (in fact, you can find several examples yourself by, as I said, looking through my comment history—but yes, I’m willing to dig them up for you).
But before I do, let me ask: what sorts of examples will satisfy you? After all, suppose I provide an example; you could then say: “ah, but actually this is not a VNM axiom violation, because these are not your real preferences—if you thought about it rationally, you would conclude that your real preferences should instead be so-and-so” (in a manner similar to what you wrote in your earlier comment). Then suppose I say “nope; I am unconvinced; these are definitely my real preferences and I refuse to budge on this—my preferences are not up for grabs, no matter what reasoning you adduce”. Then what? Would you, in such a case, accept my example as an existence proof of my claim? Or would you continue to defy the data?
Well, I don’t know how I would react without seeing it, which is why I’m asking. But yes, my better-odds expectation is that it will only be apparently inconsistent, and we’d either be able to unravel the real underlying terminal values or convincingly show that the ramifications of the resulting inconsistency are not compatible with your preferences. If you think that’d be a waste of your time, you’re free not to continue with this, with no assumed fault of course.
Well, let’s say this: I will take some time (when I can, sometime within the next few days) to find some of the comments in question, but if it turns out that you do think that none of the claimed examples are sufficient, then I make no promises about engaging with the proposed “unraveling of real underlying terminal values” or what have you—that part I do think is unlikely to be productive (simply because there is usually not much to say in response to “no, these really are my preferences, despite any of these so-called ‘contradictions’, ‘incompatibilities’, ‘inconsistencies’, etc.”—in other words, preferences are, generally, prior to everything else[1]).
In the meantime, however, you might consider (for your own interest, if nothing else) looking into the existing (and quite considerable) literature on VNM axiom violations in the actual preferences of real-world humans. (The Wikipedia page on the VNM theorem should be a good place to start chasing links and citations for this.)
This, of course, avoids the issue of higher-order preferences, which I acknowledge is an important complicating factor, but which I think ought to be dealt with as a special case, and with full awareness of what exactly is being dealt with. (Robin Hanson’s curve-fitting approach is the best framework I’ve seen for thinking about this sort of thing.)
You are proving that if preferences are well-defined, they also need to be consistent.
What does it feel like from the inside to have badly defined preferences? Presumably it feels like sometimes being unable to make decisions, which you report is the case.
You can’t prove that preferences are consistent without first proving they are well defined.