Utility is unintuitive
EDIT: My original post was wrong. I will leave it quoted at the end for the purposes of preserving information, but it is now replaced with a new post that correctly expresses my sentiments. The original title of this post was “expected utility maximization is not rational”.
As many people are probably aware, there is a theorem, called the Von Neumann-Morgenstern utility theorem, which states that anyone expressing consistent preferences must be maximizing the expected value of some function. The definition of consistent preferences is as follows:
Let A, B, and C be probability distributions over outcomes. Let A < B denote that B is preferred to A, and A = B denote that someone is indifferent between A and B. Then we assume
Either A < B, A > B, or A = B. In other words, you have to express a preference. This is reasonable because in the real world, you always have to make a decision (even “lack of action” is a decision).
If A < B, and B < C, then A < C. I believe that this is also clearly reasonable. If you have three possible actions, leading to distributions over outcomes A, B, and C, then you have to choose one of the three, meaning one of them is always preferred. So you can’t have cycles of preferences.
If A < B, then (1-x)A+xC < B for some x in (0,1) that is allowed to depend on A, B, and C. In other words, if B is preferred to A then B is also preferred to sufficiently small changes to A.
If A < B then pA+(1-p)C < pB+(1-p)C for all p in (0,1). This is the least intuitive of the four axioms to me, and the one that I initially disagreed with. But I believe that you can argue in favor of it as follows: I flip a coin with weight p, and draw from X if it comes up heads and from C if it comes up tails. I let you choose whether you want X to be A or B. It seems clear that if you prefer B to A, then you should choose B in this situation. However, I have not thought about this long enough to be completely sure that this is the case. Most other people seem to also think this is a reasonable axiom, so I’m going to stick with it for now.
Given these axioms, we can show that there exists a real-valued function u over outcomes such that A < B if and only if E_A[u] < E_B[u], where E_X is the expected value with respect to the distribution X.
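To make the statement concrete, here is a minimal sketch of comparing two lotteries by expected utility. The outcome names, the utility values, and the helper names `expected_utility` and `prefers` are all made up for illustration; the theorem only says that some such u exists, not what it looks like.

```python
from fractions import Fraction

def expected_utility(lottery, u):
    """Expected value of u under a lottery given as {outcome: probability}."""
    return sum(p * u[outcome] for outcome, p in lottery.items())

def prefers(a, b, u):
    """True when lottery b is strictly preferred to lottery a, i.e. a < b."""
    return expected_utility(a, u) < expected_utility(b, u)

# Hypothetical outcomes and utilities, chosen only for illustration.
u = {"bad": Fraction(0), "ok": Fraction(6), "great": Fraction(10)}

A = {"bad": Fraction(1, 2), "great": Fraction(1, 2)}  # E_A[u] = 5
B = {"ok": Fraction(1)}                               # E_B[u] = 6

print(expected_utility(A, u), expected_utility(B, u))
print(prefers(A, B, u))
```

Exact fractions are used so that comparisons are not affected by floating-point rounding.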
Now, the important thing to note here is that this is an existence proof only. The function u doesn’t have to look at all reasonable; it merely assigns a value to every possible outcome. In particular, even if E1 and E2 seem like completely unrelated events, there is no reason as far as I can tell why u([E1 and E2]) has to have anything to do with u(E1)+u(E2), for instance. Among other things, u is only defined up to an additive constant, and so not only is there no reason for this additivity to be true, it will be completely false for almost all possible utility functions, *even if you keep the person whose utility you are considering fixed*.
In particular, it seems ridiculous that we would worry about an outcome that only occurs with probability 10^-100. What this actually means is that our utility function is always much smaller than 10^100, or rather that the ratio of the difference in utility between trivially small changes in outcome and arbitrarily large changes in outcome is always much larger than 10^-100. This is how to avoid issues like Pascal’s mugging, even in the least convenient possible world (since utility is an abstract construction, no universe can “make” a utility function become unbounded).
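A rough sketch of the bounded-utility argument, assuming utility is confined to an interval with upper end u_max. All numbers and the function names `max_gain_from_longshot` and `should_pay` are hypothetical. The point is only that a long shot with win probability p can raise expected utility by at most p * (u_max - u_now), no matter what payoff is promised.

```python
from fractions import Fraction

def max_gain_from_longshot(p_win, u_now, u_max):
    """Largest possible increase in expected utility from a gamble that
    pays off with probability p_win, when utility never exceeds u_max."""
    return p_win * (u_max - u_now)

def should_pay(p_win, cost_in_utility, u_now, u_max):
    """Pay only if even the best conceivable payoff outweighs the sure cost."""
    return max_gain_from_longshot(p_win, u_now, u_max) > cost_in_utility

# Hypothetical scale: utility lives in [0, 1] and we currently sit at 1/2.
u_max = Fraction(1)
u_now = Fraction(1, 2)
cost = Fraction(1, 10**9)    # assumed utility lost by handing over a small sum
p_win = Fraction(1, 10**100)  # probability that the mugger's claim is true

print(should_pay(p_win, cost, u_now, u_max))
```

Here the sure cost is 10^-9 of the full utility range, far larger than 10^-100, so the offer is refused whatever the mugger promises.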
What this means in particular is that saying that someone must maximize expected utility to be rational is not very productive. In particular, unless the other person has a sufficiently good technical grasp of what this means, they will probably do the wrong thing. Also, unless *you* have a good technical grasp of what it means, something that appears to violate expected utility might not. Remember, because utility is an artificial construct that has no reason to look reasonable, someone with completely reasonable preferences could have a very weird-*looking* utility function. Instead of telling people to maximize expected utility, we should identify which of the four above axioms they are violating, then explain why they are being irrational (or, if the purpose is to educate in advance, explain to them why the four axioms above should be respected). [Note however that just because a perfectly rational person *always* satisfies the above axioms, doesn’t mean that you will be better off if you satisfy the above axioms more often. Your preferences might have a complicated cycle that you are unsure how to correctly resolve. Picking a resolution at random is unlikely to be a good idea.]
Now, utility is this weird function that we don’t understand at all. Then why does it seem like there’s something called utility that **both** fits our intuitions and that people should be maximizing? The answer is that in many cases utility *can* be equated with something like money + risk aversion. The reason is the law of large numbers, formalized through various bounds such as Hoeffding’s inequality and the Chernoff bound, as well as more powerful arguments like concentration of measure. What these arguments say is that if you have a large number of random variables that are sufficiently uncorrelated and that have sufficiently small standard deviation relative to the mean, then with high probability their sum is very close to their expected sum. So when our variables all have means that are reasonably close to each other (as is the case for most everyday events), we can say something like the total *monetary* value of our combined actions will be very close to the sum of the expected monetary values of our individual actions (and likewise for other quantities like time). So in situations where, e.g., your goal is to spend as little time on undesirable work as possible, you want to minimize expected time spent on undesirable work, **as a heuristic that holds in most practical cases**. While this might make it *look* like your utility function is time in this case, I believe that the resemblance is purely coincidental, and you certainly shouldn’t be willing to make very low-success-rate gambles with large time payoffs.
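As a rough numerical illustration of the concentration argument, the sketch below evaluates Hoeffding’s inequality, which bounds P(|S - E[S]| >= t) by 2 exp(-2 t^2 / (n w^2)) for a sum S of n independent variables that each lie in an interval of width w. Full independence is assumed here, which is stronger than “sufficiently uncorrelated”, and the dollar figures and the name `hoeffding_bound` are made up.

```python
import math

def hoeffding_bound(n, width, t):
    """Upper bound on P(|S - E[S]| >= t) for a sum S of n independent
    variables, each confined to an interval of length `width`."""
    return min(1.0, 2.0 * math.exp(-2.0 * t**2 / (n * width**2)))

# 10,000 hypothetical everyday decisions, each worth between $0 and $1.
n = 10_000
many_small = hoeffding_bound(n, 1.0, 500.0)

# One hypothetical gamble whose payoff can range over $10,000.
one_big = hoeffding_bound(1, 10_000.0, 500.0)

print(many_small)  # tiny: the total is almost surely within $500 of its mean
print(one_big)     # 1.0: the bound says nothing about a single large gamble
```

The same $500 deviation is essentially impossible for the sum of many small bets, while for the single large gamble the bound is vacuous, which matches the point that the heuristic only holds in the everyday regime.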
Old post:
I’m posting this to the discussion because I don’t plan to make a detailed argument, mainly because I think this point should be extremely clear, even though many people on LessWrong seem to disagree with me.
Maximizing expected utility is not a terminal goal, it is a useful heuristic. To see why always maximizing expected utility is clearly bad, consider an action A with a 10^-10 chance of giving you 10^100 units of utility, and a 1-10^-10 chance of losing you 10^10 units of utility. Then expected utility maximization requires you to perform A, even though it is obviously a bad idea. I believe this has been discussed here previously as Pascal’s mugging.
For some reason, this didn’t lead everyone to the obvious conclusion that maximizing expected utility is the wrong thing to do, so I’m going to try to dissolve the issue by looking at why we would want to maximize expected utility in most situations. I think once this is accomplished it will be obvious why there is no particular reason to maximize expected utility for very low-probability events (in fact, one might consider having a utility function over probability distributions rather than actual states of the world).
The reason that you normally want to maximize expected utility is because of the law of large numbers, formalized through various bounds such as Hoeffding’s inequality and the Chernoff bound, as well as more powerful arguments like concentration of measure. What these arguments say is that if you have a large number of random variables that are sufficiently uncorrelated and that have sufficiently small variance relative to the mean, then with high probability their sum is very close to their expected sum. Thus for events with probabilities that are bounded away from 0 and 1 you always expect your utility to be very close to your expected utility, and should therefore maximize expected utility in order to maximize actual utility. But once the probabilities get small (or the events correlated, e.g. you are about to make an irreversible decision), these bounds no longer hold and the reasons for maximizing expected utility vanish. You should instead consider what sort of distribution over outcomes you find desirable.
I don’t think you understand what the word utility means. In particular utility is not linear in money. If the action had a 10^-10 chance of giving you $10^100, and a 1-10^-10 chance of losing you $10^10, you would be correct. That’s because you exponentially discount the value of large amounts of money. However, utility is defined to already take the exponential discounting into account.
Unfortunately, your brain is wired to exponentially discount. Even though the utility values have already taken this into account, your intuition doesn’t realize this and wants to exponentially discount again.
Another way to see what’s going on is that your intuition is getting confused by the large numbers (after all 10^100 doesn’t look much bigger than 10^10). Since you didn’t specify what units you were measuring utility in, let’s rescale them by 10^10 and see what your statement looks like:
consider an action A with a 10^-10 chance of giving you 10^90 units of utility, and a 1-10^-10 chance of losing you 1 unit of utility.
Now it should hopefully be clearer why you do indeed want to perform action A.
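The arithmetic behind the rescaling can be checked with exact fractions. This is only a sketch: the function name `expected_utility` and the variable names are illustrative, and the numbers are the ones from the quoted action A.

```python
from fractions import Fraction

def expected_utility(p_win, gain, loss):
    """Expected utility of winning `gain` with probability p_win
    and otherwise losing `loss`."""
    return p_win * gain - (1 - p_win) * loss

p_win = Fraction(1, 10**10)

# Original units: win 10^100 utils, otherwise lose 10^10 utils.
eu = expected_utility(p_win, 10**100, 10**10)

# Rescaled units: divide every utility by 10^10.
scale = 10**10
eu_rescaled = expected_utility(p_win, Fraction(10**100, scale), Fraction(10**10, scale))

print(eu > 0, eu_rescaled > 0)
print(eu_rescaled * scale == eu)
```

Dividing every utility by the same positive constant divides the expected utility by that constant, so the sign of the comparison cannot change.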
I used to think I understood this stuff, but now jsteinhardt has me confused. Could you, or someone else familiar with economic orthodoxy, please tell me whether the following is a correct summary of the official position?
A lottery ticket offers one chance in a thousand to win a prize of $1,000,000. The ticket has an expected value of $1000. If you turn down a chance to purchase such a ticket for $900 you are said to be money risk averse.
A rational person can be money risk averse.
The “explanation” for this risk aversion in a rational person is that the person judges that money has decreasing marginal utility with wealth. That is, the person (rationally) judges that $1,000,000 is not 1000 times as good (useful) as $1000. An extra dollar means less to a rich man, than to a poor man.
This shifting relationship between money and utility can be expressed by a “utility function”. For example, it may be the case for this particular rational individual that one util corresponds to $1. But $1000 corresponds to 800 utils and $1,000,000 corresponds to 640,000 utils.
And the rationality of not buying the lottery ticket can be seen by considering the transaction in utility units. The ticket costs 800 utils, but the expected utility of the ticket is only 640 utils. A rational, expected utility maximizing agent will not play this lottery.
ETA: One thing I forgot to insert at this point. How do we create a utility function for an agent? I.e. how do we know that $1,000,000 is only worth 640,000 utils to him. We do so by offering a lottery ticket paying $1,000,000 and then adjusting the odds until he is willing to pay $1 (equal to 1 util by definition) for the ticket. In this case, he buys the ticket when the odds improve to 640,000 to 1.
Now imagine a lottery paying 1,000,000 utils, again with 0.001 probability of winning. The ticket costs 900 utils. An agent who turns down the chance to buy this ticket could be called utility risk averse.
An agent who is utility risk averse is irrational. By definition. Money risk aversion can be rational, but that is explained by diminishing utility of money. There is no such thing as diminishing utility of utility.
That is my understanding of the orthodox position. Now, the question that jsteinhardt asks is whether it is not time to challenge that orthodoxy. In effect, he is asking us to change our definition of “rational”. (It is obvious, of course, that humans are not always “rational” by this definition—it is even true that they have biases which make them systematically deviate from rationality, for reasons which seem reasonable to them. But this, by itself, is not reason to change our definition of “rationality”.)
Recall that the way we rationalized away money risk aversion was to claim that money units become less useful as our wealth increases. Is there some rationalization which shows that utility units become less pleasing as happiness increases? Strikes me as a question worth looking into.
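The two lotteries in the comment above can be worked through as follows. This is a sketch that takes the comment’s own figures at face value, including the 800-util ticket cost and the $1,000,000 = 640,000 utils conversion; the variable names are illustrative.

```python
from fractions import Fraction

p_win = Fraction(1, 1000)

# Money lottery: $1,000,000 prize, valued at 640,000 utils by this agent.
expected_dollars = p_win * 1_000_000      # expected value in dollars
ticket_value_utils = p_win * 640_000      # expected value in utils
ticket_cost_utils = 800                   # the comment's figure for the ticket price
buys_money_ticket = ticket_value_utils > ticket_cost_utils

# Util lottery: prize of 1,000,000 utils, ticket costs 900 utils.
util_ticket_value = p_win * 1_000_000
util_ticket_cost = 900
buys_util_ticket = util_ticket_value > util_ticket_cost

print(expected_dollars, ticket_value_utils, buys_money_ticket)
print(util_ticket_value, buys_util_ticket)
```

The money ticket is declined because 640 utils is less than 800 utils, while the util ticket is accepted because 1000 utils exceeds 900 utils. That is the sense in which turning down the second ticket would be irrational by definition.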
If we define a utility function the way you recommend (I don’t know whether that is the standard way, but it seems reasonable), then you’re just not ever going to have utility risk averse individuals. By definition.
If a lottery pays 1M utils with 0.001 probability of winning, and the ticket costs 900 utils, an agent just wouldn’t turn it down. If the agent did turn it down, this means that the lottery wasn’t actually worth 1M utils, but less, because that’s how we determine how much the lottery is worth in the first place.
It is, however, possible that the utility function is bounded and can never reach 1M utils. This, I think, may lead to some confusion here: in that case, the agent would turn down a lottery with a ticket price of 1000 and a probability of winning of 0.1%, no matter the payoff. This seems to imply that he turns down the 1M lottery, but it isn’t irrational in this case.
I’m really enjoying the contrast between your comment and mine.
It’s not every day that the same comment can elicit “By definition, this just can’t be true of anyone” and “Yeah, I think this is true of me.”
Yeah, the utility lottery is a bizarre lottery. For one thing, even if it’s only conducted in monetary payoffs, both the price of the ticket and the amount of money you win depend on your overall well-being. In particular, if you’re on the edge of starvation, the ticket would become close to (but not quite) free.
I can’t imagine how it could be conducted in monetary payoffs, at least without a restrictive upper bound. Not only does the added utility of money decrease with scale, but you can only get so much utility out of money in a finite economy.
I’d be a bit surprised if, outside a certain range, utilons can be described as a function of money at all.
That’s the issue of the usefulness of the Axiom of Independence—I believe.
You can drop that—though you are still usually left with expected utility maximisation.
Then you become a money pump.
It is the most commonly dropped axiom. Dropping it has the advantage of allowing you to use the framework to model a wider range of intelligent agents, increasing the scope of the model.
What is the issue? Where, in my account, does AoI come into play? And why do you suggest that AoI only sometimes makes a difference?
My comments about independence were triggered by:
The independence axiom says “no”—I think—though it is “just” an axiom.
For the last question, if you drop axioms you are still usually left with expected utility maximisation—though it depends on exactly how much you drop at once. Maybe it will just be utility maximisation that is left—for example.
Well, for what it’s worth: if I’m living my life in the 1000-utils range and am content there, and I have previously lived in the 100-utils range and it really really really sucked, I think I’d turn down that ticket.
That is to say, under some scenarios I am utility risk averse.
I’m not exactly sure where to go from there, though.
Having 1000 utils is by definition exactly 10 times better than having 100 utils. If this relationship does not hold, you are talking about something other than utils.
I don’t see how that is a response to what I said, so I am probably missing your point entirely. If you’re in the mood to back up some inferential steps, I might appreciate it.
Utils aren’t just another form of currency. They’re a currency that we’ve adjusted the value of so that we’re exactly twice as happy with two utils as with one. Hence if the 1000 utils range is contentment, the 100 utils range can’t possibly really really suck; instead it’s exactly 10 times worse than contentment.
Thinking about them as a currency is in general misleading, since they’re not fungible—what is worth 1 util to you isn’t necessarily worth 1 util to anyone else.
I understand that part, as far as it goes.
So, by definition, if I’m content with 1000 utils, then given 100 utils I’m a tenth of content. (Cononet?) And being cononet is, again by definition, not bad enough that a 99% chance of it is something I could rationally choose to avoid, even if it meant giving up a 1% chance at being gloriously conthousandt.
Well, OK, if y’all say so. But I don’t understand where those quantitative judgments are coming from.
Put another way: if the range I’m in right now is 1000 utils, and I want to estimate what range I was in during the month after my stroke, when things really really sucked… a state that I would not willingly accept a 99% chance of returning to… well, how do I estimate that?
I mean, I understand that 100 is too high a number, though I don’t know how you calculated that, but what is the right number? 10? 1? −100? −10000? What equations apply?
Imagine this bet: If you win, you’ll get to a point that is twice as good as the one you’re at right now: 2000 utils. If you lose, you’ll be at a point that sucked as much as that post-stroke month. What would the probability of winning have to be for you to be indifferent to this bet?
The utility of the awful state is then x in the equation 2000P(w) + x(1 - P(w)) = 1000, where P(w) is the probability of winning.
If x were 100, the bet would be worth taking if it offered odds better than 9 in 19. If you wouldn’t take that bet, x is lower. On this scale, I suspect your utility for x is very, very far below 0.
OK… that question at least makes sense to me. Thank you.
Hm. Would I take, say, a 50% bet along those lines? I flip a coin, heads I have another stroke, tails things suddenly get as much better than they are now as now is better than then. Nope, I don’t take that bet.
A 25% chance? Hm.
No, I don’t take that bet.
A 5% chance? Hmmmmmmmm… that is tempting. And I’ll probably always wonder what-if. But no, I don’t think I take that bet.
A 1% chance? Yeah, OK. I probably take that bet.
So P(w) is somewhere between .01 and .05; probably closer to .01. Call it .015.
I think what I’ve probably just demonstrated is that I’m subject to cognitive biases involving small percentage chances… I think my mind is just rounding 1% to “close enough to zero as makes no difference.”
But, OK, I guess utils can measure biased preferences as well as rational ones. All that matters is that it’s my preference, right, not why it’s my preference.
So, all right. 2000P(w) + x(1 - P(w)) = 1000 ⇒ 2000(.015) + x(1 - .015) = 1000 ⇒ 30 + .985x = 1000 ⇒ x=(1000-30)/.985 = ~985.
OK, cool. So my current condition is 1000 util, and my stroke condition (which really really sucks) is 985 utils.
What does that tell us?
You got your numbers flipped. P(w) is your chance of winning. You want
2000(.985) + x(1 - .985) = 1000 ⇒ 1970 + .015x = 1000 ⇒ x = (1000-1970)/.015 = −64,666.66...
That tells you that you really don’t want to have another stroke. Which is hopefully unsurprising.
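The indifference equation 2000P(w) + x(1 - P(w)) = 1000 can be solved for x directly. The sketch below does that with exact fractions; the function name `utility_of_bad_state` and its parameter names are hypothetical.

```python
from fractions import Fraction

def utility_of_bad_state(p_win, u_now=1000, u_win=2000):
    """Solve u_win * p + x * (1 - p) = u_now for x, the utility of losing."""
    p = Fraction(p_win)
    return (u_now - u_win * p) / (1 - p)

# Sanity check from the thread: x = 100 corresponds to odds of 9 in 19.
print(utility_of_bad_state(Fraction(9, 19)))

# Correct reading: P(w) = .985 is the chance of winning.
print(float(utility_of_bad_state("0.985")))

# Flipped reading: .015 mistakenly used as the chance of winning.
print(float(utility_of_bad_state("0.015")))
```

Passing the probabilities as strings lets `Fraction` parse them exactly, so the two readings of the numbers can be compared without rounding noise.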
Ah! That makes far more sense. (That number seemed really implausible, but after triple-checking my math I shrugged my shoulders and went with it.)
OK. So, it no longer seems nearly so plausible that I’d turn down the original bet… it really does help me to have something concrete to attach these numbers to. Thanks.
And, yeah, that is profoundly unsurprising.
No, wait. Thinking about this some more, I realize I’m being goofy.
You offered me a series of bets about “twice as good as the one you’re at right now: 2000 utils” vs “a point that sucked as much as that post-stroke month”. I interpreted that as “I have another stroke” vs. “things suddenly get as much better than they are now as now is better than then” and evaluated those bets based on that interpretation.
But that was a false interpretation, and my results are internally inconsistent. If how-things-were-then is −64.5K, then 2000 is not as much better than they are now as now is better than then… they are merely 1/65th better. In which case I don’t accept that bet, after all… a 1% chance of another stroke vs a 99% chance of a 1/65th improvement in my life is not nearly as compelling.
More generally, I accepted the initial statement that the state we labeled 2000 is “twice as good as” the state we labeled 1000, because that seemed to make sense when we were talking about numbers. But now that I’m trying to actually map those numbers to something, it’s less clear to me that it makes sense.
I mean, it follows that my stroke was “-64 times worse” than how things are now, and… well, what does that even mean?
Sorry… I’m not trying to be a pedant here, I’m just trying to make sure I actually understand what we’re talking about, and it’s pretty clear that I don’t.
Yeah, the notion of “twice as good as things are now” doesn’t actually make sense, because utility is only defined up to affine transformations. (That is, if you decided to raise your utility for every outcome by 1000, you’d make the same decisions afterward as you did before; it’s the relative distances that matter, not the scaling or the place you call 0. It’s rather like the Fahrenheit and Celsius scales for temperature.)
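A small sketch of the affine-invariance point, using made-up utilities (1000 for now, 2000 for the awesome scenario, -70,000 for a stroke) and a made-up transformation 3u + 1000. The decision between keeping the status quo and taking the bet is unchanged, while the ratio that “twice as good” would refer to is not.

```python
from fractions import Fraction

def expected_utility(lottery, u):
    """Expected value of the function u under {outcome: probability}."""
    return sum(p * u(outcome) for outcome, p in lottery.items())

# Hypothetical utilities, for illustration only.
TABLE = {"now": 1000, "awesome": 2000, "stroke": -70_000}

def u(outcome):
    return Fraction(TABLE[outcome])

def u_affine(outcome):
    # A positive affine transformation: rescale by 3, shift by 1000.
    return 3 * u(outcome) + 1000

keep = {"now": Fraction(1)}
bet = {"awesome": Fraction(985, 1000), "stroke": Fraction(15, 1000)}

same_choice = (
    (expected_utility(bet, u) < expected_utility(keep, u))
    == (expected_utility(bet, u_affine) < expected_utility(keep, u_affine))
)
print(same_choice)

# Ratios of utilities are not preserved, so "twice as good" has no fixed meaning.
print(u("awesome") / u("now"), u_affine("awesome") / u_affine("now"))
```

Only comparisons of expected utility survive the change of scale, which is the Fahrenheit and Celsius analogy in numerical form.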
But anyway, you can figure out the relative distances in the same way; call what you have right now 1000, imagine some particular awesome scenario and call that 2000, and then figure out the utility of having another stroke, relative to that. For any plausible scenario (excluding things that could only happen post-Singularity), you should wind up again with an extremely negative (but not ridiculous) number for a stroke.
On the other hand, conscious introspection is a very poor tool for figuring out our relative utilities (to the degree that our decisions can be said to flow from a utility function at all!), because of signaling reasons in particular.
Certainly. Or, really, much of anything else. Is there a better tool available in this case?
Not that I know of. Just a warning not to be too certain of the results you get from this algorithm: your extrapolations to actual decisions may be far from what you’d actually do.
Maybe, but I find it easier to fall for the opposite bias, the one known as “There’s still a chance, right?”
(nods) Sadly, my susceptibility to rounding very small probabilities up when I want them to be true is not inversely correlated with my susceptibility to rounding very small probabilities down when I want to ignore them. Ain’t motivated cognition grand?
I do find that I can subvert both of these failure modes by switching scales, though. That is, if I start thinking in “permil” rather than percent, all of a sudden a 1% chance (that is, a 10 permil chance) stops seeming quite so negligible.
Huh, that’s a pretty neat hack!
So, let’s say you have 1000 utils when you are offered the bet that Perplexed proposed. You have two possible choices:
You don’t take the bet. You continue to possess 1000 utils with probability 1. Expected value: 1000.
You take the bet.
There is a .999 probability that you will lose and be left with 100 utils.
There is a .001 probability that you will win, giving you a total of 1,000,100 utils.
Expected value: (.999 * 100) + (.001 * 1,000,100) = 1100.
You’re saying that you prefer the option with an expected value of 1000 utils over the option with an expected value of 1100 utils. If we were talking about dollars, you could explain this by saying that you are risk averse, i.e. that the more dollars you have, the less you want each individual dollar. Utils are essentially a measurement of how much you want a particular outcome, so an outcome that is worth 1,000,100 utils is something you want 1000.1 times more than you want an outcome worth 1000 utils. If you don’t want to take this bet, that means you don’t actually want the 1,000,100 outcome enough for it to be worth 1,000,100 utils.
For the purposes of expected utility calculations, don’t think of utils as a measure of happiness; think of them as a measure of the strength of preferences.
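The two expected values in the comparison above can be reproduced exactly. This is a minimal sketch with illustrative variable names.

```python
from fractions import Fraction

p_win = Fraction(1, 1000)

# Option 1: decline the bet and keep 1000 utils for certain.
decline = Fraction(1000)

# Option 2: take the bet; lose and hold 100 utils, or win and hold 1,000,100.
accept = (1 - p_win) * 100 + p_win * 1_000_100

print(decline, accept)
print(accept > decline)
```

If an agent really would decline, the consistent conclusion is that the winning outcome was mislabeled, not that the arithmetic fails.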
(shrug) Sure. As long as I don’t try to understand utils as actually existing in the world, as being interpretable as anything, and I just treat them as numbers, then sure, I can do basic math as well as the next person.
And if all we’re saying is that it’s irrational to think that (.999 * 100) + (.001 * 1,000,100) <> 1100, well, OK, I agree with that, and I withdraw my earlier assertion about being utility risk averse. I agree with everyone else; that’s pretty much impossible.
My difficulty arises when I try to unpack those numbers into something real… anything real. But it doesn’t sound like that’s actually part of the exercise.
I do like the rescaling trick.
Why do you think I meant dollars? I said units of utility. Rescaling to 1 unit of utility yields Pascal’s mugging, which I think most people would reject. I still want to perform action B, and if you don’t, then:
It turns out that I am secretly Omega and control the simulation that you live in. Please give me $5 or I will cause you to lose 3^^^3 units of utility. If you are interested in making this deal, please reply and I will give you the proper paypal account to forward your payment to.
You are assuming that there exists a state of the world so bad that facing an extremely tiny chance of being put into that state is worse than losing $5. I’m not sure even Omega could do this because to create such a state Omega might have to change my brain so much that the thing put into that state would no longer be me.
1) If you live in a simulation and I control it, I think it’s hard for you to make any assumptions about how bad a state I can put you in.
2) Your argument fails in the least convenient possible world (i.e., you are trying to get around my objection by raising a point that [might?] be true in our universe but doesn’t have to be true in general).
(2) is a good point. But on (1), before I give you the $5 don’t I at least have to make an assumption or calculation about the probability of such a bad state existing? If I’m allowed to consider numbers of 3^^^3 magnitude for my utility, can’t I also assign a probability of 1/3^^^3 to (your being Omega and such a bad state existing)?
(2) turns out to fail as well, see the modified original post.
No you aren’t.
What about the VNM-utility theorem?
In particular, which axiom should we reject, and why? It sounds like what jsteinhardt is saying is that Axiom 3, Continuity, shouldn’t hold in certain extreme cases.
We have discussed this somewhat before: Academian on VNM-utility theory, and my favorite comment on the post.
The axiom I object to there is independence, not continuity. If A < B then pA+(1-p)C shouldn’t necessarily be less than pB+(1-p)C for small values of p.
I think it would be quite hard to object to the Archimedean formulation of continuity, and completeness and transitivity are obvious. But I see no reason why independence should be true. In particular, independence is basically saying that my preference over outcomes should be linear over the space of probability distributions, which will obviously lead to maximizing expected utility theory (assuming that a utility function exists, which is the part of VNM that I would agree with).
Can someone who subscribes to VNM please justify why independence is a reasonable axiom? Given that VNM implies that we should perform ridiculous actions (like pay Pascal’s mugger), I think this is quite relevant.
Given the number of downvotes here, perhaps I will in a separate post present the version of VNM where we don’t assume independence, after I work out exactly what we end up with.
Apparently there is a book length treatment of this question, titled Rationality and dynamic choice: foundational explorations. The author writes in the introduction:
And in the conclusion:
(Note that I just found this book today and have not yet read it. It does have more than 300 citations in Google Scholar.)
Thanks for the reference! I’m impressed that you still remembered this thread 6 months after the fact.
Actually what happened was, I read a comment of yours mentioning a lab that you worked in, was curious which lab, so started scanning all your comments starting from the earliest to see if you said more about it, noticed this comment, remembered that I had read a paper about whether independence is justified, couldn’t find that paper but found the book instead.
BTW, what lab are you working in? ;)
You would end up with a Nobel Prize in economics.
Oops, apparently independence is also reasonable, as can be seen by flipping a coin with weight p and giving someone a choice between A and B if it comes up heads and giving them C if it comes up tails.
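The coin-flip construction can be written out as a compound lottery. The sketch below only checks the easy direction, namely that an expected-utility maximizer automatically satisfies independence because expected utility is linear in the mixture; it does not argue that independence must hold. The outcomes, utilities, and helper names `mix` and `expected_utility` are hypothetical.

```python
from fractions import Fraction

def mix(p, first, second):
    """Compound lottery: draw from `first` with probability p, else from `second`."""
    outcomes = set(first) | set(second)
    return {o: p * first.get(o, 0) + (1 - p) * second.get(o, 0) for o in outcomes}

def expected_utility(lottery, u):
    """Expected value of u under a lottery given as {outcome: probability}."""
    return sum(prob * u[o] for o, prob in lottery.items())

# Hypothetical outcomes and utilities, for illustration only.
u = {"bad": 0, "ok": 6, "great": 10}
A = {"bad": Fraction(1, 2), "great": Fraction(1, 2)}  # expected utility 5
B = {"ok": Fraction(1)}                               # expected utility 6
C = {"bad": Fraction(1)}                              # expected utility 0

weights = (Fraction(1, 100), Fraction(1, 2), Fraction(99, 100))
holds = all(
    expected_utility(mix(p, A, C), u) < expected_utility(mix(p, B, C), u)
    for p in weights
)
print(holds)
```

Since the expected utility of pA+(1-p)C is p times that of A plus (1-p) times that of C, the gap between the two mixtures is p times the gap between A and B, which is positive for every p in (0,1).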
Transitivity? In The Lifespan Dilemma, Eliezer presents a sequence (L_n) in which we are convinced L_n < L_(n+1) throughout, but for which we’d prefer even L_0 to L_n for some large but finite n.
I’m not convinced L_n < L_(n+1), but I don’t seem to have his fixation with really big numbers.
You should restore the original post (possibly, with a disclaimer on top referring to the new post), and post the new post as a new post. Don’t confuse the history. Comments here are on the original post, not on the new one.
If you begin to suspect that the majority of LW believes something incorrectly, your prior probability distribution should resemble
P(“I’m wrong”) >>
P(“They’re wrong because the problem is incredibly hard”) >
P(“They’re wrong because of a subtle bias or flaw in reasoning”) >
P(“They’re wrong because they’re missing something obvious.”)
You should update on all available evidence, of course, but there is very strong evidence that this community is reasonably competent.
This hasn’t been my experience on technical issues that I have experience with. Certainly the fact that some people on LW have started worrying about extremely low-probability events and invoking expected utility maximization to say they are “required to” in order to be rational means that something is wrong.
In this case it turns out that actually VNM is correct, but misunderstood (utility is sufficiently unintuitive that conflating it with anything that you have intuition about will probably lead to issues down the road if you’re not willing to ignore the math when it [seems to] lead to ridiculous conclusions). I’m about to edit the original post to reflect this. I think it was worth the 9 units of karma to resolve my own confusion, though.
Which technical issues are these? Do they represent a large subset of the issues for which there is a significant consensus on LW?
If VNM is correct, what misunderstanding makes caring about low probability/high utility scenarios irrational? If I have the wrong idea about how to maximize utility, I would really like to know.
See the updated original post. The issue is that utility is (a) bounded and (b) probably doesn’t correspond at all to your intuition of what it should be. In particular, scenarios that you think are high utility actually just have high {monetary value, lives saved, etc.}, which may or may not have anything to do with your utility function (except that if lives saved is a terminal value for you, then your utility function is increasing with lives saved, but could increase arbitrarily slowly).
The technical issues that I have experience with are AI and cognitive science. While I’ve only had a few actual technical discussions with people on LessWrong about AI, I had the general impression that the other person ranged from failing to grasp subtle but important insights to having little more than an abstract notion of how an AGI should work. Of course, the Sequences mean that the majority of people at least know enough to understand the general idea of the Bayesian approach to statistical machine learning, but this doesn’t imply a deep knowledge of exactly why the Bayesian approach is a good idea or what the current issues are on a computational level. I consider this to be a huge gap to people who are interested in FAI—if you don’t even know how the AI is going to work, you don’t have much chance of telling it to be friendly. In particular, your conception of how to influence its decision theory might be completely different from what is actually possible.
[EDIT: I should also note as a former diehard Bayesian that the immense hate towards frequentists is entirely unjustified. First of all, LW has an entirely different definition of the term than everyone else in the world, as far as I can tell. However, I believe that many people here would still consider the Bayesian approach to be obviously superior even after getting the right definition, despite the fact that the frequentist approach to statistical machine learning is quite reasonable. I believe that this was the first instance that led me to believe that LW wasn’t quite as knowledgeable as I had originally supposed.]
I am less familiar with cognitive science, but certain cognitive biases taken for granted on LW are simply empirically nonexistent, for instance actor-observer bias, which I think is incorrectly labelled the affect heuristic on LW, although I could be misremembering.
Woah. If you have a pretty solid handle on how an AGI should work, you’re way too far out of my league for me to contribute meaningfully to this conversation.
But to try and clear up a hole in my understanding: VNM utility functions assign real number utility values, and multiplying them by arbitrary scalars doesn’t change anything meaningful. Since the reals are uncountably infinite, where do the bounds on VNM utilities come from?
Given that I originally failed to understand VNM, I doubt I’m out of anyone’s league. I’m just saying that I have a good enough general background that if a commonly held assumption seems to lead to ridiculous conclusions, and there is a simple way, with sound technical justifications, to avoid those conclusions, but that involves rejecting the assumption, I am willing to reject the assumption rather than assuming that my reasoning is incorrect. This might be a bad idea, but as long as I post my rejection so that everyone can tell me why I’m stupid, it seems to work reasonably.
Also, I certainly don’t have a solid handle on how AGI should work, but I can see the different relevant components and the progress that people seem to be making on a couple of the fronts.
But to answer your question, the bound is on min |u(x)-u(y)| / max |u(a)-u(b)|, where the min is over alternatives x and y that you would consider sufficiently different to be worth distinguishing. This gets around the fact that u is only defined up to affine transformations, since ratios of differences are invariant under affine transformations. The point is basically that if p is very small and x and y are alternatives that are different enough that I would take (1-p)x+pa over (1-p)y+pb for ANY possible a and b (even very bad a and very good b), then by expected utility maximization I must have (1-p)u(x)+pu(a) > (1-p)u(y)+pu(b) for all a and b. Algebraic manipulation of this gives a bound on p/(1-p), which is basically p for small p, and then finding the optimum bound over x, y, a, and b gives the result claimed at the beginning of this paragraph.
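The algebra mentioned above can be sketched as follows. This is only a reconstruction of the steps as described, assuming u(x) > u(y) and writing sup u and inf u for the best and worst utilities any outcome can have (notation not used in the comment itself).

```latex
\begin{align*}
(1-p)\,u(x) + p\,u(a) &> (1-p)\,u(y) + p\,u(b)
  && \text{for all } a, b \\
(1-p)\,\bigl[u(x) - u(y)\bigr] &> p\,\bigl[u(b) - u(a)\bigr]
  && \text{for all } a, b \\
\frac{p}{1-p} &< \frac{u(x) - u(y)}{u(b) - u(a)}
  && \text{whenever } u(b) > u(a) \\
\frac{p}{1-p} &\le \frac{u(x) - u(y)}{\sup u - \inf u}
  && \text{taking the worst case over } a, b
\end{align*}
```

Replacing u by αu + β with α > 0 multiplies numerator and denominator by the same α and cancels β, so the final ratio does not depend on which affine representative of u is used. Minimizing the right-hand side over the pairs x, y that count as worth distinguishing gives the ratio stated at the start of the comment.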
Hold on, aren’t you assuming your conclusion here? Unless the utility function is bounded already, a and b can be arbitrarily large values, and can always be made large enough to alter the inequality, regardless of the values of x, y, and p. That is, there are no possible alternatives x and y for which your statement holds.
My claim is that IF we don’t care about probabilities that are smaller than p (and I would claim that we shouldn’t care about probabilities less than, say, 1 in 3^^^3), then the utility function is bounded. This is because if we don’t care about probabilities smaller than p, then for any two objects x and y that are sufficiently different that we DO care about one over the other, we must have
(1-p)x+pa > (1-p)y+pb,
no matter how bad a is and how good b is. I am being somewhat sloppy in my language here. Probably differences in outcome can vary continuously, so I can always find x and y that are so similar to each other that I do start caring about other equally irrelevant differences in the distribution of outcomes. For instance, I would always choose spending 5 minutes of time and having a 10^-100 chance of being tortured for 10^1000 years over spending 10 minutes of time. But 5 minutes + chance of torture probably isn’t superior to 5+10^-1000 minutes. What I really mean is that, if x and y are sufficiently different that probabilities smaller than p just don’t come into the picture, then I can bound the range of my utility function in terms of |u(x)-u(y)| and p.
So just to be clear, my claim is that [not caring about small probabilities] implies [utility function is bounded]. This I can prove mathematically.
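A small numerical illustration of that implication, as a sketch rather than a proof. The function names and the sample numbers (p = 0.01, a utility gap of 1) are made up for the example; the only thing taken from the discussion is the inequality (1-p)u(x)+pu(a) > (1-p)u(y)+pu(b).

```python
def max_utility_range(p: float, gap: float) -> float:
    """Largest spread u(b) - u(a) that can never flip a preference
    with utility gap u(x) - u(y) = gap > 0 when the tail event has
    probability p. This is (1 - p) / p * gap."""
    return (1 - p) / p * gap


def prefers_x_mixture(p: float, u_x: float, u_y: float,
                      u_a: float, u_b: float) -> bool:
    """True if (1-p)x + p*a has higher expected utility than (1-p)y + p*b."""
    return (1 - p) * u_x + p * u_a > (1 - p) * u_y + p * u_b


p, u_x, u_y = 0.01, 1.0, 0.0            # hypothetical values
limit = max_utility_range(p, u_x - u_y)  # about 99 for these numbers

# A tail spread below the limit cannot overturn the preference for x ...
inside = prefers_x_mixture(p, u_x, u_y, u_a=0.0, u_b=98.0)
# ... but a spread above the limit does, so ignoring probability p
# is only consistent if the whole utility range stays under the limit.
outside = prefers_x_mixture(p, u_x, u_y, u_a=0.0, u_b=100.0)
```

Here `inside` comes out true and `outside` false: once some pair a, b is further apart than the limit, the probability-p tail starts to matter, which is the contrapositive of the claim.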
The other part of my claim [which I can’t prove mathematically] is that it doesn’t make sense to care about small probabilities. Of course, you can care about small probabilities while still having consistent preferences (heck, just pick some quantity that doesn’t have an obvious upper bound and maximize its expected value). But I would have a hard time believing that that is the utility function corresponding to your true set of preferences.
Expected utility is expected (subjective) value plus risk aversion, right? Do you have in mind a “desirable distribution over outcomes” that can’t be expressed that way?
If I have a bet that loses me 1 util on heads, and 2 utils on tails, I won’t take it; it’s a sure loss. Add a million to those values and we get a bet that I would take. So clearly u changes on addition of constants.
Multiplicative constants won’t have an effect, but they don’t change additivity, either.
If you were really adding a constant to the utility function, you would also add a million to the utility of not taking the bet.
Let U be your original utility function and V be U + 1,000,000, differing by an additive constant.
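Then
E[U(bet)] = P(heads)·(−1) + P(tails)·(−2) < 0 = U(no bet)
So, using U, you would refuse the bet.
And
E[V(bet)] = P(heads)·999,999 + P(tails)·999,998 < 1,000,000 = V(no bet)
So, using V, you would also refuse to take the bet.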
As expected, adding a constant to the utility function did not change the decision.
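The same comparison can be checked numerically. The sketch below assumes a fair coin and a status quo worth 0 utils under U (neither is stated in the thread), and the helper names are hypothetical. It also includes a third function W = 3U + 7 to show that a positive rescaling behaves the same way.

```python
def expected_utility(lottery, u):
    """Expected value of u over a lottery given as (outcome, probability) pairs."""
    return sum(prob * u(outcome) for outcome, prob in lottery)


def takes_bet(bet, status_quo, u):
    """An expected-utility maximizer takes the bet only if it beats the status quo."""
    return expected_utility(bet, u) > expected_utility(status_quo, u)


# Utilities under U: lose 1 on heads, lose 2 on tails, 0 for declining.
BASE = {"heads": -1.0, "tails": -2.0, "status quo": 0.0}

bet = [("heads", 0.5), ("tails", 0.5)]   # fair coin: an assumption
no_bet = [("status quo", 1.0)]


def U(outcome):
    return BASE[outcome]


def V(outcome):
    # U shifted by an additive constant, applied to every outcome
    return BASE[outcome] + 1_000_000


def W(outcome):
    # U under a positive affine transformation
    return 3.0 * BASE[outcome] + 7.0


decisions = [takes_bet(bet, no_bet, u) for u in (U, V, W)]
```

All three entries of `decisions` are false: the constant is added to the status quo as well as to the bet's outcomes, so the ordering of the two options is unchanged.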
A sure loss compared to what? Not taking the bet has a utility too.
A bet that loses 1 util on heads and 2 on tails compared to the status quo would have utilities like:
u(heads) = −1
u(tails) = −2
u(status quo) = 0
Add a constant to all these and the status quo will still be higher.
Correct, my mistake.
A rational person, by definition, maximizes expected utility. You’re fighting a definition.
Be careful about arguing by definition.
Expected utility isn’t a real thing, it’s an artificial construct defined as that which a rational person maximizes. What other definition of expected utility could you provide such that a rational person might not maximize expected utility? If we define X to be equal to 3Y then someone claiming to have found an example in which X does not equal 3Y has to be wrong because they are “fighting a definition.”
The author of the top post, I believe, made a mistake because, I think, he didn’t realize that it’s tautologically impossible for a rational person to not maximize expected utility.
In the post you cite Eliezer wrote “But eyeballing suggests that using the phrase by definition, anywhere outside of math, is among the most alarming signals of flawed argument I’ve ever found.” Expected utility is math.
If you look up the Wikipedia entry on expected utility you find “There are four axioms[4] of the expected utility theory that define a rational decision maker.”
If you look up von Neumann-Morgenstern Utility theorem on Wikipedia you find “In 1944, John von Neumann and Oskar Morgenstern exhibited four relatively modest[1] axioms of “rationality” such that any agent satisfying the axioms has a utility function. That is, they proved that an agent is (VNM-)rational if and only if there exists a real-valued function u defined on possible outcomes such that every preference of the agent is characterized by maximizing the expected value of u, which can then be defined as the agent’s VNM-utility”
I arguably try to think rationally (and posting on LessWrong, my thinking feels clearer [1], and it helped remind me to respond rather than react in one particularly trying recent situation), but this is why definitions and wearing them may be best avoided. I don’t wear the label “rationalist”, but I try to use the techniques found here to think better. This is not quite the same thing.
[1] which, using fictitious examples, reminds me of all the stories where the clearly batshit insane protagonist speaks of how clear their thoughts feel now. That is, their internal editor is on the blink. I hope mine isn’t, but only results will tell me that.
n.b.: the parent comment may be wrong, but it’s applying thinking in a manner I thought was worth encouraging. Hence, an upvote.