Is it always correct to choose the action with the highest expected utility?
Suppose I have a choice between action A, which grants −100 utilons with 99.9% chance and +1,000,000 utilons with 0.1% chance, and action B, which grants +1 utilon with 100% chance. A has an expected utility of +900.1 utilons, while B has an expected utility of +1 utilon. This decision will be available to me only once, and all future decisions will involve utility changes on the order of a few utilons.
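For concreteness, here is the arithmetic as a quick Python check (the function and variable names are mine):

    # Expected utility of each action, as stated above.
    def expected_utility(lottery):
        # lottery: list of (probability, utilons) pairs
        return sum(p * u for p, u in lottery)

    action_a = [(0.999, -100), (0.001, 1_000_000)]
    action_b = [(1.0, 1)]

    print(expected_utility(action_a))  # ~ +900.1
    print(expected_utility(action_b))  # +1.0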
Intuitively, it seems like action A is too risky. I’ll almost certainly end up with a huge decrease in utility, just because there’s a remote chance of a windfall. Risk aversion doesn’t apply here, since we’re dealing in utility, right? So either I’m failing to truly appreciate the chance at getting 1M utilons—I’m stuck thinking about it as I would money—or this is a case where there’s reason to not take the action that maximizes expected value. Help?
EDIT: Changed the details of action A to what was intended
I think the non-intuitive nature of the A choice comes from our naturally thinking of utilons as “things”. For any valuable thing (money, moments of pleasure, whatever), anybody who is even minimally risk-averse would choose B. But utilons are not things; they are abstractions defined by one’s preferences. So the claim that A is the rational choice is a tautology in the standard versions of utility theory.
It may help to think of it the other way around, starting from the actual preference. You would choose a 99.9% chance of losing ten cents and a 0.1% chance of winning 10,000 dollars over winning one cent with certainty, right? So then perhaps, as long as we don’t think about other bets and outcomes, we can map winning 1 cent to +1 utilon, losing 10 cents to −100 utilons, and winning 10,000 dollars to +10,000 utilons. Then we can refine and extend the “outcomes ⇔ utilons” map by considering your actual preferences over more and more bets (a toy sketch of this check follows below). As long as your preferences are self-consistent in the sense of the VNM axioms, there will be a mapping that can be constructed.
ETA: of course, it is possible that your preferences are not self-consistent. The Allais paradox is an example where many people’s intuitive preferences are not self-consistent in the VNM sense. But constructing such a case is more complicated than just considering risk-aversion on a single bet.
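To make the “refine and extend” step concrete, here is a minimal sketch (names and structure are mine) of checking whether a candidate outcomes-to-utilons map is consistent with a stated preference between two bets:

    # Candidate map from outcomes to utilons, using the numbers above.
    utilons = {"+1 cent": 1, "-10 cents": -100, "+$10,000": 10_000}

    def eu(lottery):
        # lottery: list of (probability, outcome) pairs
        return sum(p * utilons[o] for p, o in lottery)

    risky = [(0.999, "-10 cents"), (0.001, "+$10,000")]
    safe = [(1.0, "+1 cent")]

    # The map is consistent with preferring the risky bet iff eu(risky) > eu(safe).
    print(eu(risky), eu(safe))  # -89.9 and 1.0: this particular map favors the
                                # safe bet, so it would need refining to match
                                # the stated preference.

(That −89.9 is the same figure flagged further down the thread.)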
Also, it’s quite possible that your utility function doesn’t evaluate to +10000 for any value of its argument, i.e. it’s bounded above.
Since utility functions are only unique up to affine transformation, I don’t know what to make of this comment. Do you have some sort of canonical representation in mind or something?
In the context of this thread, you can consider U(status quo) = 0 and U(status quo, but with one more dollar in my wallet) = 1. (OK, that makes +10000 an unreasonable estimate of the upper bound; pretend I said +1e9 instead.)
Yes, this seems almost certainly true (and I think it’s even necessary if you want to satisfy the VNM axioms; otherwise you violate the continuity axiom).
An unbounded function is one that can take arbitrarily large finite values, not necessarily one that actually evaluates to infinity somewhere.
Yes, I’m quite aware… note that if there’s a sequence of outcomes whose values increase without bound, then you could construct a lottery with infinite expected value by mixing those outcomes together appropriately, e.g. putting probability 2^−k on the outcome with value 2^k. Such a lottery would be problematic from the perspective of continuity (or even of having an evaluable utility function).
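A quick illustration of that construction (just the partial sums; the variable names are mine):

    # Probability 2**-k on an outcome worth 2**k utilons, k = 1, 2, 3, ...
    # Each term contributes (2**-k) * (2**k) = 1 to the expectation, so the
    # partial sums grow without bound while the probabilities sum to 1.
    expectation = 0.0
    total_prob = 0.0
    for k in range(1, 31):
        expectation += (2 ** -k) * (2 ** k)
        total_prob += 2 ** -k
    print(expectation, total_prob)  # 30.0 and ~1.0; the expectation diverges as k grows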
Are lotteries allowed to have infinitely many possible outcomes? (The Wikipedia page about the VNM axioms only says “many”; I might look it up in the original paper when I have time.)
There are versions of the VNM theorem that allow infinitely many possible outcomes, but they either
1) require additional continuity assumptions so strong that they force your utility function to be bounded
or
2) apply only to some subset of the possible lotteries (i.e. there will be some lotteries for which your agent is not obliged to define a utility).
The original statement and proof given by VNM are messy and complicated. They have since been neatened up a lot. If you have access to it, try Föllmer, H., and Schied, A., Stochastic Finance: An Introduction in Discrete Time, de Gruyter, Berlin, 2004.
EDIT: It’s online.
See also Kreps, Notes on the Theory of Choice. Note that one of these two restrictions is required specifically to prevent infinite expected utility. So if a lottery spits out infinite expected utility, you broke something in the VNM axioms.
For anyone who’s interested, a quick and dirty explanation is that the preference relation is primitive, and we’re trying to come up with an index (a utility function) that reproduces the preference relation. In the case of certainty, we want a function U : O → R, where O is the outcome space and R is the real numbers, such that U(o1) > U(o2) if and only if o1 is preferred to o2. In the case of uncertainty, U is defined on the set of probability distributions over O, i.e. U : M(O) → R. With the VNM axioms, we get U(L) = E_L[u(o)], where L is some lottery (i.e. a probability distribution over O). U is strictly prohibited from taking the value of infinity in these definitions. Now you probably could extend them a little bit to allow for such infinities (at the cost of VNM utility perhaps), but you would need every lottery with infinite expected value to be tied for the best lottery according to the preference relation.
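As a minimal sketch of that structure (a toy outcome space with names of my own choosing), with u on outcomes and U on lotteries:

    # u : O -> R assigns (finite) utilities to outcomes.
    u = {"o1": 0.0, "o2": 1.0, "o3": 5.0}

    # U : M(O) -> R is defined on lotteries (probability distributions
    # over O) via the VNM form U(L) = E_L[u(o)].
    def U(lottery):
        return sum(p * u[o] for o, p in lottery.items())

    L1 = {"o1": 0.5, "o3": 0.5}
    L2 = {"o2": 1.0}

    # The index reproduces the preference relation: L1 is preferred to L2
    # if and only if U(L1) > U(L2).
    print(U(L1), U(L2))  # 2.5 and 1.0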
I’m not sure, although I would expect VNM to invoke the Hahn-Banach theorem, and it seems hard to do that if you only allow finite lotteries. If you find out I’d be quite interested. I’m only somewhat confident in my original assertion (say 2:1 odds).
Um, A actually has an expected utility of −89.9.
That explains why it seems less appealing!
I’d flip that around. Whatever action you end up choosing reveals what you think has highest utility, according to the information and utility function you have at the time. It’s almost a definition of what utility is—if you consistently make choices that rank lower according to what you think your utility function is, then your model of your utility function is wrong.
If the utility function you think you have prefers B over A, and you prefer A over B, then there’s some fact that’s missing from the utility function you think you have (probably related to risk).
I’ve recently come to terms with how much fear/anxiety/risk avoidance is in my revealed preferences. I’m working on working with that to do effective long-term planning—the best trick I have so far is weighing “unacceptable status quo continues” as a risk. That, and making explicit comparisons between anticipated and experienced outcomes of actions (consistently over-estimating risks doesn’t help any, and I’ve been doing that).
I sometimes have the same intuition as banx. You’re right that the problem is not in the choice but in the utility function, and it most likely stems from thinking about utility as money.
Let’s examine the previous example and make it into money (dollars): −$100 with 99.9% chance and +$10,000 with 0.1%, versus a 100% chance of +$1.
When doing the math, you have to take future consequences into account as well. For example, if you knew you would later be offered 100 loaded bets, each with an expected net payoff of $0.50 and each costing only $1 to take, then you would have to count this in your original payoff calculation if losing the $100 would prevent you from taking those other bets.
Basically, you have to think through all the long-term consequences when calculating expected payoff, even in dollars.
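A rough sketch of that adjustment (assumptions mine: “expected payoff” means +$0.50 net EV per future bet, and losing the $100 means forfeiting all 100 of them):

    # Naive one-shot expected value of the risky bet, in dollars:
    naive_ev = 0.999 * (-100) + 0.001 * 10_000          # -89.9

    # EV of the 100 future loaded bets you forfeit on the losing branch:
    forfeited_ev = 100 * 0.50                           # 50.0

    # Price the forfeited bets into the losing branch:
    adjusted_ev = 0.999 * (-100 - forfeited_ev) + 0.001 * 10_000
    print(naive_ev, adjusted_ev)                        # -89.9 and -139.85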
Then when you try to convert this to utility, it’s even more complicated. Is the utility per dollar gained in the +$10,000 case equivalent to the utility per dollar lost in the −$100 case? Would you feel guilty and beat yourself up afterwards if you took a bet that you had a 99.9% chance of losing? Even though a purely rational agent probably shouldn’t feel this, it’s still likely a factor in most actual humans’ utility functions.
TrustVectoring summed it up well above: If the utility function you think you have prefers B over A, and you prefer A over B, then there’s some fact that’s missing from the utility function you think you have.
If you still prefer picking the +1 option, then your assessment that the losing branch of the first choice costs only 100 utilons is probably wrong. There are other factors that make it a less attractive choice.
Depending on your preferred framework, this is in some sense backwards: utility is, by definition, the quantity whose expected value it is always correct to maximize (say, in the framework of the von Neumann-Morgenstern theorem).
People who play with money don’t like high variance, and sometimes trade off some of the mean to reduce variance.
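One standard way to formalize that trade-off (my own illustration, using a mean-variance criterion rather than anything from the thread) is to score a gamble by its mean minus a multiple of its variance:

    def mean(lottery):
        return sum(p * x for p, x in lottery)

    def variance(lottery):
        m = mean(lottery)
        return sum(p * (x - m) ** 2 for p, x in lottery)

    # The thread's (edited) gamble and the sure thing, with a small variance penalty.
    a = [(0.999, -100.0), (0.001, 1_000_000.0)]
    b = [(1.0, 1.0)]
    lam = 1e-6  # risk-aversion weight; the choice of lam decides the trade-off

    for g in (a, b):
        print(mean(g) - lam * variance(g))  # a scores ~ -99, b scores 1.0

Even this tiny penalty flips the ranking: a has the higher mean (+900.1) but so much variance that the low-variance sure thing comes out ahead.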