I believe this assumption typically comes from the Von Neumann–Morgenstern utility theorem, which says that, if your preferences are complete, transitive, continuous, and independent, then there is some utility function U such that your preferences are equivalent to “maximize expected U”.
Those four assumptions have technical meanings:
Complete means that for any A and B, you prefer A to B, or prefer B to A, or are indifferent between A and B.
Transitive means that if you prefer A to B and prefer B to C, then you prefer A to C, and also that if you prefer D to E and are indifferent between E and F, then you prefer D to F.
Continuous means that if you prefer A to X and X to Y, then there’s some probability p such that you are indifferent between X and “p to get A, else Y”.
Independent means that if you prefer X to Y, then for any probability q less than 1 and any other outcome B, you prefer “q to get B, else X” to “q to get B, else Y”.
In my opinion, the continuity assumption is the one most likely to be violated (in particular, it excludes preferences where “no chance of any X is worth any amount of Y”), so these assumptions aren’t necessarily a given. But if you do satisfy them, then there is some utility function U such that your preferences amount to maximizing its expected value.
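For reference, here is one common way to write the four axioms and the conclusion formally. This is a sketch in notation that is not in the comment above: $L$, $M$, $N$ stand for lotteries, $\succ$ for strict preference, $\sim$ for indifference, $\succeq$ for "preferred or indifferent", and $pL + (1-p)N$ for the compound lottery "p to get L, else N". Textbooks differ in the exact form of continuity and independence.

```latex
\begin{align*}
\text{Completeness:}\quad & L \succ M \ \text{ or } \ M \succ L \ \text{ or } \ L \sim M \\
\text{Transitivity:}\quad & L \succeq M \ \text{ and } \ M \succeq N \implies L \succeq N \\
\text{Continuity:}\quad & L \succeq M \succeq N \implies \exists\, p \in [0,1] :\ pL + (1-p)N \sim M \\
\text{Independence:}\quad & L \succ M \implies pL + (1-p)N \succ pM + (1-p)N
    \quad \text{for all } N \text{ and } p \in (0,1] \\
\text{Conclusion:}\quad & \exists\, u :\ L \succeq M \iff \mathbb{E}_{L}[u] \ge \mathbb{E}_{M}[u]
\end{align*}
```

Note that every axiom quantifies over lotteries, not just over sure outcomes.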
There is a problem with completeness that requires studying the actual theorem and its construction of utility from preference. The preference function does not range over just the possible “outcomes” (which we suppose are configurations of the world or of some local part of it). It ranges over lotteries among these outcomes, as explicitly stated on the VNM page linked above. This implies that the idea of being indifferent between a sure gain of 1 util and a 0.1% chance of 1000 utils is already baked into the setup of these theorems, even before the proof constructs the utility function. A theorem cannot be used to support its assumptions.
The question, “Why should I maximise expected utility (or if not, what should I do instead)?” is a deep one which I don’t think has ever been fully answered.
The argument from iterated gambles requires that the utility of the gambles can be combined linearly. The OP points out that this is not the case if utility is nonlinear in the currency of the gambles.
Even when the utilities do combine linearly, there may be no long run. An example now well-known is one constructed by Ole Peters, in which the longer you play, the greater the linearly expected value, but the less likely you are to win anything. Ever more enormous payoffs are squeezed into an ever tinier part of probability space.
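To make this concrete, here is a minimal sketch of a gamble of this kind. The comment does not say which construction it means, so the numbers are an assumption: a repeated fair coin toss in which heads multiplies your wealth by 1.5 and tails multiplies it by 0.6, which is the version usually attributed to Peters. The function names are made up for illustration.

```python
from math import comb, log

# ASSUMPTION: multipliers of the commonly cited coin-toss game; not given in the thread.
UP, DOWN = 1.5, 0.6


def expected_wealth(n):
    """Ensemble-average wealth after n fair tosses, starting from wealth 1."""
    return ((UP + DOWN) / 2) ** n


def prob_not_behind(n):
    """Exact probability that wealth after n tosses is at least the initial 1."""
    favourable = 0
    for heads in range(n + 1):
        # Work with log-wealth to avoid overflow and underflow for large n.
        if heads * log(UP) + (n - heads) * log(DOWN) >= 0:
            favourable += comb(n, heads)
    return favourable / 2 ** n


for n in (10, 100, 1000):
    print(n, expected_wealth(n), prob_not_behind(n))
```

Under these assumed multipliers the expected wealth grows by 5% per toss, while the probability of not being behind shrinks as n grows, because the per-toss growth factor of a typical trajectory is sqrt(1.5 * 0.6), which is below 1.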
Even gambles apparently fair in expectation can be problematic if their variances are infinite and so the Central Limit Theorem does not apply.
So iterated gambles do not answer the question.
Logarithmic utility does not help: just imagine games with the payouts correspondingly scaled up.
Bounded utility does not help, because there is no principled way to locate the bound, and because if the bound is set large enough, finite versions of the problems with unbounded utility still show up. See SBF/FTX. Is that debacle offset by the alternate worlds in which the collapse never happened?
Very interesting observations. I wouldn’t say the theorem is used to support its assumptions, because the assumptions don’t speak about utils but only about preferences over possible outcomes and lotteries, but I see your point.
Actually, the assumptions implicitly say that you are not rational if you would refuse to accept a small enough probability of incurring a $1,000,000,000,000 debt rather than lose 1 cent for certain (this follows straightforwardly from the Archimedean property).
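A small worked example of how tiny "small enough" can be. The axiom itself only says that some positive probability exists; the specific number below relies on an assumption that is not in the comment, namely a utility that is linear in money. The function name `risk_threshold` is hypothetical.

```python
from fractions import Fraction


def risk_threshold(u_best, u_middle, u_worst):
    """Probability p of the worst outcome at which '(1 - p) best, else worst'
    is exactly as good as 'middle' for certain, for an expected-utility maximizer."""
    if not (u_best > u_middle > u_worst):
        raise ValueError("expected u_best > u_middle > u_worst")
    return Fraction(u_best - u_middle, u_best - u_worst)


# ASSUMPTION: utility linear in money, measured in cents.
keep_everything = 0
lose_one_cent = -1
trillion_dollar_debt = -10**14  # $1,000,000,000,000 expressed in cents

p_star = risk_threshold(keep_everything, lose_one_cent, trillion_dollar_debt)
print(p_star)  # 1/100000000000000
```

For any probability of the debt below `p_star`, this agent strictly prefers the gamble to the certain loss of one cent. A concave utility would give a different threshold, but the axioms still require one to exist.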
Completeness is also in question for any real-world application, because it implies consistency over time.
No, it does not imply constancy or consistency over time, because the four axioms do not stop us from adding to the utility function a real-valued argument that represents the moment in time that the definition refers to.
In other words, the four axioms do not constrain us to consider only utility functions over world states: utility functions over “histories” are also allowed, where a “history” is a sequence of world states evolving over time (or equivalently a function that takes a number representing an instant in time and returns a world state).
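A minimal sketch of what a utility function over histories could look like. Everything here is hypothetical: world states are plain strings, time is discrete, and the utility of a history is a sum of time-dependent values, which is only one of many possible forms.

```python
from typing import Callable, Sequence

WorldState = str                 # placeholder type, for illustration only
History = Sequence[WorldState]   # world states at times 0, 1, 2, ...


def history_utility(history: History,
                    state_value: Callable[[int, WorldState], float]) -> float:
    """Utility of a whole history: here, a sum of time-dependent state values."""
    return sum(state_value(t, state) for t, state in enumerate(history))


def changing_taste(t: int, state: WorldState) -> float:
    # Hypothetical agent whose taste changes: coffee early on, tea later.
    preferred = "coffee" if t < 2 else "tea"
    return 1.0 if state == preferred else 0.0


print(history_utility(["coffee", "coffee", "tea", "tea"], changing_taste))
print(history_utility(["coffee"] * 4, changing_taste))
```

The agent's ranking of world states differs between early and late times, yet a single fixed function over histories represents it.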
Indeed there are quite a few interesting reasons to be skeptical of the completeness axiom.
It does indeed seem quite reasonable to maximize utility if you can choose an option that makes that possible. My question is why you should maximize expected utility when the choice is made under uncertainty.
If you aren’t maximizing expected utility, you must choose one of the four axioms to abandon.
Or abandon some part of its assumed ontology. When the axioms seem ineluctable yet the conclusion seems absurd, the framework must be called into question.
OK, we have a theorem that says that if we are not maximizing the expected value of some function “u”, then our preferences are apparently “irrational” (they violate some of the axioms). But assume we already know our utility function before applying the theorem: is there an argument that shows how and why preferring B to A (or being indifferent between them) is irrational if E(U(A))>E(U(B))?
In the context of utility theory, a utility function is by definition something whose expected value encodes all your preferences.
You don’t necessarily need to start from the preferences and use the theorem to define the function. You can also start from the utility function and try to produce an intuitive explanation of why you should prefer the option with the best expected value.
What does it mean for something to be a “utility” function? Not just calling it that. A utility function is by definition something that represents your preferences by numerical comparison: that is what the word was coined to mean.
Suppose we are given a utility function defined just on outcomes, not distributions over outcomes, and a set of actions that each produce a single outcome, not a distribution over outcomes. It is clear that the best action is that which selects the highest utility outcome.
Now suppose we extend the utility function to distributions over outcomes by defining its value on a distribution to be its expected value. Suppose also that actions in general produce not a single outcome with certainty but a distribution over outcomes. It is not clear that this extension of the original utility function is still a “utility” function, in the sense of a criterion for choosing the best action. That is something that needs justification. The assumption that this is so is already baked into the axioms of the various utility function theorems. Its justification is the deep problem here.
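As a small illustration of why the extension is a real choice, here is a sketch with two rules that agree on every sure outcome but can disagree on lotteries. The lotteries are the "sure gain of 1 util" versus "0.1% chance of 1000 utils" pair mentioned earlier in the thread; the worst-case rule is only a contrast picked for this example, not something anyone in the thread proposes.

```python
from fractions import Fraction

# A lottery is a list of (probability, utility of the outcome) pairs.
sure_thing = [(Fraction(1), 1)]
long_shot = [(Fraction(1, 1000), 1000), (Fraction(999, 1000), 0)]


def expected_value(lottery):
    """Extension by averaging: probability-weighted mean utility."""
    return sum(p * u for p, u in lottery)


def worst_case(lottery):
    """Extension by pessimism: lowest utility that has positive probability."""
    return min(u for p, u in lottery if p > 0)


for rule in (expected_value, worst_case):
    print(rule.__name__, rule(sure_thing), rule(long_shot))
```

Both rules return 1 on the sure thing, so they extend the same utility on outcomes. The averaging rule is indifferent between the two lotteries, while the worst-case rule strictly prefers the sure thing.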
Joe Carlsmith gives many arguments for Expected Utility Maximisation, but it seems to me that all of them are just hammering on a few intuition pumps, and I do not find them conclusive. On the other hand, I do not have an answer, whether a justification of EUM or an alternative.
Eliezer has likened the various theorems around utility to a multitude of searchlights coherently pointing in the same direction. But the problems of unbounded utility, non-ergodicity, paradoxical games, and so on (Carlsmith mentions them in passing but does not discuss them) look to me like another multitude of warning lights also coherently pointing somewhere labelled “here be monsters”.
There are infinitely many ways to find utility functions that represent preferences on outcomes. For example, if outcomes are monetary, then any increasing function is equivalent on outcomes, but not when you try to extend it to distributions and lotteries with the expected value.
I wonder whether, given a specific function u(...) on every outcome, you can also choose “rational” preferences (as in the theorem) according to some other operator on the distributions that is not the average. For example, what about the L^p norm or the sup of the distribution (if they are continuous)?
Or is the expected value the special unique operator that has the property stated by the VNM theorem?
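One part of this can be checked directly. The sketch below scores a lottery by the sup of u over the outcomes it can produce and tests the independence axiom as stated at the top of the thread. The outcome names and utilities are hypothetical.

```python
from fractions import Fraction


def mix(q, first, second):
    """Compound lottery: with probability q play `first`, else play `second`."""
    combined = {}
    for weight, lottery in ((q, first), (1 - q, second)):
        for outcome, p in lottery.items():
            combined[outcome] = combined.get(outcome, 0) + weight * p
    return combined


def sup_value(lottery, u):
    """Score a lottery by the best utility it yields with positive probability."""
    return max(u[o] for o, p in lottery.items() if p > 0)


def expected_value(lottery, u):
    return sum(p * u[o] for o, p in lottery.items())


u = {"B": 10, "X": 2, "Y": 1}            # hypothetical utilities
X, Y, B = {"X": 1}, {"Y": 1}, {"B": 1}   # sure outcomes as degenerate lotteries
q = Fraction(1, 2)

for score in (sup_value, expected_value):
    print(score.__name__,
          score(X, u) > score(Y, u),
          score(mix(q, B, X), u) > score(mix(q, B, Y), u))
```

With these numbers the sup rule prefers X to Y but is indifferent between “q to get B, else X” and “q to get B, else Y”, since both can yield B. So preferences defined by the sup violate independence here, while the expected value keeps the strict preference.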
It seems to me that it is actually easy to define a function $u'(\cdot) \ge 0$ such that the preferences are represented by $E(u'^2)$ and not by $E(u')$: just take $u' = \sqrt{u}$ (assuming $u \ge 0$). You can do the same for any value of the exponent, so the expectation does not play a special role in the theorem; you can replace it with any $L^p$ norm.
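Here is a quick numerical check of this construction for p = 2. It is a sketch under stated assumptions: the three lotteries and the utility u(x) = x are made up for illustration, and u is non-negative so that the square root is defined.

```python
import math

# Hypothetical lotteries: lists of (probability, outcome) pairs.
lotteries = {
    "safe": [(1.0, 4.0)],
    "risky": [(0.5, 0.0), (0.5, 9.0)],
    "longer": [(0.9, 1.0), (0.1, 25.0)],
}


def u(x):
    """A non-negative utility on outcomes (assumed for illustration)."""
    return x


def u_prime(x):
    """The transformed function u' = sqrt(u)."""
    return math.sqrt(u(x))


def expected(f, lottery):
    return sum(p * f(x) for p, x in lottery)


def l2_norm(f, lottery):
    """L^2 norm of f under the lottery: sqrt(E[f^2])."""
    return math.sqrt(expected(lambda x: f(x) ** 2, lottery))


def ranking(score):
    """Lottery names sorted from worst to best under `score`."""
    return sorted(lotteries, key=lambda name: score(lotteries[name]))


print(ranking(lambda lot: expected(u, lot)))
print(ranking(lambda lot: l2_norm(u_prime, lot)))
print(ranking(lambda lot: expected(u_prime, lot)))
```

The first two rankings coincide, because the $L^2$ norm of $u'$ equals $\sqrt{E(u)}$, an increasing function of $E(u)$. The third ranking differs, so $E(u')$ does not represent the same preferences.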
Apparently the axioms can be considered to talk about preferences, not necessarily about probabilistic expectations. Am I wrong in seeing them in this way?