Why does expected utility matter?
In the context of decision making under uncertainty we consider the strategies of maximizing expected monetary revenue and maximizing expected utility; we give an argument showing that under certain hypotheses it is rational to maximize expected monetary revenue, then we show that the argument does not apply to expected utility. We are left with the question of how to justify the rationality of the strategy of maximizing expected utility.
Expected monetary revenue
Suppose you have to choose between two games A and B, with expected returns of $1 and $2 respectively, each with its own probability distribution of payoffs.
If you play many times, say N, the law of large numbers and the central limit theorem become relevant: the probability distributions of the total winnings from N repetitions of A and of B will have their masses more and more sharply concentrated around $N and $2N respectively.
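Here is a minimal simulation sketch of this separation (Python; the two-outcome games below are made up, chosen only so that their means are $1 and $2):

```python
import random

def play_A():
    # Hypothetical game A: win $2 with probability 0.5, else nothing (mean $1)
    return 2 if random.random() < 0.5 else 0

def play_B():
    # Hypothetical game B: win $4 with probability 0.5, else nothing (mean $2)
    return 4 if random.random() < 0.5 else 0

def total(game, n):
    return sum(game() for _ in range(n))

for n in (10, 100, 10_000):
    totals_a = [total(play_A, n) for _ in range(1000)]
    totals_b = [total(play_B, n) for _ in range(1000)]
    # For small n the two samples of totals overlap; for large n they separate.
    print(n, max(totals_a), min(totals_b))
```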
At this point, it is clear that it is better for you to play B many times than to play A many times. You can predict this in advance by calculating the expected winnings of A and B. So assuming you can make “lots” of choices between A and B, you must prefer the one with the higher expected profit. But what if you can only make one choice? If the distributions of A and B overlap, does the expected profit still matter?
Even if you only have to make this particular choice once, it could be one of a long sequence of many different choices, not always between A and B, but between other sets of options. Even if the choices are different, we can still manage to take advantage of the LLN and the central limit theorem. Suppose we have a large number of choices between two games:
time 0: choose between games A_0 and B_0
time 1: choose between A_1 and B_1
time 2: …
…
time N: choose between A_N and B_N
We can hope that, again, if you always choose the game with the higher expected return, the statistical randomness will become increasingly irrelevant for large N, and you will accumulate a larger amount of money than a person following a different strategy.
For this hope to actually be realized, the payoffs of the games A_k and B_k must be uniformly bounded and “small enough” compared to N. This fails if, for example:
a single game has infinite expectation, like the St. Petersburg game (see the sketch after this list);
a single game has such a large expected revenue that it makes all the previous outcomes almost irrelevant;
the sequence of games has ever-growing expected revenues.
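To see the first failure mode concretely: in the St. Petersburg game you win $2^k, where k is the number of coin flips up to and including the first heads, so the expected payoff is 1+1+1+… = ∞. A minimal sketch showing that the sample means never settle down:

```python
import random

def st_petersburg():
    # Win 2^k dollars, where k is the number of flips up to the first heads;
    # each doubling halves the probability, so the expectation is infinite.
    payoff = 2
    while random.random() < 0.5:
        payoff *= 2
    return payoff

# Sample means keep jumping as rare huge payoffs land; they do not converge.
for n in (100, 10_000, 1_000_000):
    print(n, sum(st_petersburg() for _ in range(n)) / n)
```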
By contrast, the hope is indeed realized if all the games have revenues lying inside a bounded range of values while N is “big enough” compared to this range.
The ingredients that make it work are:
the additivity of the money gained from the games (it is important that we can store the money and the running total never resets);
the central limit theorem for independent but non-identically distributed variables.
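A minimal sketch of this claim under stated assumptions (Python; the sequence of bounded two-outcome games below is invented): an agent who always picks the game with the higher expected revenue ends up, with high probability, ahead of an agent who picks at random.

```python
import random

def make_game(rng):
    # An invented bounded game: win `prize` dollars with probability p, else nothing.
    p = rng.uniform(0.1, 0.9)
    prize = rng.uniform(0, 10)
    return p, prize

def play(game):
    p, prize = game
    return prize if random.random() < p else 0

def run(n_rounds, seed=0):
    rng = random.Random(seed)  # fixed seed: both agents face the same games
    greedy_total = random_total = 0
    for _ in range(n_rounds):
        a, b = make_game(rng), make_game(rng)
        # Greedy agent: pick the game with the higher expected revenue p * prize.
        greedy_total += play(max(a, b, key=lambda g: g[0] * g[1]))
        random_total += play(random.choice([a, b]))
    return greedy_total, random_total

print(run(10_000))  # the greedy total should come out clearly ahead
```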
To summarize: the strategy of always choosing the option that maximizes expected monetary revenue is indeed rational if you can store or spend all the money, and the choice is part of a long sequence of choices whose individual means and standard deviations are small compared to the total amount of money expected to be accumulated.
Expected utility
The effect of adding new money to our wealth can differ depending on the wealth we already have (an extra $10,000 can be much more meaningful to someone who has no money than to a millionaire). We model this fact by defining a “utility” function which represents the impact of the extra money on the life of a specific person. We expect this utility function U(x) to be increasing in the amount of money x (the more the better) and to have an always-decreasing slope (the richer we are, the less we care about one more dollar); a typical example is U(x) = √x.
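A quick numeric check of these two properties, using U(x) = √x as a stand-in for the graph:

```python
import math

def U(x):
    # A concave utility function: increasing, but with decreasing slope.
    return math.sqrt(x)

# The same extra $10,000 matters far more to someone with nothing
# than to someone who already has $1,000,000.
print(U(10_000) - U(0))             # 100.0
print(U(1_010_000) - U(1_000_000))  # ~5.0
```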
A rational agent is then expected to maximize not his monetary gain but his “utility” gain, if he can. But what happens when he has to make a choice under uncertainty with this new target?
Suppose we have to choose between playing game A or game B, each with some probability distribution for its revenue. By analogy with what we said in the previous section about monetary revenue, we might be tempted to say that instead of choosing the game with the maximum expected monetary revenue, we should choose the one with the maximum expected utility.
Example: if game A pays $1 with probability 0.5 and otherwise nothing, and game B pays $2 with probability 0.3 and otherwise nothing, we are no longer interested in the expected revenues, that is:
E(A)=0.5×1=0.5, E(B)=0.3×2=0.6
we do not choose B just because 0.6 > 0.5; instead we compute the expected utilities
E(U(A)) = 0.5×U(1), E(U(B)) = 0.3×U(2)
So if we take for example U(x) = √x (a function with the concave shape described above), then E(U(A)) = 0.5 and E(U(B)) ≈ 0.42, and therefore we choose A.
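A minimal check of these numbers (Python, assuming U(x) = √x; the expected_utility helper is ours, not from the original):

```python
import math

def expected_utility(outcomes, U=math.sqrt):
    # outcomes: list of (probability, payoff) pairs summing to probability 1
    return sum(p * U(x) for p, x in outcomes)

A = [(0.5, 1), (0.5, 0)]  # $1 with probability 0.5, else nothing
B = [(0.3, 2), (0.7, 0)]  # $2 with probability 0.3, else nothing

print(expected_utility(A))  # 0.5
print(expected_utility(B))  # 0.3 * sqrt(2) ≈ 0.424
```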
But what if we have to make the same choice more than once? Now the difference between monetary revenue and utility becomes more extreme:
When we compute the expected monetary revenue of two games A_1 and A_2 played in sequence, we can simply add the expected revenues of the two: E(A_1+A_2) = E(A_1) + E(A_2). This works because money just adds up.
When we compute the expected utility of playing two games A_1 and A_2 in sequence, we are not allowed to add the expected utilities of the two: playing game A (above) once gives E(U(A)) = 0.5, but playing the same game A twice gives 3 possible outcomes, 2 wins, 1 win, 0 wins, with probabilities 0.25, 0.5, 0.25 and monetary revenues $2, $1, $0. So E(U(A+A)) = 0.25×√2 + 0.5×√1 ≈ 0.85, which is less than E(U(A)) + E(U(A)) = 0.5 + 0.5 = 1.
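The same check in code, reusing the expected_utility helper above and a small convolution of the two payoff distributions:

```python
def convolve(g1, g2):
    # Distribution of the total payoff from two independent games
    return [(p1 * p2, x1 + x2) for p1, x1 in g1 for p2, x2 in g2]

AA = convolve(A, A)  # play game A twice
print(expected_utility(AA))                       # ≈ 0.854, not 1.0
print(expected_utility(A) + expected_utility(A))  # 1.0
```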
The choice that maximizes expected utility in 1 single game can be different from the strategy that maximizes utility in 2 games or in 100 games: it can happen that if you can play game A or game B only once, then game A has the higher expected utility, BUT if you can play the same games twice, then you get more expected utility by playing B both times. Check this simple example.
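The linked example is not reproduced here, but here is one instance we can verify directly (our own numbers, again with U(x) = √x and the helpers above): let A pay $1.44 for sure and B pay $100 with probability 0.1.

```python
A2 = [(1.0, 1.44)]           # a sure $1.44
B2 = [(0.1, 100), (0.9, 0)]  # $100 with probability 0.1, else nothing

# One play: A2 wins (1.2 > 1.0) ...
print(expected_utility(A2), expected_utility(B2))
# ... but over two plays, B2 twice beats A2 twice (≈1.94 > ≈1.70).
print(expected_utility(convolve(A2, A2)), expected_utility(convolve(B2, B2)))
```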
So basically there is no point in computing the expected utility of every single choice in isolation if you have to make a sequence of choices: you actually need to compute the expected utility of every possible sequence of choices, and then pick the sequence that makes the expected utility as large as possible.
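A minimal brute-force sketch of what that entails, reusing the helpers above: enumerate every fixed sequence of choices and score each one by the expected utility of the total payoff. (Adaptive strategies, which may condition later choices on earlier outcomes, are not covered by this sketch.)

```python
from itertools import product

def best_sequence(rounds):
    # rounds: one list of candidate games per round
    best = None
    for seq in product(*rounds):
        dist = [(1.0, 0.0)]
        for game in seq:
            dist = convolve(dist, game)  # distribution of the running total
        eu = expected_utility(dist)
        if best is None or eu > best[0]:
            best = (eu, seq)
    return best

# Two rounds of the A2/B2 choice above: the optimal fixed sequence turns out
# to mix the two games (the sure $1.44 first, then the gamble), EU ≈ 2.09.
print(best_sequence([[A2, B2], [A2, B2]]))
```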
But how did we come to the conclusion that it was meaningful to compare expected utilities in order to make a rational decision? Because we derived a similar conclusion about expected monetary revenue, and utility looked like a refinement of that line of thought. But expected monetary revenue behaved very differently! It was additive, and this allowed us to use the central limit theorem, which was crucial in justifying the role of the expectation in decision making under certain specific circumstances. The situation with expected utility is completely different: we cannot reproduce the argument above to justify the role of expected utility.
So here is the question we are left with: why should a rational agent maximize expected utility?