Can you translate your complaint into a problem with the independence axiom in particular?
Your second example is not a problem of variance in final utility, but aggregation of utility. Utility theory doesn’t force “Giving 1 util to N people” to be equivalent to “Giving N util to 1 person”. That is, it doesn’t force your utility U to be equal to U1 + U2 + … + UN where Ui is the “utility for person i”.
To be concrete, suppose you want to maximise the average utility people have, but you also care about fairness so, all things equal, you prefer the utility to be clustered about its average. Then maybe your real utility function is not
U = (U[1] + … + U[n])/n

but

U’ = U - ((U[1]-U)^2 + … + (U[n]-U)^2)/n

which is in some sense a mean minus a variance.
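As a sketch of the idea above (the per-person utility numbers are illustrative, not from the comment), the fairness-adjusted utility U’ can be computed like this:

```python
# Mean utility vs fairness-adjusted utility (mean minus variance).
# The per-person utility lists below are illustrative examples.

def mean_utility(us):
    return sum(us) / len(us)

def fairness_adjusted_utility(us):
    m = mean_utility(us)
    # Population variance of the per-person utilities around their mean.
    variance = sum((u - m) ** 2 for u in us) / len(us)
    return m - variance  # penalise spread around the average

equal = [3, 3, 3, 3]      # everyone gets the same
unequal = [0, 0, 0, 12]   # same total utility, concentrated on one person

print(mean_utility(equal), mean_utility(unequal))  # same mean: 3.0 3.0
print(fairness_adjusted_utility(equal))            # 3.0 (no penalty)
print(fairness_adjusted_utility(unequal))          # -24.0 (heavy inequality penalty)
```

Both allocations have the same average, but the unequal one is penalised by its variance, which is the "all things equal, prefer clustered utility" preference in action.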
Precisely the model I often have in mind (except I use the standard deviation, not the variance, as it is in the same units as the mean).
But let us now see the problem with the independence axiom. Replace expected utility with phi = “expected utility minus half the standard deviation”.
Then if A and B are two independent probability distributions, phi(A+B) >= phi(A) + phi(B), because the square root is a concave function and hence subadditive: sd(A+B) = sqrt(Var(A) + Var(B)) <= sd(A) + sd(B). Equality happens only if the variance of A or B is zero.
Now imagine that B and C are identical distributions with non-zero variances, and that A has no variance with phi(A) = phi(B) = phi(C). Then phi(A+B) = phi(A) + phi(B) = phi(B) + phi(C) < phi(B+C): you are indifferent between A and C on their own, yet strictly prefer receiving B alongside C to receiving B alongside A, violating independence.
(if we use variance rather than standard deviation, we get phi(2B) < 2phi(B), giving similar results)
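A minimal numeric check of the violation described above, using phi = expected value minus half the standard deviation. The particular lotteries (a fair coin flip for B and C, and a sure payoff for A) are illustrative choices, not from the original comment:

```python
import math

# phi(X) = E[X] - sd(X)/2, for a finite lottery given as a list of
# (probability, payoff) pairs. Lotteries here are illustrative.

def phi(lottery):
    mean = sum(p * x for p, x in lottery)
    var = sum(p * (x - mean) ** 2 for p, x in lottery)
    return mean - math.sqrt(var) / 2

def add_independent(l1, l2):
    # Distribution of the sum of two independent lotteries.
    return [(p1 * p2, x1 + x2) for p1, x1 in l1 for p2, x2 in l2]

B = [(0.5, 0.0), (0.5, 2.0)]   # fair coin: mean 1, sd 1, phi(B) = 0.5
C = B                          # identical, independent copy of B
A = [(1.0, phi(B))]            # sure payoff chosen so phi(A) = phi(B)

print(phi(A), phi(B))              # equal: 0.5 0.5
print(phi(add_independent(A, B)))  # phi(A) + phi(B) = 1.0
print(phi(add_independent(B, C)))  # 2 - sqrt(2)/2, roughly 1.29: strictly larger
```

Even though A and B have the same phi, swapping A for C in the combined lottery strictly raises phi, which is exactly the independence violation claimed.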
A and B are supposed to be distributions on possible outcomes, right? What is A+B supposed to mean here? A distribution with equal mixture of A and B (i.e. 50% chance of A happening and 50% chance of B happening), or A happening followed by B happening? It doesn’t seem to make sense either way.
If it’s supposed to be a 50/50 mixture of A and B, then phi(A+B) could be less than phi(A) + phi(B). If it’s A happening followed by B happening, then Independence/expected utility maximization doesn’t apply because it’s about aggregating utility between possible worlds, not utility of events within a possible world.
To be technical, A and B are random variables, though you can usefully think of them as generalised lotteries. A+B represents you being entered in both lotteries.
Hmm, if this is causing confusion, it is no surprise that my overall post is obscure. I’ll try taking it apart to rewrite it more clearly.
That has nothing to do with the independence axiom, which is about Wei Dai’s first suggestion of a 50% chance of A and a 50% chance of B (and about unequal mixtures). I think your entire post is based on this confusion.
I did wonder what Stuart meant when he started talking about adding probability distributions together. In the usual treatment, a single probability distribution represents all possible worlds, yes?
Yes, the axioms are about preferences over probability distributions over all possible worlds and are enough to produce a utility function whose expectation produces those preferences.
That’s how it looks to me as well.
No, it isn’t. I’ll write another post that makes my position clearer, as it seems I’ve spectacularly failed with this one :-)