I’m using the usual machinery of probability theory, and particularly countable additivity. It may be reasonable to give up on that, and so I think the biggest assumption I made at the beginning was that we were defining a probability distribution over arbitrary lotteries and working with the space of probability distributions.
A way to look at it is: the things I’m taking sums over are the probabilities of possible outcomes. I’m never talking anywhere about utilities or cash payouts or anything else. The fact that I labeled some symbols X_8 does not mean that the real number 8 is involved anywhere.
But these sums over the probabilities of worlds are extremely convergent. I’m not doing any “rearrangement,” I’m just calculating ∑_{k=n+1}^∞ 1/2^k = 1/2^n.
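A quick numeric check of that geometric tail (a sketch; `tail_sum` is a hypothetical helper, and the infinite sum is truncated at K terms, leaving an error of 1/2^K):

```python
# Check the geometric tail identity: sum_{k=n+1}^infinity 1/2^k == 1/2^n.
# The infinite sum is truncated at a large K; the truncation error is 1/2^K.
def tail_sum(n, K=60):
    return sum(1.0 / 2**k for k in range(n + 1, K + 1))

for n in range(5):
    assert abs(tail_sum(n) - 1.0 / 2**n) < 1e-12
```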
So there are some missing axioms here, describing what happens when you construct lotteries out of other lotteries. Specifically, the rearranging step Slider asks about is not justified by the explicitly given axioms alone: it needs something along the lines of “if for each i we have a lottery ∑_j p_ij X_j, then the values of the lotteries ∑_i q_i (∑_j p_ij X_j) and ∑_j (∑_i q_i p_ij) X_j are equal”.
(Your derivation only actually uses this in the special case where for each i only finitely many of the p_ij are nonzero.)
You might want to say either that these two “different” lotteries have equal value, or else that they are in fact the same lottery.
In either case, it seems to me that someone might dispute the axiom in question (intuitively obvious though it seems, just like the others). You’ve chosen a notation for lotteries that makes an analogy with infinite series; if we take this seriously, we notice that this sort of rearrangement absolutely can change whether the series converges and to what value if so. How sure are you that rearranging lotteries is safer than rearranging sums of real numbers?
(The sums of the probabilities are extremely convergent, yes. But the probabilities are (formally) multiplying outcomes whose values we are supposing are correspondingly divergent. Again, I am not sure I want to assume that this sort of manipulation is safe.)
I’m handling lotteries as probability distributions over an outcome space Ω, not as formal sums of outcomes.
To make things simple you can assume Ω is countable. Then a lottery A assigns a real number A(ω) to each ω ∈ Ω, representing its probability under the lottery A, such that ∑_{ω∈Ω} A(ω) = 1. The sum ∑ p_i A_i is defined by (∑ p_i A_i)(ω) = ∑ p_i A_i(ω). And all these infinite sums of real numbers are in turn defined as the suprema of the finite sums, which are easily seen to exist and to still sum to 1. (All of this is conventional notation.) Then ∑_i q_i (∑_j p_ij A_j) and ∑_j (∑_i q_i p_ij) A_j are exactly equal.
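A sketch of this setup (hypothetical code; a small finite Ω stands in for a countable space, and the particular lotteries and weights are made up for illustration). Lotteries are pmfs, mixtures are defined pointwise, and the two double sums then agree outcome by outcome:

```python
from collections import defaultdict

# Lotteries as probability mass functions over a countable outcome space,
# represented here as dicts mapping outcome -> probability.
def mix(weights, lotteries):
    """Pointwise mixture: (sum_i w_i A_i)(omega) = sum_i w_i * A_i(omega)."""
    out = defaultdict(float)
    for w, A in zip(weights, lotteries):
        for omega, p in A.items():
            out[omega] += w * p
    return dict(out)

# Hypothetical inner lotteries A_j over outcomes "a", "b", "c".
A = [{"a": 1.0}, {"b": 1.0}, {"c": 1.0}]
p = [[0.5, 0.5, 0.0],   # p_1j
     [0.0, 0.5, 0.5]]   # p_2j
q = [0.5, 0.5]

# sum_i q_i (sum_j p_ij A_j)
lhs = mix(q, [mix(p_i, A) for p_i in p])
# sum_j (sum_i q_i p_ij) A_j
rhs = mix([sum(q[i] * p[i][j] for i in range(2)) for j in range(3)], A)
assert lhs == rhs  # exactly equal, pointwise
```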
OK! But I still feel like there’s something being swept under the carpet here. And I think I’ve managed to put my finger on what’s bothering me.
There are various things we could require our agents to have preferences over, but I am not sure that probability distributions over outcomes is the best choice. (Even though I do agree that the things we want our agents to have preferences over have essentially the same probabilistic structure.)
A weaker assumption we might make about agents’ preferences is that they are over possibly-uncertain situations, expressed in terms of the agent’s epistemic state.
And I don’t think “nested” possibly-uncertain-situations even exist. There is no such thing as assigning 50% probability to each of (1) assigning 50% probability to each of A and B, and (2) assigning 50% probability to each of A and C. There is such a thing as assigning 50% probability now to assigning those different probabilities in five minutes, and by the law of iterated expectations your final probabilities for A,B,C must then obey the distributive law, but the situations are still not literally the same, and I think that in divergent-utility situations we can’t assume that your preferences depend only on the final outcome distribution.
Another way to say this is that, given that the Ai and Bi are lotteries rather than actual outcomes and that combinations like ∑piAi mean something more complicated than they may initially look like they mean, the dominance axioms are less obvious than the notation makes them look, and even though there are no divergences in the sums-over-probabilities that arise when you do the calculations there are divergences in implied something-like-sums-over-weighted utilities, and in my formulation you really are having to rearrange outcomes as well as probabilities when you do the calculations.
I agree that in the real world you’d have something like “I’m uncertain about whether X or Y will happen, call it 50/50. If X happens, I’m 50/50 about whether A or B will happen. If Y happens, I’m 50/50 about whether B or C will happen.” And it’s not obvious that this should be the same as being 50/50 between B or X, and conditioned on X being 50/50 between A or C.
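For concreteness (a small sketch, not from the thread): both descriptions collapse to the same final outcome distribution, which is exactly why the question of whether they should be treated the same arises at all:

```python
# Route 1: 50/50 between X and Y; X -> 50/50 A/B, Y -> 50/50 B/C.
route1 = {"A": 0.5 * 0.5, "B": 0.5 * 0.5 + 0.5 * 0.5, "C": 0.5 * 0.5}
# Route 2: 50/50 between B outright and an event that is itself 50/50 A/C.
route2 = {"A": 0.5 * 0.5, "B": 0.5, "C": 0.5 * 0.5}
assert route1 == route2  # both give {A: 0.25, B: 0.5, C: 0.25}
```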
Having those two situations be different is kind of what I mean by giving up on probabilities—your preferences are no longer a function of the probability that outcomes occur, they are a more complicated function of your epistemic state, and so it’s not correct to summarize your epistemic state as a probability distribution over outcomes.
I don’t think this is totally crazy, but I think it’s worth recognizing it as a fairly drastic move.
Would a decision theory like this count as “giving up on probabilities” in the sense in which you mean it here?
To anyone who is still not convinced—that last move, ∑_i ∑_j q_i p_ij A_j = ∑_j ∑_i q_i p_ij A_j, is justified by Tonelli’s theorem, merely because q_i p_ij A_j(ω) ≥ 0 (for all i, j, ω).
The way I look at this is that objects like (1/2)X_0 + (1/2)X_1 live in a function space like X → ℝ_{≥0}, specifically the subspace of that where the functions f are integrable with respect to counting measure on X and ∑_{x∈X} f(x) = 1. In other words, objects like f_1 := (1/2)X_0 + (1/2)X_1 are probability mass functions (pmfs). f_1(X_0) is 1/2, f_1(X_1) is 1/2, and f_1 of anything else is 0. When we write what looks like an infinite series λ_1 f_1 + λ_2 f_2 + ⋯, what this really means is that we’re defining a new f by pointwise infinite summation: f(x) := ∑_{i=1}^∞ λ_i f_i(x). So only each collection of terms that contains a given X_k needs to form a convergent series in order for this new f to be well-defined. And for it to equal another f′, the convergent sums only need to be equal pointwise (for each X_k, f(X_k) = f′(X_k)). In Paul’s proof above, the only X_k for which the collection of terms containing it is even infinite is X_0. That’s the reason he’s “just calculating” that one sum.
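A sketch of this pointwise reading (hedged: the particular pmfs `fs` and the truncation `N` are illustrative, not the exact decomposition in Paul’s proof). Each outcome’s coefficient is well-defined as soon as the series of terms mentioning that outcome converges, and here only X_0 collects infinitely many terms:

```python
# Pointwise mixture of pmfs f_i with weights lam_i, truncated at N terms.
# f(x) = sum_i lam_i * f_i(x): each outcome x only needs the series of
# coefficients that actually mention x to converge.
N = 50
lam = [1.0 / 2**i for i in range(1, N + 1)]          # weights 1/2, 1/4, ...
# Hypothetical f_i: each puts mass 1/2 on X_0 and 1/2 on X_i.
fs = [{"X0": 0.5, f"X{i}": 0.5} for i in range(1, N + 1)]

f = {}
for lam_i, f_i in zip(lam, fs):
    for x, p in f_i.items():
        f[x] = f.get(x, 0.0) + lam_i * p

# Only X_0 collects infinitely many terms: f(X0) -> sum_i (1/2^i)(1/2) = 1/2.
assert abs(f["X0"] - 0.5) < 1e-9
# Every other outcome gets exactly one term: f(X_k) = (1/2^k)(1/2).
assert abs(f["X3"] - 1.0 / 2**4) < 1e-12
```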
The outcomes have the property that each one is step-wise more than double the worth of the one before.
In X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + … the probability part only halves on each term. So as the series goes on, each term gets bigger and bigger instead of smaller and smaller, as you would expect in a convergent-like scenario. So it seems to me that even in isolation this is a divergent-like series.
Here’s a concrete example. Start with a sum that converges to 0 (in fact every partial sum is 0):
0 + 0 + …
Regroup the terms a bit:
= (1 + −1) + (1 + −1) + …
= 1 + (−1 + 1) + (−1 + 1) + …
= 1 + 0 + 0 + …
and you get a sum that converges to 1 (in fact every partial sum is 1). I realize that the things you’re summing are probability distributions over outcomes and not real numbers, but do you have reason to believe that they’re better behaved than real numbers in infinite sums? I’m not immediately seeing how countable additivity helps. Sorry if that should be obvious.
Your argument doesn’t go through if you restrict yourself to infinite weighted averages with nonnegative weights.
Aha. So if a sum of non-negative numbers converges, then any rearrangement of that sum will converge to the same number, but not so for sums of possibly-negative numbers?
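That’s the dichotomy. A hedged numeric sketch (with the infinite series truncated for checking): for nonnegative terms the sum is a supremum of finite sub-sums, so order can’t matter, while a signed series can be steered to different values by regrouping:

```python
import random

# Nonnegative terms: any rearrangement of a convergent sum converges
# to the same value (checked here on a long truncation).
terms = [1.0 / 2**k for k in range(1, 40)]
shuffled = terms[:]
random.shuffle(shuffled)
assert abs(sum(terms) - sum(shuffled)) < 1e-9

# Signed terms: grouping 1 - 1 + 1 - 1 + ... one way gives partial sums
# stuck at 0; regrouping the same terms gives partial sums stuck at 1.
grouped_as_zero = [(1 - 1) for _ in range(20)]        # (1-1)+(1-1)+... = 0
grouped_as_one = [1] + [(-1 + 1) for _ in range(20)]  # 1+(-1+1)+... = 1
assert sum(grouped_as_zero) == 0
assert sum(grouped_as_one) == 1
```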
Ok, another angle. If you take Christiano’s lottery:
X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + …
and map outcomes to their utilities, setting the utility of X_0 to 1, of X_1 to 2, etc., you get:
1/2 + 1/2 + 1/2 + 1/2 + …
Looking at how the utility gets rearranged after the “we can write X_∞ as a mixture” step, the first “1/2” term is getting “smeared” across the rest of the terms, giving:
3/4 + 5/8 + 9/16 + 17/32 + …
which is a sequence of utilities that are term-wise higher. This is an essential part of the violation of Antisymmetry/Unbounded/Dominance. My intuition says that a strange thing happened when you rearranged the terms of the lottery, and maybe you shouldn’t do that.
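The arithmetic behind the “smearing” (a sketch of the bookkeeping, not Paul’s notation): each new term is the old 1/2 plus a slice of the redistributed first 1/2, i.e. 3/4 = 1/2 + 1/4, 5/8 = 1/2 + 1/8, and the slices sum back to exactly the 1/2 that was smeared away:

```python
from fractions import Fraction

# Original utility contributions: 1/2 + 1/2 + 1/2 + ...
# After smearing the first 1/2 across the tail: 3/4, 5/8, 9/16, 17/32, ...
# i.e. the k-th smeared term is (2^k + 1) / 2^(k+1).
smeared = [Fraction(2**k + 1, 2**(k + 1)) for k in range(1, 30)]

# Each smeared term exceeds 1/2 by a slice of the redistributed first term.
slices = [t - Fraction(1, 2) for t in smeared]
assert slices[:4] == [Fraction(1, 4), Fraction(1, 8),
                      Fraction(1, 16), Fraction(1, 32)]

# The slices sum back toward the smeared-away 1/2 (up to the truncation tail).
assert sum(slices) == Fraction(1, 2) - Fraction(1, 2**30)
```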
Should there be another property, called “Rearrangement”?
Rearrangement: you may apply an infinite number of commutativity (x + y = y + x) and associativity ((x + y) + z = x + (y + z)) rewrites to a lottery.
(In contrast, I’m pretty sure you can’t get an Antisymmetry/Unbounded/Dominance violation by applying only finitely many commutativity and associativity rearrangements.)
I don’t actually have a sense of what “infinite lotteries, considered equivalent up to finite but not infinite rearrangements” look like. Maybe it’s not a sensible thing.