The proof doesn’t run for me. The only way I know of to be able to rearrange the terms in an infinite series is if the starting series converges and the resulting series converges. The series here doesn’t fulfill that condition, so I am not convinced the rewrite is a safe step.
I am a bit unsure about my maths, so I am going to exaggerate the kind of flawed logic I read into the proof. Start with a series that might not converge, 1+1+1+1+1+1… (oh, it indeed blatantly diverges), then split each term to include a do-nothing addition, (1+0)+(1+0)+(1+0)+(1+0)… . Blatantly disregard the safety rules about parentheses messing with series and just treat them as parentheses that follow the familiar rules, 1+0+1+0+1+0+1+0+1…, and so 1+1+1+1… comes out not equal to itself. (An unsafe step leads to nonsense.)
With convergent series it doesn’t matter whether we get to the limit “twice as fast”, but the “rate of ascension” might matter to whatever analogue of a value a divergent series has.
The correct condition for real numbers would be absolute convergence (otherwise the sum after rearrangement might become different and/or infinite) but you are right: the series rearrangement is definitely illegal here.
But in the post I’m rearranging a series of probabilities, 1/2, 1/4, …, which is very legal. The fact that you can’t rearrange infinite sums is an intuitive reason to reject Weak Dominance, and then the question is how you feel about that.
Those probabilities are multiplied by the X_i’s, which makes it more complicated.
If I try running it with the X’s being real numbers (which is probably the most popular choice for utility measurement), the proof breaks down. If I, for example, allow negative utilities, I can rearrange the series from a divergent one into a convergent one and vice versa, trivially leading to a contradiction just from the fact that I am allowed to do weird things with infinite series, and not because the proposed axioms are contradictory.
EDIT: concisely, your axioms do not imply that the rearrangement should result in the same utility.
The rearrangement property you’re rejecting is basically what Paul is calling the “rules of probability” that he is considering rejecting.
If you have a probability distribution over infinitely (but countably) many probability distributions, each of which is of finite support, then it is in fact legal to “expand out” the probabilities to get one distribution over the underlying (countably infinite) domain. This is standard in probability theory, and it implies the rearrangement property that bothers you.
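A minimal sketch of that “expanding out” operation (my own illustration; the distributions are made-up example data): a mixture over finitely-supported distributions flattens into a single distribution over the underlying domain, and the flattened probabilities still sum to 1.

```python
from fractions import Fraction as F

# Outer distribution over three inner distributions, each with finite support.
# All numbers here are made-up example data.
outer = [
    (F(1, 2), {"a": F(1, 2), "b": F(1, 2)}),
    (F(1, 4), {"a": F(1, 2), "c": F(1, 2)}),
    (F(1, 4), {"b": F(1, 3), "c": F(2, 3)}),
]

flattened = {}
for q, inner in outer:
    for outcome, p in inner.items():
        flattened[outcome] = flattened.get(outcome, F(0)) + q * p

for outcome, p in sorted(flattened.items()):
    print(outcome, p)              # a 3/8, b 1/3, c 7/24
print(sum(flattened.values()))     # 1
```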
Oh, thanks, I did not think about that! Now everything makes much more sense.
I’m not rearranging a sum of real numbers. I’m showing that no relationship < over probability distributions satisfies a given dominance condition.
I am not familiar enough with the rules of lotteries and mixtures to know whether the mixture rewrite is valid or not. If the outcomes were, for example, money payouts, then the operations carried out would be invalid. I would be surprised if somehow the rules for lotteries made this okay.
The bit where there are too many implicit steps for me is:

Consider the lottery X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + …
We can write X_∞ as a mixture:
X_∞ = (1/2)((1/2)X_0 + (1/2)X_1) + (1/4)((1/2)X_0 + (1/2)X_2) + (1/8)((1/2)X_0 + (1/2)X_4) + …
I would benefit from baby-stepping through this process, or at least pointers to what I need to learn to be convinced of this.
I’m using the usual machinery of probability theory, and particularly countable additivity. It may be reasonable to give up on that, and so I think the biggest assumption I made at the beginning was that we were defining a probability distribution over arbitrary lotteries and working with the space of probability distributions.
A way to look at it is: the things I’m taking sums over are the probabilities of possible outcomes. I’m never talking anywhere about utilities or cash payouts or anything else. The fact that I labeled some symbols X_8 does not mean that the real number 8 is involved anywhere.
But these sums over the probabilities of worlds are extremely convergent. I’m not doing any “rearrangement,” I’m just calculating ∑_{k=n+1}^∞ (1/2)^k = (1/2)^n.
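Spelled out, that sum is just the tail of a geometric series: ∑_{k=n+1}^∞ (1/2)^k = (1/2)^{n+1} + (1/2)^{n+2} + … = (1/2)^{n+1} / (1 − 1/2) = (1/2)^n.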
So there are some missing axioms here, describing what happens when you construct lotteries out of other lotteries. Specifically, the rearranging step Slider asks about is not justified by the explicitly given axioms alone: it needs something along the lines of “if for each i we have a lottery ∑_j p_ij X_j, then the values of the lotteries ∑_i q_i (∑_j p_ij X_j) and ∑_j (∑_i q_i p_ij) X_j are equal”.
(Your derivation only actually uses this in the special case where for each i only finitely many of the p_ij are nonzero.)
You might want to say either that these two “different” lotteries have equal value, or else that they are in fact the same lottery.
In either case, it seems to me that someone might dispute the axiom in question (intuitively obvious though it seems, just like the others). You’ve chosen a notation for lotteries that makes an analogy with infinite series; if we take this seriously, we notice that this sort of rearrangement absolutely can change whether the series converges and to what value if so. How sure are you that rearranging lotteries is safer than rearranging sums of real numbers?
(The sums of the probabilities are extremely convergent, yes. But the probabilities are (formally) multiplying outcomes whose values we are supposing are correspondingly divergent. Again, I am not sure I want to assume that this sort of manipulation is safe.)
I’m handling lotteries as probability distributions over an outcome space Ω, not as formal sums of outcomes.
To make things simple you can assume Ω is countable. Then a lottery A assigns a real number A(ω) to each ω ∈ Ω, representing its probability under the lottery A, such that ∑_{ω∈Ω} A(ω) = 1. The sum ∑ p_i A_i is defined by (∑ p_i A_i)(ω) = ∑ p_i A_i(ω). And all these infinite sums of real numbers are in turn defined as the suprema of the finite sums, which are easily seen to exist and to still sum to 1. (All of this is conventional notation.) Then ∑_i q_i (∑_j p_ij A_j) and ∑_j (∑_i q_i p_ij) A_j are exactly equal.
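To make this concrete, here is a minimal sketch (my own; the variable names and truncation level are not from the post) that treats lotteries as dicts mapping outcomes to probabilities, builds the nested mixture from the post pointwise, and checks that it agrees with the directly written form outcome by outcome:

```python
from math import isclose

def mix(weighted_lotteries):
    """Pointwise mixture: (sum_i q_i * A_i)(w) = sum_i q_i * A_i(w)."""
    out = {}
    for q, lottery in weighted_lotteries:
        for outcome, p in lottery.items():
            out[outcome] = out.get(outcome, 0.0) + q * p
    return out

N = 40  # truncation level; the only infinite sum involved is geometric, so this is plenty

# X_infinity written directly: (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + ...
direct = mix([(0.5 ** (i + 1), {("X", 0 if i == 0 else 2 ** (i - 1)): 1.0})
              for i in range(N)])

# The same lottery written as the mixture of two-outcome lotteries:
# (1/2)((1/2)X_0 + (1/2)X_1) + (1/4)((1/2)X_0 + (1/2)X_2) + (1/8)((1/2)X_0 + (1/2)X_4) + ...
nested = mix([(0.5 ** i, {("X", 0): 0.5, ("X", 2 ** (i - 1)): 0.5})
              for i in range(1, N)])

for outcome in direct:
    assert isclose(direct[outcome], nested.get(outcome, 0.0), abs_tol=1e-9)
print("coefficient of X_0 in the nested form:", nested[("X", 0)])  # tends to 1/2 as N grows
```

Only the coefficient of X_0 involves an infinite (geometric) sum; every other outcome’s coefficient matches term for term.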
OK! But I still feel like there’s something being swept under the carpet here. And I think I’ve managed to put my finger on what’s bothering me.
There are various things we could require our agents to have preferences over, but I am not sure that probability distributions over outcomes is the best choice. (Even though I do agree that the things we want our agents to have preferences over have essentially the same probabilistic structure.)
A weaker assumption we might make about agents’ preferences is that they are over possibly-uncertain situations, expressed in terms of the agent’s epistemic state.
And I don’t think “nested” possibly-uncertain-situations even exist. There is no such thing as assigning 50% probability to each of (1) assigning 50% probability to each of A and B, and (2) assigning 50% probability to each of A and C. There is such a thing as assigning 50% probability now to assigning those different probabilities in five minutes, and by the law of iterated expectations your final probabilities for A,B,C must then obey the distributive law, but the situations are still not literally the same, and I think that in divergent-utility situations we can’t assume that your preferences depend only on the final outcome distribution.
Another way to say this is that, given that the A_i and B_i are lotteries rather than actual outcomes, and that combinations like ∑ p_i A_i mean something more complicated than they may initially look like they mean, the dominance axioms are less obvious than the notation makes them look. Even though there are no divergences in the sums-over-probabilities that arise when you do the calculations, there are divergences in the implied something-like-sums-over-weighted-utilities, and in my formulation you really are having to rearrange outcomes as well as probabilities when you do the calculations.
I agree that in the real world you’d have something like “I’m uncertain about whether X or Y will happen, call it 50⁄50. If X happens, I’m 50⁄50 about whether A or B will happen. If Y happens, I’m 50⁄50 about whether B or C will happen.” And it’s not obvious that this should be the same as being 50⁄50 between B or X, and conditioned on X being 50⁄50 between A or C.
Having those two situations be different is kind of what I mean by giving up on probabilities—your preferences are no longer a function of the probability that outcomes occur, they are a more complicated function of your epistemic state, and so it’s not correct to summarize your epistemic state as a probability distribution over outcomes.
I don’t think this is totally crazy, but I think it’s worth recognizing it as a fairly drastic move.
Would a decision theory like this count as “giving up on probabilities” in the sense in which you mean it here?
To anyone who is still not convinced—that last move, ∑_i ∑_j q_i p_ij A_j = ∑_j ∑_i q_i p_ij A_j, is justified by Tonelli’s theorem, merely because q_i p_ij A_j(ω) ≥ 0 (for all i, j, ω).
The way I look at this is that objects like (1/2)X_0 + (1/2)X_1 live in a function space like X → R_{≥0}, specifically the subspace of that where the functions f are integrable with respect to counting measure on X and ∑_{x∈X} f(x) = 1. In other words, objects like f_1 := (1/2)X_0 + (1/2)X_1 are probability mass functions (pmf). f_1(X_0) is 1/2, and f_1(X_1) is 1/2, and f_1 of anything else is 0. When we write what looks like an infinite series λ_1 f_1 + λ_2 f_2 + ⋯, what this really means is that we’re defining a new f by pointwise infinite summation: f(x) := ∑_{i=1}^∞ λ_i f_i(x). So only each collection of terms that contains a given X_k needs to form a convergent series in order for this new f to be well-defined. And for it to equal another f′, the convergent sums only need to be equal pointwise (for each X_k, f(X_k) = f′(X_k)). In Paul’s proof above, the only X_k for which the collection of terms containing it is even infinite is X_0. That’s the reason he’s “just calculating” that one sum.
The outcomes have the property that, step-wise, each is more than double the worth of the previous one.
In X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + … the real part only halves on each term. So as the series goes on, each term gets bigger and bigger, instead of smaller and smaller as in a convergent-like scenario. So it seems to me that even in isolation this is a divergent-like series.
Here’s a concrete example. Start with a sum that converges to 0 (in fact every partial sum is 0):
0 + 0 + …
Regroup the terms a bit:
= (1 + −1) + (1 + −1) + …
= 1 + (-1 + 1) + (-1 + 1) + …
= 1 + 0 + 0 + …
and you get a sum that converges to 1 (in fact every partial sum is 1). I realize that the things you’re summing are probability distributions over outcomes and not real numbers, but do you have reason to believe that they’re better behaved than real numbers in infinite sums? I’m not immediately seeing how countable additivity helps. Sorry if that should be obvious.
Your argument doesn’t go through if you restrict yourself to infinite weighted averages with nonnegative weights.
Aha. So if a sum of non-negative numbers converges, then any rearrangement of that sum will converge to the same number, but not so for sums of possibly-negative numbers?
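A quick illustration of that dichotomy (my own example, not from the thread; this is the Riemann series theorem in action): rearranging a conditionally convergent series of mixed-sign terms can change its sum, while a convergent series of non-negative terms gives the same value in any order.

```python
# Alternating harmonic series 1 - 1/2 + 1/3 - 1/4 + ... converges to ln(2) ~ 0.693.
# The classic rearrangement "one positive term, then two negative terms" converges
# to ln(2)/2 instead. A non-negative series (1/2^k) gives the same sum in any order.
from math import log

N = 200_000

original = sum((-1) ** (k + 1) / k for k in range(1, N + 1))

rearranged = 0.0
pos, neg = 1, 2  # next odd denominator, next even denominator
for _ in range(N // 3):
    rearranged += 1 / pos            # one positive term
    rearranged -= 1 / neg            # two negative terms
    rearranged -= 1 / (neg + 2)
    pos += 2
    neg += 4

nonneg_forward = sum(0.5 ** k for k in range(60))
nonneg_backward = sum(0.5 ** k for k in reversed(range(60)))

print(original, log(2))                 # ~0.693 for both
print(rearranged, log(2) / 2)           # ~0.347 for both
print(nonneg_forward, nonneg_backward)  # ~2.0 for both orders
```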
Ok, another angle. If you take Christiano’s lottery:
X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + ...
and map outcomes to their utilities, setting the utility of X_0 to 1, of X_1 to 2, etc., you get:
1/2+1/2+1/2+1/2+...
Looking at how the utility gets rearranged after the “we can write X∞ as a mixture” step, the first “1/2″ term is getting “smeared” across the rest of the terms, giving:
3/4+5/8+9/16+17/32+...
which is a sequence of utilities that are term-by-term higher. This is an essential part of the violation of Antisymmetry/Unbounded/Dominance. My intuition says that a strange thing happened when you rearranged the terms of the lottery, and maybe you shouldn’t do that.
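(A quick check of that arithmetic, reading “etc.” as the utility doubling with each successive listed outcome, i.e. U(X_0) = 1, U(X_1) = 2, U(X_2) = 4, U(X_4) = 8, …: the n-th term of the mixture contributes (1/2^n)·((1/2)·1 + (1/2)·2^n) = 1/2 + 1/2^{n+1}, which gives exactly 3/4, 5/8, 9/16, 17/32, … as stated.)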
Should there be another property, called “Rearrangement”?
Rearrangement: you may apply an infinite number of commutativity (x+y = y+x) and associativity ((x+y)+z = x+(y+z)) rewrites to a lottery.
(In contrast, I’m pretty sure you can’t get an Antisymmetry/Unbounded/Dominance violation by applying only finitely many commutativity and associativity rearrangements.)
I don’t actually have a sense of what “infinite lotteries, considered equivalent up to finite but not infinite rearrangements” look like. Maybe it’s not a sensible thing.
I am having trouble trying to translate between the infinity-hiding style and the explicit-infinity style. My grievance with it might be stupid.
(1/2)X_0
split X_0 into equal parts, matching the number of terms in the final form:
(1/2)(ϵX_0 + ϵX_0 + ϵX_0 + ...)
move the scalar in:
(1/4)ϵX_0 + (1/8)ϵX_0 + (1/16)ϵX_0 + ...
combine scalars:
(ϵ/4)X_0 + (ϵ/8)X_0 + (ϵ/16)X_0 + ...
Take each of these separately to the rest of the original terms:
((ϵ/4)X_0 + (1/4)X_1) + ((ϵ/8)X_0 + (1/8)X_2) + ((ϵ/16)X_0 + (1/16)X_4) + ...
Combine scalars to try to hit closest to the target form:
(1/2)((ϵ/2)X_0 + (1/2)X_1) + (1/4)((ϵ/2)X_0 + (1/2)X_2) + (1/8)((ϵ/2)X_0 + (1/2)X_4) + ...
(ϵ/2)X_0 + (1/2)X_1 is then quite far from (1/2)X_0 + (1/2)X_1
Within real precision a single term hasn’t moved much: (ϵ/2)X_0 + (1/2)X_1 ∼ (1/2)X_1
This suggests to me that there are “levels of calibration” mixing here, corresponding to members of different Archimedean fields trying to intermingle. Normally, if one is allergic to infinity levels, there are ways to dance around them / think about them in different terms, but I am not fluent in translating between them.
New attempt
X_∞ = (1/2)X_0 + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + …
I think I now agree that X_0 can be written as (1/2)X_0 + (1/4)X_0 + (1/8)X_0 + ...
However, this uses a “de novo” indexing and only gets to
(1/2)((1/2)X_0 + (1/4)X_0 + (1/8)X_0 + ...) + (1/4)X_1 + (1/8)X_2 + (1/16)X_4 + …
Taking terms out from the inner thing crosses term lines of the outer summation, which counts as “messing with indexing” in my intuition. The suspect move just maps them out one to one:
((1/4)X_0 + (1/4)X_1) + ((1/8)X_0 + (1/8)X_2) + ((1/16)X_0 + (1/16)X_4) + ...
But why is this the permitted way? Could I jam the terms together differently, say applying them to every other term:
((1/4)X_0 + (1/4)X_1) + ((1/8)X_2) + ((1/8)X_0 + (1/16)X_4) + (1/32)X_8 + ((1/16)X_0 + (1/64)X_16) + ...
If I have (∑_{i=0}^{a} x_i) + (∑_{j=0}^{a} y_j), I am more confident that they “index at the same rate” to make ∑_{u=0}^{c} (x_u + y_u). However, if I have (∑_{i}^{a} x_i) + (∑_{j}^{b} y_j), I need more information about the relation of a and b to make sure that mixing them plays nicely. Say, in the case b = 2a, it is not okay to think only of the terms when mixing.
I had the same initial reaction. I believe the logic of the proof is fine (it is similar to the Mazur swindle), basically because it is not operating on real numbers, but rather on mixtures of distributions.
The issue is more: why would you expect the dominance condition to hold in the first place? If you allow for unbounded utility functions, then you have to give it up anyway, for kind of trivial reasons. Consider two sequences A_i and B_i of gambles such that E[A_i] < E[B_i] and ∑_i p_i E[A_i] and ∑_i p_i E[B_i] both diverge. Does it follow that E[∑_i p_i A_i] < E[∑_i p_i B_i]? Obviously not, since both quantities diverge; at best you can say ≤.

A bit more formally: in real analysis/measure theory one works with the so-called extended real numbers, in which the value “infinity” is assigned to any divergent sum, with this value defined by the algebraic property x ≤ ∞ for any x. In particular, there is no x in the extended real numbers such that ∞ < x. So at least in standard axiomatizations of measure theory, you cannot expect the strict dominance condition to hold in complete generality; you will have to make some kind of exception for infinite values. Similar considerations apply to the Intermediate Mixtures assumption.
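For a concrete instance (my own illustration, not from the comment): take p_i = 2^{-i} for i = 1, 2, …, let A_i pay 2^i utils for sure and B_i pay 2^i + 1 utils for sure. Then E[A_i] = 2^i < 2^i + 1 = E[B_i] for every i, but ∑_i p_i E[A_i] = ∑_i 1 = ∞ and ∑_i p_i E[B_i] = ∑_i (1 + 2^{-i}) = ∞, so in the extended reals the two mixtures compare only as ∞ ≤ ∞ and strict dominance fails.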
With surreals I might have transfinite quantities that can reliably compare every which way despite both members being beyond a finite bound. For “tame” entities all kinds of nice properties are easy to get/prove. The game of “how wild my entities can get while retaining a certain property” is a very different game. “These properties are impossible to get even for super-wild things” is even harder.
The Mazur swindle seems (at least based on the Wikipedia article) not to be a valid proof technique for all settings, so it warrants special attention whether the applicability conditions are met here or not.
The sum we’re rearranging isn’t a sum of real numbers, it’s a sum in ℓ1. Ignoring details of what ℓ1 means… the two rearrangements give the same sum! So I don’t understand what your argument is.
Abstracting away the addition and working in an arbitrary topological space, the argument goes like this: L = lim x_n = lim y_n. For all n, f(x_n) = 0 and f(y_n) = 1. Therefore, f is not continuous (else 0 = 1).
If ℓ1 is something weird, then I don’t necessarily even know that x + y = y + x; it is not a given at all that rearrangement would be permissible.
In order to sensibly compare lim x_n and lim y_n it would be nice if they both existed and were not infinities. L = lim x_n = lim y_n = ∞ is not useful for transferring equalities between the x’s and y’s.
L is not equal to infinity; that’s a type error. L is equal to 1⁄2 A_0 + 1⁄4 A_1 + 1⁄8 A_2 …
ℓ1 is a bona fide vector space—addition behaves as you expect. The points are infinite sequences (x_i) such that ∑_i |x_i| is finite. This sum is a norm, and the space is Banach with respect to that norm.
Concretely, our interpretation is that x_i is the probability of being in world A_i.
A utility function is a linear functional, i.e. a map from points to real numbers such that the map commutes with addition. The space of continuous linear functionals on ℓ1 is ℓ∞, which is the space of bounded sequences. A special case of this post is that unbounded linear functionals are not continuous. I say ‘special case’ because the class of “preference between points” is richer than the class of utility functions. You get a preference order from a utility function via “map to real numbers and use the order there.” The utility function framework e.g. forces every pair of worlds to be comparable, but the more general framework doesn’t require this—Paul’s theorem follows from weaker assumptions.
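A small sketch of the “unbounded linear functionals are not continuous” point (my own illustration; the utility values and the particular sequence of lotteries are hypothetical, not from the post): lotteries that converge in ℓ1 norm to a point mass can nevertheless have expected utilities that blow up when the utility assignment is unbounded.

```python
# Worlds A_0, A_1, A_2, ... with unbounded utilities u_k = 2^k (hypothetical choice).
# The lotteries L_n = (1 - 1/n) * A_0 + (1/n) * A_n converge to the point mass on A_0
# in l^1 norm (distance 2/n -> 0), but their expected utilities ~ 2^n / n diverge,
# so this utility functional is not continuous on l^1.
def utility(k):
    return 2.0 ** k

def lottery(n):
    """pmf of L_n as a dict: world index -> probability."""
    return {0: 1.0 - 1.0 / n, n: 1.0 / n}

def l1_distance(p, q):
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def expected_utility(p):
    return sum(prob * utility(k) for k, prob in p.items())

point_mass = {0: 1.0}
for n in (10, 20, 40, 80):
    Ln = lottery(n)
    print(n, l1_distance(Ln, point_mass), expected_utility(Ln))
# distances shrink toward 0 while expected utilities explode
```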
The presentation tries to deal with unbounded utilities. Assuming ∑_i |x_i| to be finite excludes the target of investigation from the scope.
Supposedly there are multiple text input methods, but at least on the website I can highlight text and use an f(x) button to get math rendering.
I don’t know enough about these fancy spaces to say whether a version where the norm can take on transfinite or infinitesimal values makes sense, or one where the elements are just sequences without any convergence condition. Either (real number times an outcome) is a type for which a finiteness check doesn’t make sense, or the allowable conversions from outcomes to real numbers force the sum to be bigger than any real number.
Requiring ∑_i |x_i| to be finite is just part of assuming the x_i form a probability distribution over worlds. I think you’re confused about the type difference between the A_i and the utility of A_i. (Where in the context of this post, the utility is just represented by an element of a poset.)
I’m not advocating for or making arguments about any fanciness related to infinitesimals or different infinite values or anything like that.