As several people mentioned in the comments on the other post, it feels like the most natural way to get around this is to only have beliefs which are finite-support probability distributions. If you do that, the paradoxes go away.
Of course, if we do that, then the set of our beliefs is no longer “complete” in the topological sense; that is, if we pick a distance metric between distributions (such as total variation distance), then for the set of finite-support distributions, Cauchy sequences do not necessarily converge.
This suggests that perhaps we can represent our beliefs as Cauchy sequences of finite-support distributions. The paradox can then be rephrased in this setting: it says that if you have unbounded utilities, then a Cauchy sequence of beliefs might have an expected utility that does not converge (e.g. it might swing wildly between positive and negative, and the swings can even get worse and worse as we go further down the sequence, despite the sequence being Cauchy and hence “intuitively convergent”).
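To make that concrete, here is a minimal sketch of one such sequence. The particular beliefs (`belief(n)` puts mass 2^-n on an outcome labelled n and the rest on outcome 0) and the utility (-4)^n are my own illustrative assumptions, not taken from the other post. The total variation distance between `belief(n)` and any later belief is 2^-n, so the sequence is Cauchy, yet the expected utility is (-2)^n.

```python
from fractions import Fraction

def belief(n):
    """Hypothetical finite-support belief P_n: outcome 0 with probability
    1 - 2^-n, outcome n with probability 2^-n."""
    eps = Fraction(1, 2**n)
    return {0: 1 - eps, n: eps}

def utility(outcome):
    """Hypothetical unbounded utility: 0 for outcome 0, (-4)^k for outcome k."""
    return 0 if outcome == 0 else (-4) ** outcome

def total_variation(p, q):
    """Total variation distance between two finite-support distributions."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0) - q.get(x, 0)) for x in support) / 2

def expected_utility(p):
    return sum(prob * utility(x) for x, prob in p.items())

# Distances between consecutive beliefs shrink, expected utilities blow up.
for n in range(1, 7):
    print(n, total_variation(belief(n), belief(n + 1)), expected_utility(belief(n)))
```

Exact rationals are used so that the shrinking distances and the growing swings are not blurred by rounding.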
I mentioned this briefly in a footnote on the other post. The summary is that it’s not exactly clear to me what it means to have “unbounded utility functions” if you think there are only finitely many conceivable outcomes. Isn’t there then some best outcome, out of the 10^30 that you think deserve non-zero probability?
Perhaps there could be infinitely many possible decisions, but that each decision involves only finitely many possible outcomes? But that seems implausible to me. For example, consider my parents making a decision about how to raise me—if there are infinitely many decisions I might face, then it seems like there are infinitely many possible outcomes from their decision. To me this seems worse than abstract worries about continuity.
And if there are infinitely many possible outcomes of a decision, what does it mean to force my beliefs to have finite support? If I just consider a single set of finitely-supported beliefs, what exactly am I doing? If I take limits, then as you point out we can end up back at the same paradox.
I guess the out here would be to represent outcomes as sequences of finitely supported probability distributions, effectively adding additional structure (that is presumably related to how that distribution came about). That means that I don’t need to be indifferent between two sequences with the same limit; I can care about that extra data.
This is the kind of thing I have in mind by abandoning probability theory and representing my uncertainty with some richer structure. I don’t find “sequence of finitely-supported probability distributions” particularly compelling but it seems like something you could try (and if you did it that way maybe you wouldn’t have to give up on probability theory, though as I suggested I suspect that’s where this road will end).
I guess the two questions, for that and any other proposal, would be: (i) where does this extra structure come from? what about my epistemic state determines how it gets represented as a sequence? (ii) are there any sensible preferences over the new enlarged space?
(I will probably make some posts in the future with more concrete examples of how totally messed up the “intuitive” unbounded utility functions are, which will hopefully make those concerns sharper.)
The way I was envisioning it, there would be infinitely many possible outcomes but you could only have a belief about finitely many of them at one time.
I don’t think this is too outrageous—for example, if there were uncountably many possible outcomes then we all agree that (no matter the setup) there would be unmeasurable sets that you could not have a belief over.
The main motivation here is just that this is a mathematically nice way to set it up. For example, if the set of all possible outcomes is A, then conv(A) (the convex hull of A) will be the set of all finite-support probability distributions over A—it comes up naturally.
[More formal version: identify the set A as a subset of the vector space R^A of functions from A to R, where each element x of A is identified with the characteristic function that returns 1 on input x and 0 otherwise. Then the convex hull of A can be defined as the intersection of all convex supersets of A (all of which are subsets of the vector space R^A). It is then a relatively straightforward theorem that this convex hull of A happens to be exactly the set of all functions A→R such that (1) they return 0 on all but finitely many elements, (2) they have non-negative range, and (3) the sum of their non-zero outputs is exactly 1; in other words, the convex hull of A is exactly the set of finite-support probability distributions over the set A. My point here is merely that finite-support probability distributions came up naturally, even in the context of an infinite outcome space A, and even though the definition of convex hull did not explicitly mention finite supports in any way.]
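A small sketch of the same idea in code, under my own assumptions: a distribution is a dict from outcomes to exact rational masses, `point_mass` plays the role of the characteristic function of an outcome, and `mix` is a convex combination. The helper names are hypothetical. Note that a dict is finite by construction, so this only illustrates closure under mixing; it is not a proof of the theorem.

```python
from fractions import Fraction

def point_mass(x):
    """The element x of A, viewed as its characteristic function."""
    return {x: Fraction(1)}

def mix(weight, p, q):
    """Convex combination weight*p + (1 - weight)*q, dropping zero entries."""
    if not 0 <= weight <= 1:
        raise ValueError("weight must lie in [0, 1]")
    out = {}
    for x in set(p) | set(q):
        mass = weight * p.get(x, 0) + (1 - weight) * q.get(x, 0)
        if mass != 0:
            out[x] = mass
    return out

def is_finite_support_distribution(p):
    """Check conditions (2) and (3): non-negative masses that sum to 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

# Mixing point masses step by step never leaves the set of distributions.
a = mix(Fraction(1, 3), point_mass("x"), point_mass("y"))
b = mix(Fraction(1, 2), a, point_mass("z"))
print(b, is_finite_support_distribution(b))
```

Every element built this way from point masses has finite support and total mass 1, which is the direction of the theorem that is easy to see by hand.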
Upon reflection, I agree that sequences of such finite-support distributions are kind of an ugly hack. In particular, it’s not clear how to mix together two such sequences (i.e. how to take a convex combination of them, something we may want to do with our beliefs).
We can just stick to finite-support distributions themselves, without allowing sequences of them. (Perhaps a motivation could be that our finite brains can only think about finitely many plausible outcomes at a time, or something like that.) In that case, I think the main drawback is only that we cannot model the St. Petersburg paradox. However, given your counterexamples, perhaps this is a feature rather than a bug...
I guess I’m confused about how to represent my current beliefs with a finitely-supported probability distribution. It looks to me like there are infinitely many ways the universe could be (in the sense that e.g. I could start listing them and never stop, or that there are functions f:universes→universes for which f(U) is bigger than U while still being plausible).
I don’t expect to enumerate all these infinitely many universes, but practically how am I supposed to think about my preferences if it feels like there are clearly infinitely many possible states of affairs?
Your comment gave me pause, and certainly makes me lean away from finite-support probability distributions somewhat.
However, if the problem is that you can actively generate more and more plausible universes without stopping, then it does seem at some level like your belief structure is a sequence of finite-support probability distributions, doesn’t it? As you mentally generate more and more plausible universes, your belief gets updated to a distribution with larger and larger support. The main problem is just that “sequence of distributions” is a much uglier mathematical object than a single distribution.
Another thought: if you can actively mentally generate more and more possible universes, and if, in addition, the universes you generate have such large utilities that they become “more and more important” to consider (i.e. even after multiplying by their diminishing probabilities, the absolute value of probability*utility is increasing), then you are screwed. This was shown nicely by your examples. So in some sense, we have to restrict to situations where the possible universes you mentally generate are diminishing in importance (i.e. even if their utility is increasing, their probability is diminishing fast enough to make the sequence absolutely convergent).
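As a rough numerical illustration of “diminishing in importance”, the sketch below compares partial sums of |probability*utility| for two hypothetical utility assignments over universes n = 1, 2, 3, ... with probability 2^-n. The St. Petersburg-style utility 2^n gives terms that do not even shrink, so the sums grow without bound, while the utility n gives sums that approach 2. Both utility choices are my own examples.

```python
from fractions import Fraction

def partial_sums_of_importance(prob, util, terms):
    """Running totals of |prob(n) * util(n)| for n = 1..terms."""
    total = Fraction(0)
    sums = []
    for n in range(1, terms + 1):
        total += abs(prob(n) * util(n))
        sums.append(total)
    return sums

def prob(n):
    """Hypothetical probability of the n-th generated universe."""
    return Fraction(1, 2**n)

def st_petersburg(n):
    """Utility doubles as fast as probability halves: each term is 1."""
    return 2**n

def linear(n):
    """Unbounded utility that grows slowly enough to converge absolutely."""
    return n

print(partial_sums_of_importance(prob, st_petersburg, 10)[-1])
print(partial_sums_of_importance(prob, linear, 10)[-1])
```

The second utility is still unbounded, so the restriction is on how utility and probability trade off, not on utility alone.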
Does this approach mean that questions like “How far do I prefer my neighbour’s house to be from mine?” are still answerable?
If you believe that spacetime is discrete at the Planck scale, then there are only finitely many options for how far your neighbor’s house can be from yours. I tend to think that finite-support probability distributions are sufficient for this task… even if spacetime is continuous, we can get a good-enough approximation by assuming it is discrete at the Planck scale.
(Is there some context I’m missing here? I don’t know if I’m supposed to recognize your example.)
I am trying to ask about the limits of the approach by formulating something like the most reasonable case where capturing the innumerable aspects of the topic is actually on point. One could think that 3, 3.1, π and the golden ratio would be perfectly legitimate options, and that questions of the form “Do you prefer your neighbour to be A far away or B far away?” would need to be answerable for all valid options; one way of conceiving this is for A and B to be arbitrary reals. With the “good-enough approximation” we don’t talk about being π distance away because we don’t believe in truly transcendental distances. There is one distance a little beyond that and one a little short of that, and claims need to be about those.
Well, technically you can still restrict to finite-support probability distributions even if your outcome space is infinite. So even if you allow all real numbers as distances (and have utilities for each), you can restrict your set of beliefs to have finite support at any given time (i.e. at one point in time you might believe the distance to be one of {3,3.1,pi} or any other finite set of reals, and you may pick any distribution over that finite set). This setup still avoids Paul’s paradoxes.
Having said that, I have trouble seeing why you’d need to do this for the specific case of distances. Computers already use floating-point arithmetic (of finite precision) to estimate real numbers, and not much goes wrong there. So computers are already restricting the set of possible distances to a finite set.
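For what it’s worth, this is easy to see directly. The sketch below assumes Python 3.9 or later (for `math.nextafter`) and the usual IEEE 754 double format. It shows that the float called `math.pi` is just one grid point, with a nearest representable neighbour on each side, so “a distance of exactly π” is not among the options.

```python
import math
from fractions import Fraction

x = math.pi  # the double nearest to π, not π itself

# Nearest representable doubles just below and just above x.
below = math.nextafter(x, -math.inf)
above = math.nextafter(x, math.inf)

# Fraction(float) is exact, so this is the true spacing of the grid near π.
gap = Fraction(above) - Fraction(x)

print(below, x, above)
print(gap == Fraction(1, 2**51))
```

The grid spacing near π is 2^-51, far finer than any distance we could measure, which is the sense in which the approximation is “good enough”.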
Any graduation is likely to miss the exact distance. Then, if I were put in a situation where I could become sensitive to that distinction, I would need to go from not having included it in the support to having included it, i.e. from zero probability to non-zero probability. This seems like a smell that things are not genuinely comparable to the safety of avoiding unbounded utilities, so I am not sure whether it is an improvement.
Even computers can do symbolic manipulation and get exact results. They are not forced to numerically simulate everything. The intersection of two lines can be determined exactly in a finite computation, even though checking by “brute force”, point by point, whether the lines are in the same location would call for more than countably many steps.
I have an intuition/introspective impression that there are objects like “distance is {3, 3.1, between 3.2 and 3.3}” where the three categories are equiprobable and distances within “3.2 to 3.3” are equiprobable to each other, but where “3.2 to 3.3” is not made up of listable separate beliefs. (More realistically, distances tend not to feel exactly equiprobable within the whole range.)