However, in order for expected utility to converge unconditionally, either the probability of the threat being carried out must shrink faster than the disutility grows, or the probability of the threat being made at all must shrink that fast. In other words, either someone threatening 3^^^3 people is so unlikely to carry it out that the threat is not threatening, or the threat itself is so difficult to make that you don’t have to worry about it.
Except you get this result by making up probabilities rather than arriving at them through any rational process. This has been discussed here many times before, including in the sequences and very recently. Downvoted.
I disagree that the above is not a new contribution to thought on this. The issue at stake has to do with restricting the set of permissible utility functions. If we have a probability measure induced by our empirical observations, then it doesn’t do any good from a rationalism standpoint to allow non-summable or non-integrable utility functions with respect to that probability measure.
This example shows one such case. Suppose Nature hands me a probability distribution over some sequence of events, P(Xn) = 2^{-n}. Then there is a meta-probability assignment over the space of utility functions I can assign to the events Xn and it involves the resulting expectations. You can think of it like a Dirichlet distribution.
It makes no sense to speak of utility functions that aren’t L1(problem domain) (respectively, l1(problem domain)) under the probability measure you believe to be true about the situation.
I think Pascal’s mugging suffers from this issue. For any valid probability distribution over the number of lives at stake, I can produce utility functions for valuing lives that produce arbitrarily different output decisions. In reality, though, you can’t decouple the choice of a “permissible” utility function from the exact same processes that yield some knowledge or model about the probability distribution over lives threatened.
I could go get some evidence about probability of lives threatened, then internally reflect on how I should choose to assign value to lives, then compute joint probability distributions over both the threatened lives and all my different options for utility functions on the space of threatened lives, then internally reflect on how to value joint configurations of (threatened lives, utility functions over spaces of threatened lives), then compute joint probabilities over the 2-tuple consisting of ( (threatened lives, utility functions over threatened lives), utility functions over 2-tuples of (threatened lives, utility functions over threatened lives) ), and so on ad infinitum.
At some point, because brains have finite computing resources and (at least human brains) have a machine epsilon, I just have to stop this recursive computation, draw a line in the sand, accept some conditional probabilities at some deep ply of the computation, and then integrate my way all the way back down to the decision of choosing a utility function.
Nothing stops me from choosing a utility function that, when coupled with the probabilities that Nature gives me, causes my expectation to fail to be summable (integrable). I could, after all, act like The Ultimate Pessimist and assign a utility of -\infty to every outcome, for example. More realistically, I could choose a utility function that has the same shape as a Cauchy distribution. But in the landscape of meta-goals, or even just correspondence of utility functions to reality, this would be bad for me. How can I make decisions about which bets to accept if I am in a situation where Nature hands me an improper uniform prior over a set of different outcomes, and I choose to have a Cauchy distribution of personal utility over that set of outcomes? The idea of an expectation fails to even exist in that scenario. Hence, scalar multiples of Cauchy distributions don’t make much sense viewed as potential utility functions.
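For background, the Cauchy distribution is the textbook case of an expectation that does not exist, which seems to be what the paragraph above leans on. The sketch below is not the uniform-prior setup described there; it only illustrates the standard fact, under the assumption of a standard Cauchy density f(x) = 1/(pi * (1 + x^2)). The function name `truncated_cauchy_mean` is made up for illustration. The "mean" computed on a truncated range depends entirely on how the range is allowed to grow, so no single limit exists.

```python
import math


def truncated_cauchy_mean(lower: float, upper: float) -> float:
    """Integral of x * f(x) over [lower, upper] for the standard Cauchy density.

    Uses the closed-form antiderivative log(1 + x^2) / (2 * pi).
    """
    return (math.log1p(upper ** 2) - math.log1p(lower ** 2)) / (2 * math.pi)


if __name__ == "__main__":
    for a in (1e2, 1e4, 1e6):
        symmetric = truncated_cauchy_mean(-a, a)      # grows both tails equally
        lopsided = truncated_cauchy_mean(-a, 2 * a)   # right tail grows twice as fast
        print(f"a={a:g}  symmetric={symmetric:.6f}  lopsided={lopsided:.6f}")
```

The symmetric truncation stays at 0 while the lopsided one settles near log(2)/pi, so the answer is an artifact of the truncation scheme rather than a property of the distribution.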
The example here of conditional convergence is a very elementary one. More complicated issues like this arise when you think in terms of probability theory and functional analysis on the space of utility functions. But it’s a salient example nonetheless. If we choose utility functions such that the resultant expectation calculation includes a conditionally convergent, or worse non-summable, series, then we can’t accept or reject bets in a way that has meaningful correspondence to our perceived actual utility. Hence, implicitly, rationalists must make some time-saving admissibility criteria for what sorts of functions are even allowed to be utility functions.
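A minimal sketch of the conditional-convergence problem, assuming the distribution P(Xn) = 2^{-n} mentioned earlier and a hypothetical utility u(Xn) = (-1)^(n+1) * 2^n / n chosen purely for illustration. The probability-weighted terms are then (-1)^(n+1) / n, the alternating harmonic series. The function names are invented for this example. Summing the same terms in two different orders gives two different "expected utilities".

```python
import math


def weighted_utility(n: int) -> float:
    """P(X_n) * u(X_n) with P(X_n) = 2**-n and the assumed u(X_n) = (-1)**(n+1) * 2**n / n.

    The product is computed analytically to avoid overflow in 2**n.
    """
    return (-1) ** (n + 1) / n


def natural_order_sum(num_terms: int) -> float:
    """Sum the terms in the order n = 1, 2, 3, ..."""
    return sum(weighted_utility(n) for n in range(1, num_terms + 1))


def rearranged_sum(num_blocks: int) -> float:
    """Sum the same terms, taking two positive (odd n) terms per negative (even n) term."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(num_blocks):
        total += weighted_utility(odd) + weighted_utility(odd + 2)
        odd += 4
        total += weighted_utility(even)
        even += 2
    return total


if __name__ == "__main__":
    print("natural order:", natural_order_sum(200_000), "vs ln 2 =", math.log(2))
    print("rearranged:   ", rearranged_sum(100_000), "vs 1.5 ln 2 =", 1.5 * math.log(2))
```

Both orderings use every term exactly once in the limit, yet they approach ln 2 and 1.5 ln 2 respectively, so the order in which outcomes are enumerated would change which bets look acceptable.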
Getting rid of conditional convergence, or issues of non-measurability and non-integrability, would seem like intuitively plausible first steps in forming utility functions. Similar to the way that Jaynes showed how consistent formulations of belief in terms of wagers were isomorphic to probability theory, we have similar constraints on consistent use of utility functions. But the Cauchy distribution example above shows that, for utility functions, the restrictions must actually be quite a bit more severe than mere summability.
The fact that this is a problem does not make anything in the post novel. In the grandparent, I linked to discussions of this problem that touched on everything that you discussed here.
Since utility functions are only unique modulo affine transforms, you can’t combine them using naive expected utility. The correct method to do so is unknown.
I’m aware of this, but fail to see how it would change the ability to make probability distributions over the space of utility functions and then take expectations there. Sure, you’d be doing it over equivalence classes of functions, but that’s hardly any difficulty. What I am saying is you can assign utility to choices of utility functions: utility functions must inherently be recursive in practice. And so their non-summability (or other technical difficulties) causes immediate problems.
Utility functions are not primitive. They are constructed using an algorithm specified by vN&M (or Savage, or A&A). Constructed from preferences over lotteries over outcomes. Preferences are primitive. Priors over states of nature are primitive. Utility functions are constructs. They are not arbitrary.
As has been mentioned, if you constrain preferences using one of the standard vN&M axioms, and if you assume that you can construct a lottery leading to any outcome, then you can prove that outcome utilities are bounded.
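One common way to see why boundedness follows is a St. Petersburg-style construction. The sketch below is hedged: it assumes that countable mixtures count as lotteries and that the expected-utility representation extends to them, neither of which is spelled out in the comment above. The symbols x_n, L, A, B, C are introduced only for this illustration.

```latex
% Assumption: u is unbounded above, so we can pick outcomes x_n with u(x_n) \ge 2^n.
% Assumption: the countable mixture L below is an admissible lottery.
\[
  L = \sum_{n=1}^{\infty} 2^{-n}\, x_n ,
  \qquad
  \mathbb{E}[u(L)] = \sum_{n=1}^{\infty} 2^{-n}\, u(x_n)
  \;\ge\; \sum_{n=1}^{\infty} 1 = \infty .
\]
% Take A = L and any outcomes B \succ C with finite utility.
% For every p \in (0,1]:
\[
  \mathbb{E}\bigl[u\bigl(pL + (1-p)C\bigr)\bigr] = \infty > u(B),
\]
% so no mixture of L and C is ever dispreferred to B,
% which contradicts the Archimedean (continuity) axiom.
```

Under those assumptions, unbounded utility and unrestricted lottery construction cannot both be kept.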
I think that the OP needs to be seen as a proposal for constraining the freedom to construct arbitrary lottery-probes. And, if the constraint is properly defined, we can have an algorithm that generates unbounded utilities that are nonetheless well behaved: utilities that can only be used to construct unconditionally convergent expectations.
You had one link for changing the expected utility just to make Pascal’s mugging go away, and another that seems to be based on the same idea, but has flawed reasoning and a different conclusion.
The first link was to the comment, not the post; I disagree with the post. The proposal in the second link was qualitatively similar to yours and it failed for the same reason.
Using expected utility is implicitly using such a prior. If you want to use such a prior, how do you suggest replacing the concept of expected utility?
This is an open problem. I contest certain axioms (P6 and P7).
Do you also contest the Archimedean axiom for von Neumann’s formulation of utility?
Yes. (Well, it’s a bit more complicated than that; VNM utility theory doesn’t extend to choices with an infinite number of possible outcomes, so I reject the whole system.) I discussed this in more detail in the comments in the linked article. In brief, there is a chance that my utility function is bounded, but I am definitely not willing to bet the universe on it.
VNM definitely does extend to the case of infinitely many outcomes. It requires a continuous utility function, and thus continuous preferences and a topology in outcome space. Why is this additional modeling assumption any more problematic than other VNM axioms?
In short, because utilities may not converge. The axioms do not claim to be applicable an infinite number of times; if they did, they would run into all the usual problems with infinite series. There are modifications of the VNM theorem that extend to infinitely many outcomes, but they all either work only for certain infinite sets or require bounded utility.
This is exactly the stuff I was talking about. I mean, basic measure theory determines what functions you can even talk about. If you have a probability measure P, then utilities that are not in L^{1}_{P}(outcome domain) make no sense. You may need some more restrictions than that, but one can’t talk about expected utility if the utility is not at least L1. You cannot define a function w.r.t. a probability measure that has a support set of infinite Lebesgue measure, is unbounded, and has a defined expectation (the L1 norm)… unless you know that the rate of growth of the unbounded utility function behaves in certain nice ways when compared to the decay of the probability measure. You might be already saying this, but this much simply can’t be changed, no matter what you do. If your utility function is unbounded, then the probabilities for certain outcomes must decay faster than your utility grows. Since probabilities are given by nature and utilities (sort of) aren’t, my guess would be that utilities have to decay quickly (or, conversely, probabilities have to decay super quickly).
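The growth-versus-decay condition can be made concrete with a small sketch. It assumes the distribution P(Xn) = 2^{-n} used earlier in the thread and a hypothetical geometric utility u(Xn) = r^n; the function name and the choice of r are illustrative only. The expectation is finite exactly when the utility grows more slowly than the probability decays, that is, when r < 2.

```python
def partial_expected_utility(growth_rate: float, num_terms: int) -> float:
    """Partial sum of sum_n P(X_n) * u(X_n) with P(X_n) = 2**-n and assumed u(X_n) = growth_rate**n.

    Each term equals (growth_rate / 2)**n, so this is a geometric series.
    """
    ratio = growth_rate / 2.0
    return sum(ratio ** n for n in range(1, num_terms + 1))


if __name__ == "__main__":
    for r in (1.5, 2.0, 3.0):
        sums = [partial_expected_utility(r, n) for n in (10, 50, 100)]
        print(f"r={r}: partial sums after 10, 50, 100 terms = {sums}")
```

With r = 1.5 the partial sums settle at 3; with r = 2 they grow linearly in the number of terms; with r = 3 they grow geometrically, so in the last two cases no expectation exists.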
Nature does not require that it is possible to make utility functions converge at all. Also, nature neither requires that taking expectations be the only way of comparing choices, nor that utilities be real.
I totally agree and never meant to imply otherwise. But just as any consistent system of degrees of belief can be put into correspondence with the axioms of probability, so there are certain stipulations about what can reasonably be called a utility function.
I would argue that if you meet a conscious agent and your model of their utility function says that it doesn’t converge (in the appropriate L1 norm of the appropriate modeled probability space) then something’s wrong with that model of utility function… not with the assumption that utility functions should converge. There are many subtleties, I’m sure, but non-integrable utility functions seem futile to me. If something can be well-modeled by a non-integrable utility function, then I’m fine updating my position, but in years of learning and teaching probability theory, I’ve never encountered anything that would convince me of that.
Doesn’t this all assume that utility functions are real-valued?
No, all of the integrability theory (w.r.t. probability measures) extends straightforwardly to complex valued functions. See this and this.
Yes, good point. Is there any study of the most general objects to which integrability theory applies? Also, are you familiar with Martin Kruskal’s work on generalizing calculus to the surreal numbers? I am having difficulty locating any of his papers.
What comes to my mind are Bochner integrals and random elements. I’m not sure how much integrability theory one can develop outside of a Banach space, although you can get interesting fractal-type integrals when dealing with Hausdorff measure. Integrability theory is really just an extension of measure theory, which was pinned down in painstaking detail by Lebesgue, Caratheodory, Perron, Henstock, and Kurzweil (no relation to the singularity Kurzweil). The Henstock-Kurzweil (HK) integral is the most generalized integral over the reals and complex numbers that preserves certain nice properties, like the fundamental theorem of calculus. The name of the game in integration theory was never an attempt to find the most abstract workable definitions of integration, but rather to see under what general assumptions you could get physically meaningful results, like the mean value theorem or the fundamental theorem of calculus, to hold. Complex integration theory, especially in higher dimensions, shattered a lot of the preconceived notions of how functions should behave.
In looking up surreal numbers, it appears that Conway invented them and Knuth named and popularized them. I was surprised to learn that the hyperreal numbers (developed by Abraham Robinson) are contained in the surreals. My knowledge here is limited, since I focus more on applied math and am probably less familiar with the literature on surreal numbers than other LWers may be, but as far as I know there hasn’t been much work, if any, on defining an integral over the surreals. My guess, though, is that such an integral would wind up being an unsatisfyingly trivial extension of integration over the regular reals, as is the case for hyperreals.
I’ll definitely take a look at Kruskal’s papers and see what he’s come up with.
Every ordered field is contained within the surreals, which is why I find them promising for utility theory. The surreals themselves are not a field but a Field, since they form a proper class.
Another point worth noting is that on a set D of finite measure (which any measurable subset of a probability space is), L^{N}(D) is contained in L^{N-1}(D), and so if the first moment fails to exist (non-integrable, no defined expectation) then all higher moments fail and computation of order statistics fails. Of course nature doesn’t have to be modeled by statistics, but you’d be hard pressed to out-perform simple axiomatic formulations that just assume a topology and continuous preference functions, get on with it, and have access to higher order moments.
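The inclusion of L^{N}(D) in L^{N-1}(D) can be sketched with Hölder's inequality. The notation (measure \mu, exponents p < q) is introduced here for illustration and is not from the comment above.

```latex
% Assume \mu(D) < \infty and 1 \le p < q. Apply H\"older's inequality to
% |f|^p \cdot 1 with conjugate exponents q/p and q/(q-p):
\[
  \int_D |f|^p \, d\mu
  \;\le\;
  \Bigl( \int_D |f|^q \, d\mu \Bigr)^{p/q} \, \mu(D)^{\,1 - p/q},
\]
% which, after taking p-th roots, reads
\[
  \|f\|_{p} \;\le\; \mu(D)^{\frac{1}{p} - \frac{1}{q}} \, \|f\|_{q}.
\]
% With p = N-1 and q = N this gives L^{N}(D) \subseteq L^{N-1}(D).
% Contrapositive: if f \notin L^{1}(D), then f \notin L^{N}(D) for every N \ge 1.
```

On a probability space \mu(D) \le 1, so the norms are simply ordered: a missing first moment rules out every higher moment.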
How do you construct utility without the VNM axioms? Are there less strong axioms for which a VNM-like result holds?
EDIT: Sorry if this is covered in the comments in the other article, I’m being a bit lazy here and not reading through all of your comments there in detail.
I don’t yet. :) I have a few reasons to think that it has a good chance of being possible, but it has not been done.
Okay. If you end up being successful, I would be quite interested to know about it. (A counterexample would also be interesting, actually probably more interesting since it is less expected.)