This seems misguided. The normal VNM approach is to start with an agent whose behavior satisfies some common sense conditions: can’t be money pumped and so on. From that we can prove that the agent behaves as if maximizing the expectation of some function on outcomes, which we call the “utility function”. That function is not unique: you can apply a positive affine transformation and obtain another utility function describing the same behavior. The behavior is what’s real; utility functions are merely our descriptions of it.
From that perspective, it makes no sense to talk about “maximizing the geometric expectation of utility”. Utility is, by definition, the function whose (ordinary, not geometric) expectation is maximized by your behavior. That’s the whole reason for introducing the concept of utility.
The mistake is a bit similar to how people talk about “caring about other people’s utility, not just your own”. You cannot care about other people’s utility at the expense of your own; that’s a misuse of terms. If your behavior is consistent, then the function that describes it is called “your utility”.
The word ‘utility’ can be used in two different ways: normative and descriptive.
You are describing ‘utility’ in the descriptive sense. I am using it in the normative sense. These are explained in the first paragraph of the Wikipedia page for ‘utility’.
As I explained in the opening paragraph, I’m using the word ‘utility’ to mean the goodness/desirability/value of an outcome. This is normative: if an outcome is ‘good’ then there is the implication that you ought to pursue it.
That makes me even more confused. Are you arguing that we ought to (1) assign some “goodness” values to outcomes, and then (2) maximize the geometric expectation of “goodness” resulting from our actions? But then wouldn’t any argument for (2) depend on the details of how (1) is done? For example, if “goodnesses” were logarithmic in the first place, then wouldn’t you want to use arithmetic averaging? Is there some description of how we should assign goodnesses in (1) without the kind of firm ground that VNM gives?
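To make the contrast concrete, here is a minimal sketch (in Python, with made-up probabilities and “goodness” values) of the identity behind the “logarithmic in the first place” question: the geometric expectation of goodness is exactly the exponential of the arithmetic expectation of log-goodness.

```python
import math

# A hypothetical lottery: outcome probabilities and "goodness" values,
# made up purely for illustration.
probs = [0.5, 0.3, 0.2]
goodness = [1.0, 4.0, 9.0]

# Arithmetic expectation: sum of p_i * u_i.
arith = sum(p * u for p, u in zip(probs, goodness))

# Geometric expectation: product of u_i ** p_i.
geom = math.prod(u ** p for p, u in zip(probs, goodness))

# The geometric expectation equals exp of the arithmetic expectation of log-goodness.
geom_via_log = math.exp(sum(p * math.log(u) for p, u in zip(probs, goodness)))

print(arith)         # 3.5
print(geom)          # ~2.35
print(geom_via_log)  # same value as geom
```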
Without wishing to be facetious: how much (if any) of the post did you read? If you disagree with me, that’s fine, but I feel like I’m answering questions which I already addressed in the post!
Are you arguing that we ought to (1) assign some “goodness” values to outcomes, and then (2) maximize the geometric expectation of “goodness” resulting from our actions?
I’m not arguing that we ought to maximize the geometric expectation of “goodness” resulting from our actions. I’m exploring what it might look like if we did. In the conclusion, (and indeed, many other parts of the post) I’m pretty ambivalent.
But then wouldn’t any argument for (2) depend on the details of how (1) is done? For example, if “goodnesses” were logarithmic in the first place, then wouldn’t you want to use arithmetic averaging?
I don’t think so. I think you could have a preference ordering over ‘certain’ world states and then you are still left with choosing a method for deciding between lotteries where the outcome is uncertain. I describe this position in the section titled ‘Geometric Expectation ≠ Logarithmic Utility’.
Is there some description of how we should assign goodnesses in (1) without the kind of firm ground that VNM gives?
This is what philosophers of normative ethics do! People disagree on how exactly to do it, but that doesn’t stop them from trying! My post tries to be agnostic as to what exactly it is we care about and how we assign utility to different world states, since I’m focusing on the difference between averaging methods.
Guilty as charged—I did read your post as arguing in favor of geometric averaging, when it really wasn’t. Sorry.
The main point still seems strange to me, though. Suppose you were programming a robot to act on my behalf, and you asked me to write out some goodness values for outcomes, to program them into the robot. Then before writing out the goodnesses I’d be sure to ask you: which method would the robot use for evaluating lotteries over outcomes? Depending on that, the goodness values I’d write for you (to achieve the desired behavior from the robot) would be very different.
To me it suggests that the goodness values and the averaging method are not truly independent degrees of freedom. So it’s simpler to nail down the averaging method, to use ordinary arithmetic averaging, and then assign the goodness values. We don’t lose any ability to describe behavior (as long as it’s consistent), and we remain with only the degree of freedom that actually matters.
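As a rough illustration of the “not truly independent degrees of freedom” claim, here is a sketch (assuming strictly positive, made-up goodness values and hypothetical outcome labels): a robot that geometrically averages goodness and a robot that arithmetically averages log-goodness rank every pair of lotteries the same way.

```python
import math

# Hypothetical goodness values for three outcomes (strictly positive, made up).
goodness = {"A": 1.0, "B": 4.0, "C": 9.0}

def geometric_score(lottery):
    """Score a lottery {outcome: probability} by the geometric expectation of goodness."""
    return math.prod(goodness[o] ** p for o, p in lottery.items())

def arithmetic_log_score(lottery):
    """Score the same lottery by the arithmetic expectation of log-goodness."""
    return sum(p * math.log(goodness[o]) for o, p in lottery.items())

lottery1 = {"A": 0.5, "C": 0.5}
lottery2 = {"B": 1.0}

# Because log is monotone, the two scoring rules agree on every comparison,
# so the two robots exhibit identical choices.
assert (geometric_score(lottery1) > geometric_score(lottery2)) == \
       (arithmetic_log_score(lottery1) > arithmetic_log_score(lottery2))
```

The equivalence leans on the goodness values being strictly positive; allowing a goodness of zero is exactly where the ‘avoid at all costs’ example discussed below breaks it.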
(apologies for taking a couple of days to respond, work has been busy)
I think your robot example nicely demonstrates the difference between our intuitions. As cubefox pointed out in another comment, what representation you want to use depends on what you take as basic.
There are certain types of preferences/behaviours which cannot be expressed using arithmetic averaging. These are the ones which violate VNM, and I think violating VNM axioms isn’t totally crazy. I think it’s worth exploring these VNM-violating preferences and seeing what they look like when more fleshed out. That’s what I tried to do in this post.
If I wanted a robot that violated one of the VNM axioms, then I wouldn’t be able to describe it by ‘nailing down the averaging method to use ordinary arithmetic averaging and assigning goodness values’. For example, if there were certain states of the world which I wanted to avoid at all costs (and thus violate the continuity axiom), I could assign them zero utility and use geometric averaging. I couldn’t do this with arithmetic averaging and any finite utilities [1].
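A small numeric sketch of the ‘avoid at all costs’ point (hypothetical goodness values): under geometric averaging, any nonzero probability of the zero-goodness state drags the whole score to zero, whereas arithmetic averaging with these finite values lets a small enough risk be outweighed.

```python
import math

# Hypothetical goodness values, with 0 marking a state to avoid at all costs.
goodness = {"fine": 10.0, "great": 100.0, "catastrophe": 0.0}

def geometric_expectation(lottery):
    return math.prod(goodness[o] ** p for o, p in lottery.items())

def arithmetic_expectation(lottery):
    return sum(p * goodness[o] for o, p in lottery.items())

safe  = {"fine": 1.0}
risky = {"great": 0.999, "catastrophe": 0.001}

print(geometric_expectation(risky))   # 0.0 -- the tiny risk dominates
print(geometric_expectation(safe))    # 10.0

print(arithmetic_expectation(risky))  # 99.9 -- the tiny risk is outweighed
print(arithmetic_expectation(safe))   # 10.0
```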
A better example is Scott Garrabrant’s argument regarding abandoning the VNM axiom of independence. If I wanted to program a robot which sometimes preferred lotteries to any definite outcome, I wouldn’t be able to program the robot using arithmetic averaging over goodness values.
I think that these examples show that there is at least some independence between averaging methods and utility/goodness.
[1] (ok, I guess you could assign ‘negative infinity’ utility to those states if you wanted. But once you’re doing stuff like that, it seems to me that geometric averaging is a much more intuitive way to describe these preferences.)
For example, if there were certain states of the world which I wanted to avoid at all costs (and thus violate the continuity axiom), I could assign them zero utility and use geometric averaging. I couldn’t do this with arithmetic averaging and any finite utilities.
Well, you can’t have some states as “avoid at all costs” and others as “achieve at all costs”, because having them in the same lottery leads to nonsense, no matter what averaging you use. And allowing only one of the two seems arbitrary. So it seems cleanest to disallow both.
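A quick sketch of the “leads to nonsense” claim: putting an “avoid at all costs” state (goodness 0, or utility −∞) and an “achieve at all costs” state (goodness/utility +∞) in the same lottery gives an indeterminate value under either averaging method.

```python
import math

p, q = 0.5, 0.5  # arbitrary nonzero probabilities for the two extreme states

# Geometric averaging: 0 ** p * inf ** q is indeterminate.
print(0.0 ** p * math.inf ** q)        # nan

# Arithmetic averaging: p * (-inf) + q * (+inf) is indeterminate too.
print(p * (-math.inf) + q * math.inf)  # nan
```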
If I wanted to program a robot which sometimes preferred lotteries to any definite outcome, I wouldn’t be able to program the robot using arithmetic averaging over goodness values.
But geometric averaging wouldn’t let you do that either, or am I missing something?
Well, you can’t have some states as “avoid at all costs” and others as “achieve at all costs”, because having them in the same lottery leads to nonsense, no matter what averaging you use. And allowing only one of the two seems arbitrary. So it seems cleanest to disallow both.
Fine. But the purpose of exploring different averaging methods is to see whether they expand the richness of the kinds of behaviour we can describe. The point is that using arithmetic averaging is a choice which limits the kinds of behaviour we can get. Maybe we want to describe behaviours which can’t be described under expected utility. Having an ‘avoid at all costs’ state is one such behaviour: it finds a natural description under non-arithmetic averaging and can’t be described in more typical VNM terms.
If your position is ‘I would never want to describe normative ethics using anything other than expected utility’ then that’s fine, but some people (like me) are interested in looking at what alternatives to expected utility might be. That’s why I wrote this post. As it stands, I didn’t find geometric averaging very satisfactory (as I wrote in the post), but I think things like this are worth exploring.
But geometric averaging wouldn’t let you do that either, or am I missing something?
You are right. Geometric averaging on its own doesn’t allow violations of independence. But some other protocol for deciding over lotteries does. It’s described in more detail in the Garrabrant post linked above.
The normal VNM approach is to start with an agent whose behavior satisfies some common sense conditions: can’t be money pumped and so on.
Nitpicks: (1) the vNM theorem is arguably about preference, not choice and behavior; and (2) “can’t be money pumped” is not one of the conditions in the theorem.
There are several different representation theorems, not just the one by VNM. They differ in what they take to be basic. See the table here in section 2.2.5. As the article emphasizes, nothing can be concluded from direction of representation about what is more fundamental:
Notice that the order of construction differs between theorems: Ramsey constructs a representation of probability using utility, while von Neumann and Morgenstern begin with probabilities and construct a representation of utility. Thus, although the arrows represent a mathematical relationship of representation, they cannot represent a metaphysical relationship of grounding. The Reality Condition needs to be justified independently of any representation theorem.
E.g. you could also trivially “represent” preferences in terms of utilities by defining x ≻ y := U(x) > U(y).
This case isn’t mentioned in the table because a representation proof based on it would be too trivial to label it a “theorem” (for example, preferences are automatically transitive because utilities are represented by real numbers and the “larger than” relation on the real numbers is transitive).
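For concreteness, a toy version of this trivial direction of representation (with made-up utility values), where transitivity does indeed come for free from the ordering of the reals:

```python
# Start from a utility function and *define* the preference relation from it.
U = {"x": 2.0, "y": 1.0, "z": 3.0}

def prefers(a, b):
    """a ≻ b is defined as U(a) > U(b)."""
    return U[a] > U[b]

# Transitivity is inherited from ">" on the real numbers.
outcomes = list(U)
for a in outcomes:
    for b in outcomes:
        for c in outcomes:
            if prefers(a, b) and prefers(b, c):
                assert prefers(a, c)
```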
If we want to argue what is more fundamental, we need independent arguments; formal representation relations alone are too arbitrary.
There are indeed a few such arguments. For example, it makes both semantic and psychological sense to interpret “I prefer x to y” as “I want x more than I want y”, but it doesn’t seem possible to interpret (semantically and psychologically) plausible statements like “I want x much more than I want y” or “I want x about twice as much as I want y” in terms of preferences, or preferences and probabilities. The reason is that the latter force you to interpret utility functions as invariant under addition of arbitrary constants, which can make utility levels arbitrarily close to each other. So we can interpret preferences as being explained by relations between degrees of desire (strength of wanting), but we can’t interpret desires as being explained by preference relations, or both preferences and probabilities.
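A worked example of the point about ratios (with made-up utility values): adding a constant preserves the preference but destroys the “twice as much” comparison.

```python
# Two equally valid utility representations of the same preferences:
# U, and U2 = U + 10 (the constant 10 is chosen arbitrarily).
U  = {"x": 2.0, "y": 1.0}
U2 = {o: u + 10 for o, u in U.items()}

# The preference "x over y" survives the shift...
assert U["x"] > U["y"] and U2["x"] > U2["y"]

# ...but the ratio does not: "twice as much" is true of U (2 vs 1)
# and false of U2 (12 vs 11).
print(U["x"] / U["y"])    # 2.0
print(U2["x"] / U2["y"])  # ~1.09
```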