I think that, depending on what the v’s are, choosing a Pareto optimum is actually quite undesirable.
For example, let v1 be min(1000, how much food you have), and let v2 be min(1000, how much water you have). Suppose you can survive for days equal to a soft minimum of v1 and v2 (for example, 0.001 v1 + 0.001 v2 + min(v1, v2)). All else being equal, more v1 is good and more v2 is good. But maximizing a convex combination of v1 and v2 can lead to avoidable dehydration or starvation. Suppose you assign weights to v1 and v2, and are offered either 1000 units of the more valued resource, or 100 of each. Then you will pick the 1000 units of the one resource, causing starvation or dehydration after 1 day when you could have lasted over 100. If the resource offered in the 1000-unit option is instead selected at random, then any convex optimizer will die early at least half the time.
A non-convex aggregate utility function, for example the number of days survived (0.001 v1 + 0.001 v2 + min(v1, v2)), is much more sensible. However, it will not select Pareto optima: it will always take the 100 of each, even though always taking the 1000 of whichever resource is offered yields greater expected v1 and expected v2 (500 each, versus 100 each).
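A minimal Python sketch of that comparison, using the numbers above; the convex weights 0.7/0.3 are an arbitrary illustrative choice of mine, and I list both 1000-unit offers so both weightings are covered:

```python
# Food/water example from the comment above. v1, v2, and days survived are as
# defined there; the convex weights (0.7, 0.3) are an arbitrary illustration.

def v(amount):
    return min(1000, amount)  # each value caps at 1000

def days_survived(food, water):
    v1, v2 = v(food), v(water)
    return 0.001 * v1 + 0.001 * v2 + min(v1, v2)  # soft minimum

options = {"1000 food": (1000, 0), "1000 water": (0, 1000), "100 of each": (100, 100)}

a, b = 0.7, 0.3  # any convex weights; at least one is always >= 0.5
convex_score = {name: a * v(f) + b * v(w) for name, (f, w) in options.items()}
days = {name: days_survived(f, w) for name, (f, w) in options.items()}

print(max(convex_score, key=convex_score.get))  # '1000 food' (score 700 vs 300 vs 100)
print(max(days, key=days.get))                  # '100 of each' (100.2 days vs 1 day)
```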
Wha...?
I believe your Game is badly formed. This doesn’t sound at all like how Games should be modeled. Here, you don’t have two agents each trying to maximize something of their own that they value, so you can’t use those tricks.
As a result, apparently you’re not properly representing utility in this model. You’re implicitly assuming the thing to be maximized is health and life duration, without modeling it at all. In the model you built, there are only two values, food and water. The agent does not care about survival with only those two v’s. So for this agent, yes, picking one of the “1000” options really, truly, spectacularly, trivially is better. The agent just doesn’t represent your own preferences properly, that’s all.
If your agent cares at all about survival, there should be a value for survival in there too, probably conditionally dependent on how much water and food is obtained. Better yet, you seem to be implying that the amount of food and water obtained isn’t really important, only surviving longer is—strike out the food and water values, only keep a “days survived” value dependent upon food and water obtained, and then form the Game properly.
I think we agree. I am just pointing out that Pareto optimality is undesirable for some selections of “values”. For example, you might want you and everyone else to both be happy, and happiness of one without the other would be much less valuable.
I’m not sure how you would go about deciding whether Pareto optimality is desirable, given that the theorem shows you will choose Pareto optima iff you maximize some convex combination of the values.
Given some value v1 that you are risk averse with respect to, you can find a transformed value v1′ in which your utility is linear. For example, if, with the other values fixed, utility = log(v1), then take v1′ := log(v1) and use v1′ in place of v1 in your optimization. You are right that it doesn’t make sense to maximize the expected value of a function whose expected value you don’t care about, but if you are VNM-rational, then given an ordinal utility function (for which the expected value is meaningless), you can find a cardinal utility function with the same relative preference ordering whose expected value you do want to maximize.
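A minimal sketch of that substitution, with lottery numbers I made up: an agent maximizing E[v1] takes the gamble, while one maximizing E[v1′] = E[log v1] keeps the sure thing, which is the risk-averse preference the log was meant to capture.

```python
# Hypothetical lotteries over a single value v1, chosen only to illustrate the
# substitution v1' := log(v1) for a log-utility (risk-averse) agent.
from math import log

sure = [(1.0, 10)]              # 10 units of v1 for certain
gamble = [(0.5, 30), (0.5, 1)]  # 50% chance of 30, 50% chance of 1

def expect(lottery, f):
    return sum(p * f(v1) for p, v1 in lottery)

for name, lot in [("sure", sure), ("gamble", gamble)]:
    print(name, "E[v1] =", expect(lot, lambda v1: v1), "E[log v1] =", expect(lot, log))
# E[v1]: 10 vs 15.5, so the gamble wins; E[log v1]: 2.30 vs 1.70, so the sure
# thing wins, matching the risk aversion encoded by the log.
```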
I didn’t say anything about risk aversion. This is about utility functions that depend on multiple different “values” in some non-convex way. You can observe that, in my original example, if you have no water, then utility (days survived) is linear with respect to food.
Oh, I see. The problem is that if the importance of a value changes depending on how well you achieve a different value, a Pareto improvement in the expected value of each value function is not necessarily an improvement overall, even if your utility with respect to each value function is linear given any fixed values for the other value functions (e.g. U = v1*v2). That’s a good point, and I now agree; Pareto optimality with respect to the expected value of each value function is not an obviously desirable criterion. (apologies for the possibly confusing use of “value” to mean two different things)
Edit: I’m going to backtrack on that somewhat. I think it makes sense if the values are independent of one another (not the case for food and water, which are both subgoals of survival). The assumption needed for the theorem is that for all i, the utility function is linear with respect to v_i given fixed expected values of the other value functions, and does not depend on the distribution of possible values of the other value functions.
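A minimal sketch of why that last condition matters, with illustrative numbers of my own: two lotteries with identical E[v1] and E[v2] but different correlation give different expected utility when U = v1*v2, so such a U is not determined by the marginal expectations alone.

```python
# Two hypothetical lotteries over (v1, v2), written as (probability, v1, v2) triples.
# Both have E[v1] = E[v2] = 5, but E[v1 * v2] differs, so U = v1 * v2 depends on
# the joint distribution, not just on the expectations of the individual values.

correlated = [(0.5, 0, 0), (0.5, 10, 10)]      # v1 and v2 move together
anticorrelated = [(0.5, 0, 10), (0.5, 10, 0)]  # v1 and v2 move oppositely

def expect(lottery, f):
    return sum(p * f(v1, v2) for p, v1, v2 in lottery)

for name, lot in [("correlated", correlated), ("anticorrelated", anticorrelated)]:
    print(name,
          "E[v1] =", expect(lot, lambda v1, v2: v1),          # 5.0 in both cases
          "E[v2] =", expect(lot, lambda v1, v2: v2),          # 5.0 in both cases
          "E[v1*v2] =", expect(lot, lambda v1, v2: v1 * v2))  # 50.0 vs 0.0
```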
Now you’ve got me curious. I don’t see what selection of values, faithful to the agent they’re meant to model, could possibly make non-Pareto-optimal scenarios desirable. Your example (wanting both yourself and everyone else to be happy, with the happiness of either being worth much less without the other), for one, is something I’d represent like this:
Let x = my happiness, y = happiness of everyone else
To model the fact that each is worthless without the other, let:
v1 = min(x, 10y)
v2 = min(y, 10x)
Choice A: Gain 10 x, 0 y
Choice B: Gain 0 x, 10 y
Choice C: Gain 2 x, 2 y
It seems very obvious that the sole Pareto-optimal choice is the only desirable policy. Utility (taking, say, v1 + v2) is four for choice C, and zero for A and B.
This may reduce to exactly what AlexMennen said, too, I guess. I have never encountered any intuition or decision problem that couldn’t, at least in principle, be resolved into a utility function with perfect modeling accuracy, given enough time and computational resources.
I do think that everything should reduce to a single utility function. That said, this utility function is not necessarily a convex combination of separate values, such as “my happiness”, “everyone else’s happiness”, etc. It could contain more complex values such as your v1 and v2, which depend on both x and y.
In your example, let’s add a choice D: 50% of the time it’s A, 50% of the time it’s B. In terms of individual happiness, this is Pareto superior to C. It is Pareto inferior for v1 and v2, though.
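A minimal sketch of that comparison, using the v1, v2, and choices defined above and taking aggregate utility to be v1 + v2 (my reading of the parent comment):

```python
# Choices A-D from this thread, each written as a lottery of (probability, x, y).
# v1 and v2 are as defined above; aggregate utility is taken to be v1 + v2.

def values(x, y):
    return min(x, 10 * y), min(y, 10 * x)  # (v1, v2)

choices = {
    "A": [(1.0, 10, 0)],
    "B": [(1.0, 0, 10)],
    "C": [(1.0, 2, 2)],
    "D": [(0.5, 10, 0), (0.5, 0, 10)],  # 50% A, 50% B
}

for name, lottery in choices.items():
    ex = sum(p * x for p, x, y in lottery)
    ey = sum(p * y for p, x, y in lottery)
    ev1 = sum(p * values(x, y)[0] for p, x, y in lottery)
    ev2 = sum(p * values(x, y)[1] for p, x, y in lottery)
    print(name, "E[x] =", ex, "E[y] =", ey,
          "E[v1] =", ev1, "E[v2] =", ev2, "E[utility] =", ev1 + ev2)
# D beats C on E[x] and E[y] (5, 5 vs 2, 2) but loses on E[v1] and E[v2] (0, 0 vs 2, 2).
```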
EDIT: For an example of what I’m criticizing: Nisan claims that this theorem presents a difficulty for avoiding the repugnant conclusion if your desiderata are total and average happiness. If v1 = total happiness and v2 = average happiness, and Pareto optimality is desirable, then it follows that utility is a*v1 + b*v2. From this utility function, some degenerate behavior (blissful solipsist or repugnant conclusion) follows. However, there is nothing that says that Pareto optimality in v1 and v2 is desirable. You might pick a non-linear utility function of total and average happiness, for example atan(average happiness) + atan(total happiness). Such a utility function will sometimes pick policies that are Pareto inferior with respect to v1 and v2.
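A minimal sketch with population numbers I picked purely for illustration: the atan-based utility prefers a certain middling world to a lottery that is Pareto superior in expected total and expected average happiness.

```python
# Each policy is a lottery over (total happiness, average happiness) pairs; the
# specific worlds are hypothetical numbers chosen only for illustration.
from math import atan

certain = [(1.0, (5000, 50))]      # e.g. 100 people at happiness 50
lottery = [(0.5, (10**6, 0.001)),  # huge population, each barely happy
           (0.5, (1000, 1000))]    # a single blissful solipsist

def expect(policy, f):
    return sum(p * f(total, avg) for p, (total, avg) in policy)

for name, pol in [("certain", certain), ("lottery", lottery)]:
    print(name,
          "E[total] =", expect(pol, lambda t, a: t),
          "E[average] =", expect(pol, lambda t, a: a),
          "E[U] =", expect(pol, lambda t, a: atan(t) + atan(a)))
# The lottery is Pareto superior in E[total] and E[average] (500500 and 500.0005
# vs 5000 and 50) but has lower E[U] (about 2.36 vs 3.12), so the atan utility
# picks the option that is Pareto inferior with respect to v1 and v2.
```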
This example doesn’t satisfy the hypotheses of the theorem because you wouldn’t want to optimize for v1 if your water was held fixed. Presumably, if you have 3 units of water and no food, you’d prefer 3 units of food to a 50% chance of 7 units of food, even though the latter leads to a higher expectation of v1.
You would if you could survive for v1*v2 days.
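A quick check of that claim, assuming survival for v1*v2 days with water held certain at 3 units:

```python
# Survival for v1 * v2 days, with v2 (water) held certain at 3 units.
water = 3
sure_food = [(1.0, 3)]              # 3 units of food for certain
gamble_food = [(0.5, 7), (0.5, 0)]  # 50% chance of 7 units of food

def expected_days(food_lottery):
    return sum(p * food * water for p, food in food_lottery)

print(expected_days(sure_food))    # 9.0 days
print(expected_days(gamble_food))  # 10.5 days, so the gamble is preferred
```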
Ah, okay. In that case, if you’re faced with a number of choices that offer varying expectations of v1 but all offer a certainty of, say, 3 units of water, then you’ll want to optimize for v1. But if the choices merely have the same expectation of v2, then you won’t be optimizing for v1. So the theorem doesn’t apply, because the agent doesn’t optimize for each value ceteris paribus in the strong sense described in this footnote.
Ok, this is correct. I hadn’t understood the preconditions well enough. It seems that now the important question is whether things people intuitively think of as different values (my happiness, total happiness, average happiness) satisfy this condition.
Admittedly, I’m pretty sure they don’t.