Suppose further that you can quantify each item on that list with a function from world-histories to real numbers, and you want to optimize for each function, all other things being equal.
If fairness is one of my values, it can’t necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)
What the theorem says is that if you really care about the values on that list, then there are linear aggregations that you should start optimizing for.
I think before you make this conclusion, you have to say something about how one is supposed to pick the weights. The theorem itself seems to suggest that I can pick the weights by choosing a Pareto-optimal policy/outcome that’s mutually acceptable to all of my values, and then work backwards to a set of weights that would generate a utility function (or more generally, a way to pick such weights based on a coin-flip) that would then end up optimizing for the same outcome. But in this case, it seems to me that all of the real “optimizing” was already done prior to the time you form the linear aggregation.
(ETA: I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier. If yes, then you have to compute the Pareto frontier before you choose the weights, in which case you’ve already “optimized” prior to choosing the weights, since computing the Pareto frontier involves optimizing against the individual values as separate utility functions.)
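To make concrete what I mean by “working backwards,” here’s a minimal sketch in Python, with entirely made-up policies and numbers: compute the Pareto frontier for two toy value functions, pick a Pareto-optimal policy by some other procedure, and only then search for weights whose linear aggregation that policy maximizes. Note that the frontier gets computed before any weights exist.

```python
# Toy sketch (hypothetical policies and numbers): find the Pareto frontier for
# two value functions, pick a mutually acceptable Pareto-optimal policy, then
# search for weights whose linear aggregation that policy maximizes.

policies = {
    "work":    (10.0, 1.0),   # (v1 score, v2 score) of the resulting world-history
    "play":    (1.0, 10.0),
    "balance": (7.0, 7.0),
    "idle":    (2.0, 2.0),
}

def pareto_frontier(options):
    """Policies not weakly dominated on both values by a distinct policy."""
    return {
        name: scores
        for name, scores in options.items()
        if not any(
            other[0] >= scores[0] and other[1] >= scores[1] and other != scores
            for other in options.values()
        )
    }

def supporting_weights(options, chosen, grid=100):
    """Search for (w1, w2) such that `chosen` maximizes w1*v1 + w2*v2."""
    for i in range(grid + 1):
        w1, w2 = i / grid, 1 - i / grid
        scores = {n: w1 * a + w2 * b for n, (a, b) in options.items()}
        if max(scores, key=scores.get) == chosen:
            return w1, w2
    return None

frontier = pareto_frontier(policies)   # already requires optimizing v1 and v2 separately
chosen = "balance"                     # picked by some other, prior procedure
print(frontier)                        # "work", "play", "balance" survive
print(supporting_weights(policies, chosen))  # some (w1, w2) that rationalizes "balance"
```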
Also, even if my values can theoretically be represented by functions from world-histories to real numbers, I can’t obtain encodings of such functions since I don’t have introspective access to my values, and therefore I can’t compute linear aggregations of them. So I don’t know how I can start optimizing for a linear aggregation of my values, even if I did have a reasonable way to derive the weights.
If you’re capable of making precommitments and we don’t worry about computational difficulties [...]
I’m glad you made these assumptions explicit, but shouldn’t there be a similar caveat when you make the final conclusions? The way I see it, I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamical consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision making machinery that evolution gave me). Sticking with B for now doesn’t seem unreasonable to me (especially given the other difficulties I mentioned with trying to implement A).
(I’ve skipped some of the supporting arguments in this comment since I already wrote about them under the recent Harsanyi post. Let me know if you want me to clarify anything.)
I think before you make this conclusion, you have to say something about how one is supposed to pick the weights.
I agree with this concern. The theorem is basically saying that, given any sensible aggregation rule, there is a linear aggregation rule that produces the same decisions. However, it assumes that we already have a prior; the linear coefficients are allowed to depend on what we think the world actually looks like, rather than being a pure representation of values. I think people, especially those who don’t understand the proof of this theorem, are likely to misinterpret it.
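As a toy illustration of that prior-dependence (made-up numbers, not part of the theorem itself): the range of weights that rationalizes a fixed choice shifts when the prior over states shifts.

```python
import numpy as np

# Made-up illustration: two actions, two value functions v1 and v2, and two
# possible states of the world.  Each entry gives that value's payoff per state.
v1 = {"a": np.array([4.0, 0.0]), "b": np.array([2.0, 2.0])}
v2 = {"a": np.array([0.0, 3.0]), "b": np.array([1.0, 1.0])}

def weights_that_pick(action, prior, grid=1000):
    """Range of w1 for which `action` maximizes expected w1*v1 + (1-w1)*v2 under `prior`."""
    ev1 = {k: float(prior @ x) for k, x in v1.items()}
    ev2 = {k: float(prior @ x) for k, x in v2.items()}
    good = []
    for i in range(grid + 1):
        w1 = i / grid
        score = {k: w1 * ev1[k] + (1 - w1) * ev2[k] for k in v1}
        if max(score, key=score.get) == action:
            good.append(w1)
    return (min(good), max(good)) if good else None

# The same action "a" is rationalized by different weight ranges under different priors.
print(weights_that_pick("a", np.array([0.9, 0.1])))  # roughly (0.30, 1.0)
print(weights_that_pick("a", np.array([0.1, 0.9])))  # roughly (0.0, 0.52)
```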
I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier.
Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one’s values are.
I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamical consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision making machinery that evolution gave me). Sticking with B for now doesn’t seem unreasonable to me
Sticking with B by default sounds reasonable except when we know something about the ways in which B falls short of optimality and the ways in which B takes dynamical consistency issues into account. E.g., I can pretty confidently recommend that minor philanthropists donate all their charity to the single best cause, modulo a number of important caveats and exceptions. It’s natural to feel that one should diversify their (altruistic, outcome-oriented) giving; but once one sees the theoretical justification for single-cause giving under ideal conditions, and explains away those intuitions as coming from motives one doesn’t endorse and heuristics that work okay in the EEA but not on this particular problem, I think one has good reason to go with choice A.
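To sketch that theoretical justification under deliberately strong toy assumptions (a small donor whose expected impact is linear in the dollars given to each cause; all numbers hypothetical), expected-value maximization puts the entire budget on the single cause with the highest marginal expected impact:

```python
# Toy sketch of the single-cause argument: assume a small donor whose expected
# impact is linear in the dollars given to each charity (hypothetical numbers).

budget = 1000.0

# Expected good done per marginal dollar, for made-up candidate charities.
marginal_impact = {"charity_x": 0.8, "charity_y": 1.1, "charity_z": 0.9}

def expected_impact(allocation):
    """Total expected impact of a dollar allocation, under the linearity assumption."""
    return sum(marginal_impact[c] * dollars for c, dollars in allocation.items())

# Under linearity the optimum is a corner solution: the whole budget to the best cause.
best = max(marginal_impact, key=marginal_impact.get)
all_in = {best: budget}
diversified = {c: budget / len(marginal_impact) for c in marginal_impact}

print(expected_impact(all_in))       # 1100.0
print(expected_impact(diversified))  # ~933.3
```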
Even then, the philanthropist still has to decide which cause to donate to. It’s possible that once they believe they should construct a utility function for a particular domain, they’ll be able to use other tools to come up with a utility function they’re happy with. But this theorem doesn’t guarantee that.
I tried not to claim too much in the OP. I hope no one reads this post and makes a really bad decision because of an overly-naive expected-utility calculation.
Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one’s values are.
Do you mean “figuring out what one’s weights are”? Assuming yes, I think my point was a bit stronger than that, namely that there’s not necessarily a reason to figure out the weights at all, if in order to figure out the weights, you actually have to first come to a decision using some other procedure.
Sticking with B by default sounds reasonable except when we know something about the ways in which B falls short of optimality and the ways in which B takes dynamical consistency issues into account.
I think there are probably local Pareto improvements that we can make to B, but that’s very different from switching to A (which is what your OP was arguing for).
E.g., I can pretty confidently recommend that minor philanthropists donate all their charity to the single best cause, modulo a number of important caveats and exceptions. It’s natural to feel that one should diversify their (altruistic, outcome-oriented) giving;
I agree this seems like a reasonable improvement to B, but I’m not sure what relevance your theorem has for it. You may have to write that post you mentioned in the OP to explain.
I tried not to claim too much in the OP. I hope no one reads this post and makes a really bad decision because of an overly-naive expected-utility calculation.
Besides that, I’m concerned that many people seem convinced that VNM is rationality and are working hard to try to justify it, instead of working on a bunch of open problems that seem very important and interesting to me, one of which is what rationality actually is.
Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one’s values are.
Do you mean “figuring out what one’s weights are”?
Yes
Assuming yes, I think my point was a bit stronger than that, namely that there’s not necessarily a reason to figure out the weights at all, if in order to figure out the weights, you actually have to first come to a decision using some other procedure.
I think any disagreement we have here is subsumed by our discussion elsewhere in this thread.
I think there are probably local Pareto improvements that we can make to B, but that’s very different from switching to A (which is what your OP was arguing for).
Perhaps I will write that philanthropy post, and then we will have a concrete example to discuss.
Besides that, I’m concerned that many people seem convinced that VNM is rationality and are working hard to try to justify it, instead of working on a bunch of open problems that seem very important and interesting to me, one of which is what rationality actually is.
I appreciate your point.
ETA: Wei_Dai and I determined that part of our apparent disagreement came from the fact that an agent with a policy that happens to optimize a function does not need to use a decision algorithm that computes expected values.
If fairness is one of my values, it can’t necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)
You refer to cases such as A = “I give the last candy to Alice”, B = “I give the last candy to Bob” and you strictly prefer the lottery {50% A, 50% B} to {100% A} or {100% B}?
But remember that we’re talking about entire world-histories, not just world-states. If you take A0 = “I arbitrarily give the last candy to Alice”, A1 = “I flip a coin to decide whom to give the last candy to, and Alice wins”, etc., you can easily have A1 = B1 > A0 = B0, since A1 and A0 are different (one includes you flipping a coin, the other doesn’t). So a function from world-histories would suffice, after all.
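Here’s a tiny sketch of what I mean, with a hypothetical value function defined directly on world-histories (which record how the allocation was decided) rather than on lotteries:

```python
# Hypothetical illustration: a "fairness" value defined directly on
# world-histories, which record how the last candy was allocated.
histories = {
    "A0": {"winner": "Alice", "coin_flipped": False},
    "B0": {"winner": "Bob",   "coin_flipped": False},
    "A1": {"winner": "Alice", "coin_flipped": True},
    "B1": {"winner": "Bob",   "coin_flipped": True},
}

def fairness_value(history):
    """Assigns more value to histories in which the candy was allocated by coin flip."""
    return 1.0 if history["coin_flipped"] else 0.0

for name, history in histories.items():
    print(name, fairness_value(history))   # A1 = B1 = 1.0 > A0 = B0 = 0.0
```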
I’m pretty sure Nisan meant to define “world-histories” in a way that excludes utility functions like that, otherwise it’s hard to make sense of the convexity property that he assumes in his theorem. (Hopefully he will jump in and confirm or deny this.)
Yes, we should assume the agent has access to a source of uncertainty with respect to which the functions v_i are invariant.
In fact, let’s assume a kind of Cartesian dualism, so that the agent (and a single fair coin) are not part of the world. That way the agent can’t have preferences over its own decision procedure.
I think before you make this conclusion, you have to say something about how one is supposed to pick the weights.
I think these weights are descriptive, not prescriptive. Eliciting values is very important (and there’s some work in the decision analysis literature on that), but there isn’t much to be done theoretically, since most of the work is "how do we work around the limitations of human psychology?" rather than "how do we get the math right?".
I think these weights are descriptive, not prescriptive.
What do you mean by that? Are you saying humans already maximize expected utility using some linear aggregation of individual values, so these weights already exist? But the whole point of the OP is to convince people who are not already EU maximizers to become EU maximizers.
Are you saying humans already maximize expected utility using some linear aggregation of individual values, so these weights already exist?
I think my answer would be along the lines of "humans have preferences that could be consistently aggregated, but they are bad at consistently aggregating them due to the computational difficulties involved." For example, much of the early statistical prediction rule work fit a linear regression to a particular expert’s output on training cases, and found that the regression fit to that expert beat the expert on new cases; that is, it captured enough of their expertise but did not capture as much of their mistakes, fatigue, and off days. If you’re willing to buy that a simple algorithm based on a doctor can diagnose a disease better than that doctor, then it doesn’t seem like a big stretch to claim that a simple algorithm based on a person can satisfy that person’s values better than that person’s decisions made in real time. (In order to move from ‘diagnose this one disease’ to ‘make choices that impact my life trajectory’ you need much, much more data, and probably more sophisticated aggregation tools than linear regression, but the basic intuition should hold.)
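Here’s a minimal sketch of that kind of result on synthetic data (illustrative only; the setup and numbers are made up, not a reproduction of the original studies): a linear model fit to a simulated expert’s noisy judgments predicts the true outcomes on held-out cases better than the expert does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (illustrative only): each case has a few observable features,
# and the true outcome is a fixed linear function of them.
n_train, n_test, n_features = 200, 200, 4
true_w = np.array([0.5, -1.0, 2.0, 0.3])

def make_cases(n):
    X = rng.normal(size=(n, n_features))
    return X, X @ true_w

X_train, _ = make_cases(n_train)
X_test, y_test = make_cases(n_test)

# The simulated expert gets the relationship roughly right, but with case-by-case
# noise standing in for mistakes, fatigue, and off days.
def expert_judgment(X):
    return X @ true_w + rng.normal(scale=1.5, size=len(X))

# "Bootstrapping" the expert: regress the expert's own judgments on the features.
w_model, *_ = np.linalg.lstsq(X_train, expert_judgment(X_train), rcond=None)

model_error = np.mean((X_test @ w_model - y_test) ** 2)
expert_error = np.mean((expert_judgment(X_test) - y_test) ** 2)
print(model_error, expert_error)   # the model of the expert beats the expert here
```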
And so I think the methodology is (sort of) prescriptive: whatever you do, if it isn’t equivalent to a linear combination of your subvalues, then your aggregation procedure is introducing new subvalues, which is probably a bug.* (The ‘equivalent to’ is what makes it only ‘sort of’ prescriptive.) If the weights aren’t all positive, that’s probably also a bug (since that means one of your subvalues either has no impact on your preferences or actively counts against them, and thus it’s not functioning as a subvalue). But what should the relative weights for v_3 and v_4 be? Well, that depends on the tradeoffs that the person is willing to make (see the worked example after the footnote); it’s not something we can pin down theoretically.
*Or you erroneously identified two subvalues as distinct, when they are related and should be mapped jointly.
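To make the tradeoff point concrete with a hypothetical elicitation: if, holding everything else fixed, you would be exactly indifferent between gaining one unit of v_3 and gaining two units of v_4, then consistency requires w_3 * 1 = w_4 * 2, i.e. w_3 = 2 * w_4. The theorem guarantees that some consistent set of weights exists; only elicited tradeoffs like this one pin the ratios down.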
And so I think the methodology is (sort of) prescriptive: whatever you do, if it isn’t equivalent to a linear combination of your subvalues, then your aggregation procedure is introducing new subvalues, which is probably a bug.
I tried to argue against this in the top level comment of this thread, but may not have been very clear. I just came up with a new argument, and would be interested to know whether it makes more sense to you.