T does not satisfy axiom 2 because Alice, Bob, and Charlie all prefer the gamble .5X+.5Y over W, but T is indifferent between .5X+.5Y and W. As I said, axiom 2 includes decisions under uncertainty. The VNM axioms don’t even make a distinction between known outcomes and gambles, so I assumed that would be understood as the default.
Ah! I was interpreting “choice” as “outcome,” rather than “probabilistic combination of outcomes,” and with the latter interpretation axiom 2 becomes much stronger.
I still don’t think it’s strong enough, though: it appears to me that U still satisfies axiom 2, despite not being a linear combination of the utilities. As well, if I add Diana, who has utility (0,0,0,1), then T appears to serve as a social welfare function for Alice, Bob, Charlie, and Diana.
I should note that I suspect that’s a general failure mode: take any utility function, and add an agent to the pool who has that utility function. That agent is now a candidate for the social welfare function, as it now satisfies the first two axioms and might satisfy the third. (Alternatively, appoint any agent already in the pool as the social welfare function; the first two axioms will be satisfied, and the third will be unchanged.)
That is correct. But in a case like that, the aggregate utility function is a linear combination of the original utility functions where all but one of the coefficients are 0. Being a linear combination of utility functions is not a strong enough requirement to rule out all bad aggregations.
First, thanks for your patience.
Conclusion: I don’t agree with Harsanyi’s claim that the linear combination of utility functions is unique up to linear transformations. I agree it is unique up to affine transformations, and the discrepancy between my statement and his is explained by his comment “on the understanding that the zero point of the social welfare function is appropriately chosen.” (Why he didn’t explicitly generalize to affine transformations is beyond me.)
I don’t think the claim “the utility function can be expressed as a linear combination of the individual utility functions” is particularly meaningful, because it just means that the aggregated utility function must exist in the space spanned by the individual utility functions. I’d restate it as:
If the aggregator introduces new values not shared by humans, it is willing to trade human values to get them, and thus is not a friendly aggregator.
(Because, as per VNM, all values are comparable.) Also, note that this might not be a necessary condition for friendliness, but it is a necessary condition for axiom 2-ness.
Notes:
I’ve been representing the utilities as vectors, and it seems like moving to linear algebra will make this discussion much cleaner.
Represent each individual’s utility function as a row vector. We can then stack the individual utilities into a matrix P=[A;B;C].
In order to make a counterexample, we need a row vector S which 1) is linearly independent of P, that is, rank[P;S] =/= rank[P]. Note that if P has rank equal to the number of outcomes, this is impossible; every utility function can then be expressed as a linear combination. In our particular example, the rank of P is 3 and there are 4 outcomes, so we can take S from null[P]: S=[-1,0,0,0], and we can confirm that rank[P;S]=4. (Note that for this numerical example, S is equivalent to an affinely transformed C, but I’m not sure if this is general.)
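For concreteness, here is a small numpy sketch of that rank/null-space check. The utility rows below are placeholders rather than the actual Alice/Bob/Charlie numbers from earlier in the thread; only the shape (3 agents, 4 outcomes) matches the example.

```python
import numpy as np

# Placeholder utility rows: NOT the thread's Alice/Bob/Charlie values,
# just something with the same shape (3 agents, 4 outcomes) and rank 3.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

def null_space(M, tol=1e-10):
    """Columns form an orthonormal basis for {z : M @ z = 0}."""
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

S = null_space(P)[:, 0]    # a candidate aggregate taken from null(P)

print(np.linalg.matrix_rank(P))                    # 3
print(np.linalg.matrix_rank(np.vstack([P, S])))    # 4: S is independent of P's rows
```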
We also need S to 2) satisfy any preferences shared by all members of P. We can see gambles as column vectors, with each element being the probability that the gamble leads to a particular outcome; all elements should be nonnegative and sum to one. We can compare gambles by subtracting them; A*x-A*y gives the amount by which A prefers x to y. Following Harsanyi, we’ll require S to share indifferences; that is, if A*(x-y)=0, then A is indifferent between x and y, and if P*(x-y) is a zero column vector, then all members of the population are indifferent.
Let z=(x-y), and note that the set of z with P*z=0 is the null space of P, which we used earlier to identify a candidate S, because we knew incorporating one of the vectors of the null space would increase the rank. We need S*z=0 for S to be indifferent whenever P is indifferent; this requires that the null space of P have at least two dimensions. (So three independent agents aggregated over four outcomes isn’t enough!)
We also need the elements of z to sum to zero for it to count as a comparison between gambles, which is equivalent to [1,1,1,1]*z=0. If we get lucky this holds anyway, but we’re not guaranteed two different gambles that all members of the population are indifferent between. If the null space of P has at least three dimensions, then such gambles are guaranteed to exist, because we can toss the ones vector in as another row and ensure that all the vectors returned by null sum to 0.
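Here is the same trick in numpy, with made-up numbers: two hypothetical agents over five outcomes, so null(P) is 3-dimensional and a shared-indifference tradeoff must exist.

```python
import numpy as np

def null_space(M, tol=1e-10):
    _, s, vt = np.linalg.svd(M)
    return vt[int(np.sum(s > tol)):].T

# Two hypothetical agents over five outcomes (made-up utilities).
P = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [1.0, 4.0, 9.0, 16.0, 25.0]])
e = np.ones((1, P.shape[1]))            # the ones row [1, 1, 1, 1, 1]

Z = null_space(np.vstack([P, e]))       # each column z satisfies P @ z = 0 and e @ z = 0
z = Z[:, 0]                             # a tradeoff that every agent is indifferent about
print(np.allclose(P @ z, 0), np.allclose(e @ z, 0))   # True True
```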
So, if the null space of P is at least 2-dimensional, we can construct a social welfare function that shares indifferences, and if the null space of P is at least 3-dimensional, those indifferences are guaranteed to exist. But sharing preferences is tougher: we need every case where P*z>0 to result in S*z>0. Since z=x-y, we have the constraint that z’s elements must sum to 0, which makes things weirder, since it means we need to consider at least two elements at once.
So it’s not clear to me yet that it’s impossible to construct S which shares preferences and is linearly independent, but I also haven’t generated a constructive method to do so in general.
I don’t agree with Harsanyi’s claim that the linear combination of utility functions is unique up to linear transformations. I agree it is unique up to affine transformations, and the discrepancy between my statement and his is explained by his comment “on the understanding that the zero point of the social welfare function is appropriately chosen.” (Why he didn’t explicitly generalize to affine transformations is beyond me.)
I’m not quite sure what you mean. Are you talking about the fact that you can add a constant to a utility function without changing anything important, but that a constant is not necessarily a linear combination of the utility functions to be aggregated? For that reason, it might be best to implicitly include the constant function in any set of utility functions when talking about whether or not they are linearly independent; otherwise you can change the answer by adding a constant to one of them. Also, where did Harsanyi say that?
I don’t think the claim “the utility function can be expressed as a linear combination of the individual utility functions” is particularly meaningful, because it just means that the aggregated utility function must exist in the space spanned by the individual utility functions.
Yes, that’s what it means. I don’t see how that makes it unmeaningful.
Agreed that linear algebra is a natural way to approach this. In fact, I was thinking in similar terms. If you replace axiom 3 with the stronger assumption that the utility functions to be aggregated, along with the constant function, are linearly independent (which I think is still reasonable if there are an infinite number of outcomes, or even if there are just at least 2 more outcomes than agents), then it is fairly easy to show that sharing preferences requires the aggregation to be a linear combination of the utility functions and the constant function.
Let K represent the row vector with all 1s (a constant function). Let “pseudogamble” refer to column vectors whose elements add to 1 (Kx = 1). Note that given two pseudogambles x and y, we can find two gambles x’ and y’ such that for any agent A, A(x-y) has the same sign as A(x’-y’) by mixing the pseudogambles with another gamble. For instance, if x, y, and z are outcomes, and A(x) > A(2y-z), then A(.5x+.5z) > A(.5(2y-z)+.5z) = A(y). So the fact that I’ll be talking about pseudogambles rather than gambles is not a problem.
Anyway, if the initial utility functions and K are linearly independent, then the aggregate not being a linear combination of the initial utility functions and K would mean that the aggregate, K, and the initial utility functions all together are linearly independent. Given a linearly independent set of row vectors, it is possible to find a column vector whose product with each row vector is independently specifiable. In particular, you can find column vectors x and y such that Kx=Ky=1, Ax>Ay for all initial utility functions A, and Sx<Sy, where S is the aggregate utility function.
Edit: I just realized that if we use Harsanyi’s shared indifference criterion instead of my shared preference criterion, we don’t even need the linear independence of the initial utility functions for that argument to work. You can find x and y such that Kx=Ky=1, Ax=Ay for all initial utility functions A, and Sx=/=Sy if S is not a linear combination of the initial utility functions and K, whether or not the initial utility functions are linearly independent of each other, because if you ensure that Ax=Ay for a maximal linearly independent subset of the initial utility functions and K, then it follows that Ax=Ay for the others as well.
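To illustrate that argument concretely (with made-up utility rows, not anything from the thread): stack K, the initial utilities, and S into a matrix; if its rows are linearly independent, its pseudoinverse lets us prescribe each product independently.

```python
import numpy as np

K  = np.ones(5)                              # the constant function (ones row)
A1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])     # hypothetical initial utility functions
A2 = np.array([1.0, 4.0, 9.0, 16.0, 25.0])
S  = np.array([0.0, 1.0, 0.0, 0.0, 0.0])     # hypothetical aggregate, independent of the rest

M = np.vstack([K, A1, A2, S])
assert np.linalg.matrix_rank(M) == 4         # the four rows are linearly independent

# Prescribe Kx = Ky = 1, A1x = A2x = 1 > A1y = A2y = 0, and Sx = 0 < Sy = 1.
# Since M has full row rank, pinv(M) is a right inverse, so M @ x hits the target exactly.
x = np.linalg.pinv(M) @ np.array([1.0, 1.0, 1.0, 0.0])
y = np.linalg.pinv(M) @ np.array([1.0, 0.0, 0.0, 1.0])

print(np.isclose(K @ x, 1), np.isclose(K @ y, 1))   # True True: both are pseudogambles
print(A1 @ x > A1 @ y, A2 @ x > A2 @ y)             # True True: every initial utility prefers x
print(S @ x < S @ y)                                # True: S reverses the shared preference
```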
Immediately before the statement of Theorem I in section III.
Yes, that’s what it means. I don’t see how that makes it unmeaningful.
In my mind, there’s a meaningful difference between construction and description: yes, you can describe any waveform as an infinite series of sines and cosines, but if you actually want to build one, you probably want to use a finite series. And this result doesn’t exclude any exotic methods of constructing utility functions; you could multiply together the utilities of each individual in the pool and you’d end up with an aggregate utility function that could be expressed as a linear combination of the individual utilities (and the ones vector), with the weights changing every time you add another individual to the pool or add another outcome to be considered.
More relevant to the discussion, though, is the idea that the aggregator should not introduce novel preferences. This is an unobjectionable conclusion, I would say, but it doesn’t get us very far: if there are preferences in the pool that we want to exclude, like a utility monster’s, setting their weight to 0 is what excludes their preferences, not abandoning linear combinations. And if the system designer has preferences about “fairness” or so on, then so long as one of the agents in the pool has those preferences, the system designer can incorporate those preferences just by increasing their weight in the combination.
But in both cases, the aggregator would probably be created through another function, and then so long as it does not introduce novel preferences it can be described as a linear combination. Instead of arguing about weights, we may find it more fruitful to argue about meta-weights, even though there is a many-to-one mapping (for any particular instance) from meta-weights to weights.
Let K represent the row vector with all 1s (a constant function). Let “pseudogamble” refer to column vectors whose elements add to 1 (Kx = 1).
I’d recommend the use of “e” for the ones vector, and if the elements add to 1, it’s not clear to me why it’s a “pseudogamble” rather than a “gamble,” if one uses the terminology that column vectors where only a single element is 1 are “outcomes.”
I find preferences much clearer to think about as “tradeoffs”: that is, column vectors that add to 0, which are easily created by subtracting two gambles. The scaling is then arbitrary, and the sign of the product of a utility row vector and a tradeoff column vector unambiguously determines the direction of the preference.
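A tiny numpy illustration of that framing (numbers made up):

```python
import numpy as np

A = np.array([3.0, 1.0, 0.0, 2.0])     # a hypothetical utility row vector
x = np.array([0.5, 0.5, 0.0, 0.0])     # two gambles over four outcomes
y = np.array([0.0, 0.0, 1.0, 0.0])

z = x - y                               # a tradeoff: its elements sum to 0
print(z.sum())                          # 0.0
print(np.sign(A @ z))                   # 1.0: A prefers x to y; rescaling z never flips the sign
```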
For instance, if x, y, and z are outcomes
Alphabetical collision!
Given a linearly independent set of row vectors, it is possible to find a column vector whose product with each row vector is independently specifiable. In particular, you can find column vectors x and y such that Kx=Ky=1, Ax>Ay for all initial utility functions A, and Sx<Sy, where S is the aggregate utility function.
Agreed.
you could multiply together the utilities of each individual in the pool and you’d end up with an aggregate utility function that could be expressed as a linear combination of the individual utilities (and the ones vector), with the weights changing every time you add another individual to the pool or add another outcome to be considered.
Unlikely, unless there are at least as many agents as outcomes.
if the system designer has preferences about “fairness” or so on, then so long as one of the agents in the pool has those preferences, the system designer can incorporate those preferences just by increasing their weight in the combination.
Yes. In fact, I think something like that will be necessary. For example, suppose there is a population of two agents, each of which has a “hedon function” which specifies their agent-centric preferences. One of the agents is an egoist, so his utility function is his hedon function. The other agent is an altruist, so his utility function is the average of his and the egoist’s hedon functions. If you add up the two utility functions, you find that the egoist’s hedon function gets three times the weight of the altruist’s hedon function, which seems unfair. So we would want to give extra weight to the altruist’s utility function (you could argue that in this example you should use only the altruist’s utility function).
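Spelling out the arithmetic behind that weight claim: writing h_e and h_a for the egoist’s and altruist’s hedon functions, the egoist’s utility is h_e and the altruist’s is .5*h_e+.5*h_a, so their sum is 1.5*h_e+.5*h_a, i.e. a 3:1 ratio of weights on the two hedon functions.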
if the elements add to 1, it’s not clear to me why it’s a “pseudogamble” rather than a “gamble,” if one uses the terminology that column vectors where only a single element is 1 are “outcomes.”
It may contain negative elements.
Unlikely, unless there are at least as many agents as outcomes.
It’s unlikely that the weights of existing agents would change under either of those cases, or that the multiplication could be expressed as a weighted sum, or that the multiplication would have axiom 2-ness?
If you add up the two utility functions, you find that the egoist’s hedon function gets three times the weight of the altruist’s hedon function, which seems unfair.
Indeed. The problem is more general: I would classify the parts as “internal” and “external,” rather than agent-centric and other, because that makes it clearer that agents don’t have to positively weight each other’s utilities. If you have a ‘maltruist’ whose utility is his internal utility minus the egoist’s utility (divided by two to normalize), we might want to balance their weight and the egoist’s weight so that the agents’ internal utilities are equally represented in the aggregator.
Such meta-weight arguments, though, exist in an entirely different realm from this result, and so this result has little bearing on those arguments (which is what people are interested in when they resist the claim that social welfare functions are linear combinations of individual utility).
It may contain negative elements.
Ah! Of course.
It’s unlikely that the weights of existing agents would change under either of those cases, or that the multiplication could be expressed as a weighted sum, or that the multiplication would have axiom 2-ness?
Unlikely that the multiplication could be expressed as a weighted sum (and hence by extension, also unlikely it would obey axiom 2).
I agree in general, because we would need the left inverse of the combined linearly independent individual utilities and e, and that won’t exist. We do have freedom to affinely transform the individual utilities before taking their element-wise product, though, and that gives us an extra degree of freedom per agent. I suspect we can do it so long as the number of agents is at least half the number of outcomes.
Oh, I see what you mean. It should be possible to find some affinely transformed product that is also a linear combination if the number of agents is at least half the number of outcomes, but some arbitrary affinely transformed product is only likely to also be a linear combination if the number of agents is at least the number of outcomes.
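A small numpy sketch of the rank test behind this exchange, again with made-up numbers; in_span just checks whether a row vector lies in the span of the given rows.

```python
import numpy as np

def in_span(rows, v, tol=1e-10):
    # True if the row vector v is a linear combination of the given rows.
    base = np.vstack(rows)
    return np.linalg.matrix_rank(np.vstack([base, v]), tol=tol) == np.linalg.matrix_rank(base, tol=tol)

e  = np.ones(4)
u1 = np.array([1.0, 2.0, 3.0, 4.0])    # hypothetical individual utilities
u2 = np.array([1.0, 4.0, 9.0, 16.0])

prod = u1 * u2                          # element-wise product of the utilities
print(in_span([u1, u2, e], prod))       # False: 2 agents, 4 outcomes, so the product generally escapes the span

# Shifting each utility before multiplying adds one free parameter per agent,
# but an arbitrary shift still typically fails; the question above is whether
# some well-chosen shifts can succeed once agents are at least half the outcomes.
shifted = (u1 + 1.0) * (u2 + 1.0)
print(in_span([u1, u2, e], shifted))    # False for this arbitrary choice of shifts
```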
Right, I just noticed that. So T is out as a counterexample, and likewise U is just Charlie’s utility. Attempting to build another counterexample.