When someone optimizes the utility of a group of agents, all of the utilities need to be combined. Taking the sum of all utilities can create an issue where the world gets optimized according to a single agent’s utility function at the cost of the others, if that function grows fast enough.
It’s probably better to maximize U_combined = min_{agent in A} U_agent + arctan(max_{agent in A} U_agent − min_{agent in A} U_agent), where A is the set of agents. This way the top utility does not take over the whole function, since the difference can only add a finite amount of utility, but it is still taken into account (so there is still an incentive to improve the top utility once the minimal one has run into limitations).
Though this creates a problem: agents would try to divide their utility functions by some number so that their function gets more weight. It’s necessary to normalize the functions in some way, and I’m not sure how to do that.
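For concreteness, here is a minimal Python sketch of the combination rule above; the agent names and utility values are invented for illustration:

```python
import math

def combined_utility(utilities):
    """Combine per-agent utilities as min + arctan(max - min)."""
    lo = min(utilities.values())
    hi = max(utilities.values())
    # arctan is bounded by pi/2, so the best-off agent can add at most
    # a finite amount on top of the worst-off agent's utility.
    return lo + math.atan(hi - lo)

# Giving bob an extra 99 utility adds at most ~1.57 (pi/2) to the
# combined score, while giving both agents an extra 2 adds the full 2.
print(combined_utility({"alice": 1.0, "bob": 1.0}))    # 1.0
print(combined_utility({"alice": 1.0, "bob": 100.0}))  # ~2.56
print(combined_utility({"alice": 3.0, "bob": 3.0}))    # 3.0
```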
Nit: I’m not seeing how “increase” is well defined here, but I probably know what you mean anyway.
I thought we were talking about combining utility functions, but I see only one utility function here, not counting the combined one:
If I wanted to combine 2 utility functions fairly, I’d add them, but first I’d normalize them by multiplying each one by the constant that makes its sum over the set of possible outcomes equal to 1. In symbols:
U_combined(o) = U_1(o) / (\sum_{o2 in O} U_1(o2)) + U_2(o) / (\sum_{o2 in O} U_2(o2)) for all o in O, where O is the set of outcomes (world states, or more generally world histories).
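A small Python sketch of this normalization, assuming a toy finite outcome set with made-up utility values:

```python
def normalize_by_sum(u, outcomes):
    """Rescale a utility function so its values sum to 1 over `outcomes`."""
    total = sum(u(o) for o in outcomes)
    return lambda o: u(o) / total

def combine(u1, u2, outcomes):
    """Add the two sum-normalized utility functions."""
    n1 = normalize_by_sum(u1, outcomes)
    n2 = normalize_by_sum(u2, outcomes)
    return lambda o: n1(o) + n2(o)

# Toy outcome set (three world histories) with invented utilities.
outcomes = ["o1", "o2", "o3"]
u1 = lambda o: {"o1": 1.0, "o2": 2.0, "o3": 7.0}[o]      # sums to 10
u2 = lambda o: {"o1": 30.0, "o2": 10.0, "o3": 10.0}[o]   # sums to 50
u_combined = combine(u1, u2, outcomes)
for o in outcomes:
    print(o, u_combined(o))   # o1: 0.7, o2: 0.4, o3: 0.9
```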
Wouldn’t that break if the sum (or integral) of an agent’s utility function over the world-state space was negative? Normalization would reverse that agent’s utility.
The situation becomes even worse if the sum over all outcomes is zero.
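A quick numeric illustration of the first gotcha, with made-up numbers:

```python
# An agent that prefers o1 to o2, but whose utilities sum to -2.
u = {"o1": 3.0, "o2": -5.0}
total = sum(u.values())                        # -2.0
normalized = {o: v / total for o, v in u.items()}
print(normalized)                              # {'o1': -1.5, 'o2': 2.5}
# Dividing by the negative sum flips the ordering: o2 now scores
# higher than o1, so normalization reversed the agent's preferences.
```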
Good catch.
However, suppose you and I have the seed of a super-intelligence in front of us, waiting only for us to specify a utility function and press the “start” button. If each of us can specify what we want for the world in the form of a utility function, then it would prove easy for us to work around the first of the two gotchas you point out.
As for the second gotcha, if we were at all pressed for time, I’d go ahead with my normalization method on the theory that the probability of the sum’s turning out to be exactly zero is very low.
I am interested, however, in hearing from readers who are better at math than I am: how can the normalization method be improved to remove the two gotchas?
ADDED. What I wrote so far in this comment fails to get at the heart of the matter. The purpose of a utility function is to encode preferences. Restricting our discourse to utility functions such that for every o in O, U(o) is a real number greater than zero and less than one does not restrict the kinds of preferences that can be encoded. And when we do that, every utility function in our universe of discourse can be normalized using the method already given—free from the two gotchas you pointed out. (In other words, instead of describing a gotcha-free method for normalizing arbitrary utility functions, I propose that we simply avoid defining certain utility functions that might trigger one of the gotchas.)
Specifically, if o_worst is the worst outcome according to the agent under discussion and o_best is its best outcome, set U(o_worst)=0, U(o_best)=1 and for every other outcome o, set U(o) = p where p is the probability for which the agent is indifferent between o and the lottery [p, o_best; 1-p, o_worst].
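If the agent’s utilities are already given as numbers, this rescaling amounts to an affine transformation onto [0, 1]; here is a rough Python sketch under that assumption (the lottery-indifference elicitation itself is not modeled, and the outcome labels and values are invented):

```python
def rescale_to_unit_interval(u, outcomes):
    """Affinely rescale utilities so the worst outcome maps to 0 and the
    best to 1. For a vNM-style agent, the rescaled U(o) equals the
    probability p at which the agent is indifferent between o and the
    lottery [p, o_best; 1-p, o_worst]. Assumes the agent is not
    indifferent between all outcomes (otherwise hi == lo below)."""
    lo = min(u(o) for o in outcomes)
    hi = max(u(o) for o in outcomes)
    return lambda o: (u(o) - lo) / (hi - lo)

# Invented example: three outcomes with arbitrary numeric utilities.
outcomes = ["o1", "o2", "o3"]
u = lambda o: {"o1": -4.0, "o2": 0.0, "o3": 6.0}[o]
u01 = rescale_to_unit_interval(u, outcomes)
print([u01(o) for o in outcomes])   # [0.0, 0.4, 1.0]
```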
That’s a nice workaround!
https://en.wikipedia.org/wiki/Prioritarianism and https://www.lesswrong.com/posts/hTMFt3h7QqA2qecn7/utilitarianism-meets-egalitarianism may be relevant here. Are you familiar with https://www.lesswrong.com/s/hCt6GL4SXX6ezkcJn as well? I’m curious how you’d compare the justification for your use of arctan to the justifications in those articles.
Thank you for those articles!
It seems that the “Unifying Bargaining” sequence relies on being able to denominate utility in units that can actually be obtained and offered in trade to any party, so that a party would get worse results by claiming to have a different utility function (one with the same preference ordering but different values).
In humans, and perhaps all complex agents, utility is an unmeasurable abstraction about multidimensional preferences and goals. It can’t be observed, let alone summed or calculated. It CAN be modeled and estimated, and it’s fair to talk about aggregation functions of one’s estimates of utilities, or about aggregation of self-reported estimates or indications from others.
It is your own modeling choice to dislike outcomes where some participants get outsized influence via larger utility swings. How you normalize is a preference of yours, not an objective fact about the world.
Yes, this is indeed a preference of mine (and of other people as well), and I’m attempting to find a way to combine utilities that is as good as possible according to my preferences and other people’s (so that it can be incorporated into an AGI, for example).