Ah, I see. You’re assuming agents have bounded utility. Well, in that case, yes, there is a canonical way to compare utilities. However, that by itself doesn’t justify adopting that particular way of comparing utilities. Suppose you have two agents, A and B, with identical preferences except that agent A strongly prefers there to be an odd number of stars in the Milky Way. As long as effecting that desire is impractical, A and B will exhibit the same preferences; but normalizing their utilities to fit the range (-1,1) will mean that you treat A as a utility monster.
Is bounded utility truly necessary to normalize it? So long as the utility function never actually returns infinity in practice, normalization will work. What would a world state with infinite utility look like, anyway, and would it be reachable from any world state with finite utility? Reductionism implies that a single physical change would have to cause the discontinuous jump in utility from finite to infinite, and that seems to break the utility function itself. Another way to look at it is that the utility function is unbounded because it depends on the world state; if the world state were allowed to be infinite, then an infinite utility could result. However, we are fairly certain that we will only ever have access to a finite amount of energy and matter in this universe. If that turns out not to be true, then I imagine utilitarianism will cease to be useful as a result.
I’m failing to understand your reasoning about treating A as a utility monster (normalizing would make its utilities slightly lower than B’s for the same things, right?). I suppose I don’t really see this as a problem, though. If “odd number of stars in the Milky Way” has utility 1 for A, then that means A really, really wants an odd number of stars in the Milky Way, at the expense of everything else. All other things being equal, you might think it wise to split an ice cream cone evenly between A and B, but B will be happy with half an ice cream cone, while A will be happy with half an ice cream cone except for the nagging desire for an odd number of stars in the galaxy. If you’ve ever tried to enjoy an ice cream cone while stressed out, you may understand the feeling. If nothing can be done to assuage A’s burning desire, which ruins the utility A gets from other things, then why not give more of those things to B? If, instead, you meant that if A values odd stars with utility 1 we should pursue that over all of B’s goals, then I don’t think that follows. If it’s just A and B, the fair thing would be to spend half the available resources on confirming an odd number of stars (or destroying one star) and the other half on B’s highest preference.
I think calibrating utility functions by their extreme values is weird because outcomes of extreme utility are exotic and don’t occur in practice. If one really wants to compare decision-theoretic utilities between people, perhaps a better approach is choosing some basket of familiar outcomes to calibrate on. This would be interesting to see and I’m not sure if anyone has thought about that approach.
I thought it was similarly weird to allow any agent to, for instance, obtain 3^^^3 utilons for some trivially satisfiable desire. Isn’t that essentially what allows the utility monster in the first place? I see existential risk and the happiness of future humans as similar problems: if existential risk is incredibly negative, then we should do nothing but alleviate existential risk; if the happiness of future humans is so incredibly positive, then we should become future-human-happiness maximizers (and by extension each of those future humans should also become a future-human-happiness maximizer).
The market has done a fairly good job of assigning costs to common outcomes. We can compare outcomes by what people are willing to pay for them (or pay to avoid them), assuming they have the same economic means at their disposal.
Another idea I have had is to use instant run-off voting over world states. Every world state is ranked according to each agent’s utility function, and then the first world state to achieve a majority (more than 50% of the votes) in the run-off process is the most ethical world state.
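A rough sketch of the mechanism I have in mind (the world states and ballots below are made up purely for illustration):

```python
def instant_runoff(ballots):
    """Pick a world state by instant run-off voting.

    Each ballot is a list ranking every candidate world state, most preferred
    first. States with the fewest first-place votes are eliminated until one
    state holds a strict majority of the ballots."""
    remaining = {state for ballot in ballots for state in ballot}
    while True:
        # Count each ballot toward its highest-ranked surviving state.
        tallies = {state: 0 for state in remaining}
        for ballot in ballots:
            top = next(state for state in ballot if state in remaining)
            tallies[top] += 1
        leader = max(tallies, key=tallies.get)
        if tallies[leader] * 2 > len(ballots) or len(remaining) == 1:
            return leader
        remaining.remove(min(tallies, key=tallies.get))

# Hypothetical ballots: two voters put "odd number of stars" first, but once
# the least-supported state is eliminated, their votes run off and a majority
# forms around "everyone shares the ice cream".
ballots = [
    ["odd number of stars", "everyone shares the ice cream", "B gets everything"],
    ["odd number of stars", "everyone shares the ice cream", "B gets everything"],
    ["everyone shares the ice cream", "B gets everything", "odd number of stars"],
    ["everyone shares the ice cream", "B gets everything", "odd number of stars"],
    ["B gets everything", "everyone shares the ice cream", "odd number of stars"],
]
print(instant_runoff(ballots))  # -> "everyone shares the ice cream"
```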
Bounded utility and infinite utility are different things. A utility function u from outcomes to real numbers is bounded if there is a number M such that for every outcome x, we have |u(x)| < M.
I was confused, thanks. There are two ways that I can imagine having a bounded utility function: either define the function so that it has a finite bound, or only define it over a finite domain. I was only thinking about the former when I wrote that comment (and not assuming its range was limited to the reals, e.g. that “infinity” was a valid utility), and so I missed the fact that the utility function could be unbounded as the result of an infinite domain.
When we talk about utility functions, we’re talking about functions that encode a rational agent’s preferences. A utility function does not represent how happy an agent is.
First of all, was I wrong in assuming that A’s high preference for an odd number of stars puts it at a disadvantage to B in normalized utility, making B the utility monster? If not, please explain how A can become a utility monster if, e.g. A’s most important preference is having an odd number of stars and B’s most important preference is happily living forever. Doesn’t a utility monster only happen if one agent’s utility for the same things is overvalued, which normalization should prevent?
What does it mean for A and B to “have identical preferences” if in fact A has an overriding preference for an odd number of stars? I think that the maximum utility (if it exists) that an agent can achieve should be normalized against the maximum utility of other agents; otherwise the immediate result is a utility monster. It’s one thing for A to have its own high utility for something; it’s quite another for A to have arbitrarily more utility than any other agent.
Also, if A’s highest preference has no chance of being an outcome then isn’t the solution to fix A’s utility function instead of favoring B’s achievable preferences? The other possibility is to do run-off voting on desired outcomes: A’s top votes are always going to be for outcomes with an odd number of stars, but when those world states lose, the votes will run off to the outcomes that are identical except for there being an even or indeterminate number of stars, at which point A’s and B’s voting preferences will be exactly the same.
Agent utility and utilitarian utility (this renormalization/combining business) are two entirely separate things. There’s no reason the former has to impact the latter; in fact, as we can see, letting it do so causes utility monsters and such.
I can’t comment further. Every way I look at it, combining preferences (utilitarianism) is utterly incoherent. Game theory/cooperation seems the only tractable path. I don’t know the context here, though...
if A’s highest preference has no chance of being an outcome then isn’t the solution to fix A’s utility function
Solution for who? A certainly doesn’t want you mucking around in its utility function, as that would cause it to not do good things in the universe (from its perspective).
If A knows that a preferred outcome is completely unobtainable and it knows that some utilitarian theorist is going to discount its preferences with regard to another agent, isn’t it rational to adjust its utility function? Perhaps it’s not; striving for unobtainable goals is somehow a human trait.
In pathological cases like that, sure, you can blackmail it into adjusting its post-op utility function. But only if it became convinced that that gave it a higher chance of getting the things it currently wants.
A lot of those pathological cases go away with reflectively consistent decision theories, but perhaps not that one. Don’t feel like working it out.
Ah, you’re right. B would be the utility monster. Not because A’s normalized utilities are lower, but because the intervals between them are shorter. I could go into more detail in a top-level Discussion post, but I think we’re basically in agreement here.
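Roughly what I mean by the intervals being shorter, as a toy sketch (the outcomes and numbers are made up for illustration):

```python
# Toy sketch: normalize each agent's utilities into (-1, 1) by dividing by its
# largest absolute utility, then sum across agents. The point is only that A's
# unreachable top outcome ("odd number of stars") compresses the intervals
# between A's utilities over the outcomes that are actually on the table.

def normalize(utilities):
    """Scale a dict of outcome -> utility by the largest absolute utility."""
    scale = max(abs(u) for u in utilities.values())
    return {outcome: u / scale for outcome, u in utilities.items()}

a = {"ice cream to A": 10, "ice cream to B": 2, "odd number of stars": 100}
b = {"ice cream to A": 2, "ice cream to B": 10}

a_norm, b_norm = normalize(a), normalize(b)
for outcome in ("ice cream to A", "ice cream to B"):
    print(outcome, round(a_norm[outcome] + b_norm[outcome], 2))
# ice cream to A -> 0.1  + 0.2 = 0.3
# ice cream to B -> 0.02 + 1.0 = 1.02
# After normalization the summed utilities are dominated by B's preferences
# over the reachable outcomes, which is the sense in which B ends up as the
# utility monster.
```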
Also, if A’s highest preference has no chance of being an outcome then isn’t the solution to fix A’s utility function instead of favoring B’s achievable preferences?
Well, now you’re abandoning the program of normalizing utilities and averaging them, the inadequacy of which program this thought experiment was meant to demonstrate.
Is bounded utility truly necessary to normalize it? So long as the utility function never actually returns infinity in practice, normalization will work.
Huh?
Suppose my utility function is unbounded and linear in kittens (for any finite number of kittens I am aware of, that number is the output of my utility function). How do you normalize this utility to [-1,1] (or any other interval) while preserving the property that I’m indifferent between 1 kitten and a 1/N chance of N kittens?
Is the number of possible kittens bounded? That’s the point I was missing earlier.
If the number of kittens is bounded by M, your maximum utility is M times the constant utility of a kitten (u_max = M * u_kitten). Normalizing so that u_max is at most 1 therefore just means scaling everything down until u_kitten is bounded by 1/M, and a uniform rescaling like that preserves your indifference between 1 kitten and a 1/N chance of N kittens.
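A quick check of that as a sketch (the values of M and N are arbitrary made-up numbers):

```python
import math

M = 10**6              # hypothetical bound on the number of possible kittens

def u(kittens):        # the unbounded, linear-in-kittens utility from above
    return kittens

def u_norm(kittens):   # normalized into [-1, 1] by dividing by the bound M
    return u(kittens) / M

N = 1000
# Indifference between 1 kitten and a 1/N chance of N kittens...
assert math.isclose(u(1), (1 / N) * u(N))
# ...survives the normalization, because dividing by M is a positive rescaling
# and so preserves every expected-utility comparison (for N <= M).
assert math.isclose(u_norm(1), (1 / N) * u_norm(N))
```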
In future, consider expressing these arguments in terms of ponies. Why make a point using hypothetical utility functions, when you can make the same point by talking about what we really value?
I can think of an infinite utility scenario. Say the AI figures out a way to run arbitrarily powerful computations in constant time. Say its utility function is over the survival and happiness of humans. Say it runs an infinite loop (in constant time) consisting of a formal system containing implementations of human minds, which it can prove will have some minimum happiness, forever. Thus, it can make predictions about its utility a thousand years from now just as accurately as ones about a billion years from now, or n years from now, where n is any finite number. Summing the future utility of the choice to turn on the computer, from zero to infinity, would give an infinite result. Contrived, I know, but the point stands.
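Writing that last sum out (with u_min as a symbol I’m introducing for the provable minimum per-year happiness): any fixed positive per-year contribution gives a divergent undiscounted total.

```latex
% Undiscounted total utility of turning the computer on, where u_min > 0 is
% the provable minimum per-year happiness:
\sum_{t=0}^{\infty} u_{\min} = \infty
```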