I would think that knowing evo psych is enough to realize [having an FAI find out human preferences, and then do them] is a dodgy approach at best.
I don’t see the connection, but I do care about the issue. Can you attempt to state an argument for that?
Human preferences are an imperfect abstraction. People talk about them all the time and reason usefully about them, so either an AI could do the same, or you found a counterexample to the Church-Turing thesis. “Human preferences” is a useful concept no matter where those preferences come from, so evo psych doesn’t matter.
Similarly, my left hand is an imperfect abstraction. Blood flows in, blood flows out, flakes of skin fall off, it gets randomly contaminated from the environment, and the boundaries aren’t exactly defined, but nevertheless it generally does make sense to think in terms of my left hand.
If you’re going to argue that FAI defined in terms of inferring human preferences can’t work, I hope that isn’t also going to be an argument that an AI can’t possibly use the concept of my left hand, since the latter conclusion would be absurd.
Sure. I think I should clarify first that I meant evo psych should have been sufficient to realize that human preferences are not rigorously coherent. If I tell a FAI to make me do what I want to do, its response is going to be “which you?”, as there is no Platonic me with a quickly identifiable utility function that it can optimize for me. There’s just a bunch of modules that won the evolutionary tournament of survival because they’re a good way to make grandchildren.
If I am conflicted between the emotional satisfaction of food and the emotional dissatisfaction of exercise combined with the social satisfaction of beauty, will a FAI be able to resolve that for me any more easily than I can resolve it?
If my far mode desires are rooted in my desire to have a good social identity, should the FAI choose those over my near mode desires which are rooted in my desire to survive and enjoy life?
In some sense, the problem of FAI is the problem of rigorously understanding humans, and evo psych suggests that will be a massively difficult problem. That’s what I was trying to suggest with my comment.
In some sense, the problem of FAI is the problem of rigorously understanding humans, and evo psych suggests that will be a massively difficult problem.
I think that bar is unreasonably high. If you have a conflict between enjoying eating a lot vs being skinny and beautiful, and the FAI helps you do one or the other, then you aren’t in a position to complain that it did the wrong thing. Its understanding of you doesn’t have to be more rigorous than your understanding of you.
Its understanding of you doesn’t have to be more rigorous than your understanding of you.
It does if I want it to give me results any better than I can provide for myself. I also provided the trivial example of internal conflicts; external conflicts are much more problematic. Human desire for status is possibly the source of all human striving and accomplishment. How will a FAI deal with the status conflicts that develop?
Its understanding of you doesn’t have to be more rigorous than your understanding of you.
It does if I want it to give me results any better than I can provide for myself.
No. For example, if it develops some diet drug that lets you safely enjoy eating and still stay skinny and beautiful, that might be a better result than you could provide for yourself, and it doesn’t need any special understanding of you to make that happen. It just makes the drug, makes sure you know the consequences of taking it, and offers it to you. If you choose to take it, that tells the AI more about your preferences, but there’s no profound understanding of psychology required.
I also provided the trivial example of internal conflicts; external conflicts are much more problematic.
Putting an inferior argument first is good if you want to try to get the last word, but it’s not a useful part of problem solving. You should try to find the clearest problem where solving that problem solves all the other ones.
How will a FAI deal with the status conflicts that develop?
If it can do a reasonable job of comparing utilities across people, then maximizing average utility seems to do the right thing here. Comparing utilities between arbitrary rational agents doesn’t work, but comparing utilities between humans seems to—there’s an approximate universal maximum (getting everything you want) and an approximate universal minimum (you and all your friends and relatives getting tortured to death). Status conflicts are not one of the interesting use cases. Do you have anything better?
For example, if it develops some diet drug that lets you safely enjoy eating and still stay skinny and beautiful, that might be a better result than you could provide for yourself, and it doesn’t need any special understanding of you to make that happen.
It might not need special knowledge of my psychology, but it certainly needs special knowledge of my physiology.
But notice that the original point was about human preferences. Even if it provides new technologies that dissolve internal conflicts, the question of whether or not to use the technology becomes a conflict. Remember, we live in a world where some people have strong ethical objections to vaccines. An old psychological finding is that oftentimes, giving people more options makes them worse off. If the AI notices that one of my modules enjoys sensory pleasure, offers to wirehead me, and I reject it on philosophical grounds, I could easily become consumed by regret or struggles with temptation, and wish that I never had been offered wireheading in the first place.
Putting an inferior argument first is good if you want to try to get the last word, but it’s not a useful part of problem solving. You should try to find the clearest problem where solving that problem solves all the other ones.
I put the argument of internal conflicts first because it was the clearest example, and you’ll note it obliquely refers to the argument about status. Did you really think that, if a drug were available to make everyone have perfectly sculpted bodies, one would get the same social satisfaction from that variety of beauty?
If it can do a reasonable job of comparing utilities across people, then maximizing average utility seems to do the right thing here.
I doubt it can measure utilities, as I argued two posts ago, and simple average utilitarianism is so wracked with problems I’m not even sure where to begin.
Comparing utilities between arbitrary rational agents doesn’t work, but comparing utilities between humans seems to—there’s an approximate universal maximum (getting everything you want) and an approximate universal minimum (you and all your friends and relatives getting tortured to death).
A common tactic in human interaction is to care about everything more than the other person does, and explode (or become depressed) when they don’t get their way. How should such real-life utility monsters be dealt with?
Status conflicts are not one of the interesting use cases.
I haven’t heard of people having strong ethical objections to vaccines. They have strong practical (if ill-founded) objections—they believe vaccines have dangers so extreme as to make the benefits not worth it, or they have strong heuristic objections—I think they believe health is an innate property of an undisturbed body or they believe that anyone who makes money from selling a drug can’t be trusted to tell the truth about its risks.
To my mind, an ethical objection would be a belief that people should tolerate the effects of infectious diseases for some reason such as that suffering is good in itself or that it’s better for selection to enable people to develop innate immunities.
To my mind, an ethical objection would be a belief that people should tolerate the effects of infectious diseases for some reason such as that suffering is good in itself
That wasn’t precisely the objection of Christian conservatives to the HPV vaccine (perhaps more nearly that they wanted sex to lead to suffering?), but it is fairly close.
A common tactic in human interaction is to care about everything more than the other person does, and explode (or become depressed) when they don’t get their way. How should such real-life utility monsters be dealt with?
If everyone’s inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that’s a fine outcome.
simple average utilitarianism is so wracked with problems I’m not even sure where to begin.
The problems I’m aware of have to do with creating new people. If you assume a fixed population and humans who have comparable utilities as described above, are there any problems left? Creating new people is a more interesting use case than status conflicts.
Why do you find status uninteresting?
As I said, because maximizing average utility seems to get a reasonable result in that case.
If everyone’s inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that’s a fine outcome.
That’s not the situation I’m describing; if 0 is “you and all your friends and relatives getting tortured to death” and 1 is “getting everything you want,” the utility monster is someone who puts “not getting one thing I want” at, say, .1 whereas normal people put it at .9999.
I think it can, in principle, estimate utilities from behavior.
And if humans turn out to be adaptation-executers, then utility is going to look really weird, because it’ll depend a lot on framing and behavior.
The problems I’m aware of have to do with creating new people.
How do you add two utilities together? If you can’t add, how can you average?
As I said, because maximizing average utility seems to get a reasonable result in that case.
If people dislike losses more than they like gains and status is zero-sum, does that mean the reasonable result of average utilitarianism when applied to status is that everyone must be exactly the same status?
>If you can’t add, how can you average?
You can average but not add elements of an affine space. The average between the position of the tip of my nose and the point two metres west of it is the point one metre west of it, but their sum is not a well-defined concept (you’d have to pick an origin first, and the answer will depend on it).
(More generally, you can only take linear combinations whose coefficients sum to 1 (to get another element of the affine space) or to 0 (to get a vector). Anyway, the values of two different utility functions aren’t even elements of the same affine space, so you still can’t average them. The values of the same utility function are, and the average between U1 and U2 is U3 such that you’d be indifferent between 100% probability of U3, and 50% probability of each of U1 and U2.)
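To make the nose example concrete, here is a minimal sketch. The coordinate values, the second origin, and the helper name `affine_combination` are illustrative assumptions, not anything from the comment above. It checks that a combination whose weights sum to 1 names the same physical point whichever origin you measure from, while the naive sum does not.

```python
def affine_combination(points, weights, tol=1e-9):
    """Weighted combination of points, allowed only when the weights sum to 1."""
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1 to stay inside the affine space")
    return sum(w * p for w, p in zip(weights, points))


# Positions along an east-west line, in metres, measured from an arbitrary origin
# (hypothetical numbers; east is positive).
nose = 5.0
two_metres_west = nose - 2.0

midpoint = affine_combination([nose, two_metres_west], [0.5, 0.5])

# Re-express the same two points relative to a different, equally arbitrary origin.
new_origin = 10.0
nose_shifted = nose - new_origin
west_shifted = two_metres_west - new_origin
midpoint_shifted = affine_combination([nose_shifted, west_shifted], [0.5, 0.5])

# The average picks out the same physical point under both origins...
same_midpoint = (midpoint_shifted + new_origin == midpoint)
# ...but the naive sum depends on which origin was chosen.
same_sum = ((nose_shifted + west_shifted) + new_origin == nose + two_metres_west)

print(midpoint, same_midpoint, same_sum)
```

The same restriction is what makes a 50/50 average of two values of one utility function meaningful while their bare sum is not.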
My point is that “If you can’t add, how can you average?” is not a valid argument, even though in this particular case both the premise and the conclusion happen to be correct.
My point is that “If you can’t add, how can you average?” is not a valid argument, even though in this particular case both the premise and the conclusion happen to be correct.
If I ask “If you can’t add, how can you average?” and TimFreeman responds with “by using utilities that live in affine spaces,” I then respond with “great, those utilities are useless for doing what you want to do.” When a rhetorical question has an answer, the answer needs to be material to invalidate its rhetorical function; where’s the invalidity?
I took the rhetorical question to implicitly be the syllogism ‘you can’t sum different people’s utilities, you can’t average what you can’t sum, therefore you can’t average different people’s utilities’. I just pointed out that the second premise isn’t generally true. (Both the first premise and the conclusion are true, which is why it’s a nitpick.) Did I over-interpret the rhetorical question?
The direction I took the rhetorical question was “utilities aren’t numbers, they’re mappings,” which does not require the second premise. I agree with you that the syllogism you presented is flawed.
Utility functions are families of mappings from futures to reals, which don’t live in an affine space, as you mention.
Are you sure? The only thing one really wants from a utility function is ranking, which is even weaker a requirement than affine spaces. All monotonic remappings are in the same equivalency class.
The only thing one really wants from a utility function is ranking, which is even weaker a requirement than affine spaces.
It’s practically useful to have reals rather than rankings, because that lets one determine how the function will behave for different probabilistic combinations of futures. If you already have the function fully specified over uncertain futures, then only providing a ranking is sufficient for the output.
The reason why I mentioned that it was a mapping, though, is because the output of a single utility function can be seen as an affine space. The point I was making in the ancestral posts was that while it looks like the outputs of two different utility functions play nicely, careful consideration shows that their combination destroys the mapping, which is what makes utility functions useful.
All monotonic remappings are in the same equivalency class.
I’m hearing an echo of praxeology here; specifically the notion that humans use something like stack-ranking rather than comparison of real-valued utilities to make decisions. This seems like it could be investigated neurologically ….
Huh, no. If army1987.U($1000) = shminux.U($1000) = 1, army1987.U($10,000) = 1.9, shminux.U($10,000) = 2.1, and army1987.U($100,000) = shminux.U($100,000) = 3, then I would prefer 50% probability of $1000 and 50% probability of $100,000 rather than 100% probability of $10,000, and you wouldn’t.
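The numbers in that example can be checked directly. This is a small sketch using the values given in the comment; the dictionary layout and the helper names `expected_utility` and `prefers_gamble` are assumptions made for illustration. Both utility functions rank the three sure outcomes identically, yet they disagree about the gamble, which is why a bare ranking is not enough once lotteries are involved.

```python
def expected_utility(utility, lottery):
    """Probability-weighted average of one agent's utilities over a lottery."""
    return sum(prob * utility[outcome] for outcome, prob in lottery.items())


# Utilities of dollar amounts, taken from the example above.
army1987 = {1_000: 1.0, 10_000: 1.9, 100_000: 3.0}
shminux = {1_000: 1.0, 10_000: 2.1, 100_000: 3.0}

gamble = {1_000: 0.5, 100_000: 0.5}   # 50% $1000, 50% $100,000
sure_thing = {10_000: 1.0}            # 100% $10,000


def prefers_gamble(utility):
    return expected_utility(utility, gamble) > expected_utility(utility, sure_thing)


# Both agents order the sure outcomes the same way...
same_ranking = sorted(army1987, key=army1987.get) == sorted(shminux, key=shminux.get)

# ...but only one of them takes the gamble.
print(same_ranking, prefers_gamble(army1987), prefers_gamble(shminux))
```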
Using an interval scale? I don’t have anything to contribute to the question of interpersonal utility comparison, but the average of two values from the same agent’s utility function is easy enough, while addition is still undefined.
If everyone’s inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that’s a fine outcome.
That’s not the situation I’m describing; if 0 is “you and all your friends and relatives getting tortured to death” and 1 is “getting everything you want,” the utility monster is someone who puts “not getting one thing I want” at, say, .1 whereas normal people put it at .9999.
You have failed to disagree with me. My proposal exactly fits your alleged counterexample.
Suppose Alice is a utility monster where:
U(Alice, torture of everybody) = 0
U(Alice, everything) = 1
U(Alice, no cookie) = 0.1
U(Alice, Alice dies) = 0.05
And Bob is normal, except he doesn’t like Alice:
U(Bob, torture of everybody) = 0
U(Bob, everything) = 1
U(Bob, Alice lives, no cookie) = 0.8
U(Bob, Alice dies, no cookie) = 0.9
If the FAI has a cookie it can give to Bob or Alice, it will give it to Alice, since U(cookie to Bob) = U(Bob, everything) + U(Alice, everything but a cookie) = 1 + 0.1 = 1.1 < U(cookie to Alice) = U(Bob, everything but a cookie) + U(Alice, everything) = 0.8 + 1 = 1.8. Thus Alice gets her intended reward for being a utility monster.
However, if there are no cookies available and the FAI can kill Alice, it will do so for the benefit of Bob, since U(Bob, Alice lives, no cookie) + U(Alice, Alice lives, no cookie) = 0.8 + 0.1 = 0.9 < U(Bob, Alice dies, no cookie) + U(Alice, Alice dies) = 0.9 + 0.05 = 0.95. The basic problem is that since Alice had the cookie fixation, that ate up so much of her utility range that her desire to live in the absence of the cookie was outweighed by Bob finding her irritating.
Another problem with Alice’s utility is that it supports the FAI doing lotteries that Alice would apparently prefer but a normal person would not. For example, assuming the outcome for Bob does not change, the FAI should prefer 50% Alice dies + 50% Alice gets a cookie (adds to 0.525) over 100% Alice lives without a cookie (which is 0.1). This is a different issue from interpersonal utility comparison.
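For anyone who wants to rerun the Alice and Bob arithmetic, here is a minimal sketch. The utility values are the ones stated above; the dictionary keys and variable names are made up for illustration, and summing the two normalized utilities is simply the scheme under discussion, not a claim about how an actual FAI would work.

```python
import math

# Normalized utilities as given in the example (0 = torture of everybody, 1 = everything).
alice = {"torture": 0.0, "everything": 1.0, "no_cookie": 0.1, "dies": 0.05}
bob = {
    "torture": 0.0,
    "everything": 1.0,
    "alice_lives_no_cookie": 0.8,  # also used above for "everything but a cookie"
    "alice_dies_no_cookie": 0.9,
}

# One cookie to hand out: summed utility of each allocation.
cookie_to_bob = bob["everything"] + alice["no_cookie"]
cookie_to_alice = bob["alice_lives_no_cookie"] + alice["everything"]

# No cookies available: leave Alice alone, or kill her.
alice_lives = bob["alice_lives_no_cookie"] + alice["no_cookie"]
alice_dies = bob["alice_dies_no_cookie"] + alice["dies"]

# Lottery over Alice's own utility, holding Bob's outcome fixed.
lottery_for_alice = 0.5 * alice["dies"] + 0.5 * alice["everything"]

print("cookie goes to Alice:", cookie_to_alice > cookie_to_bob)
print("FAI kills Alice:", alice_dies > alice_lives)
print("lottery beats sure no-cookie:", lottery_for_alice > alice["no_cookie"])
print("lottery value is 0.525:", math.isclose(lottery_for_alice, 0.525))
```

All three comparisons come out the way the comment says: the cookie goes to Alice, the no-cookie case favors killing her, and the 50/50 lottery scores higher than the certain no-cookie outcome.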
How do you add two utilities together?
They are numbers. Add them.
And if humans turn out to be adaptation-executers, then utility is going to look really weird, because it’ll depend a lot on framing and behavior.
Yes. So far as I can tell, if the FAI is going to do what people want, it has to model people as though they want something, and that means ascribing utility functions to them. Better alternatives are welcome. Giving up because it’s a hard problem is not welcome.
If people dislike losses more than they like gains and status is zero-sum, does that mean the reasonable result of average utilitarianism when applied to status is that everyone must be exactly the same status?
No. If Alice has high status and Bob has low status, and the FAI takes action to lower Alice’s status and raise Bob’s, and people hate losing, then Alice’s utility decrease will exceed Bob’s utility increase, so the FAI will prefer to leave the status as it is. Similarly, the FAI isn’t going to want to increase Alice’s status at the expense of Bob. The FAI just won’t get involved in the status battles.
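The status argument can be sketched with a toy model. Everything here is an assumption for illustration: utility is taken to be linear in status changes measured from the current position, and the loss-aversion factor of 2.0 is a hypothetical number, not something from the thread. Under those assumptions any zero-sum transfer of status, in either direction, lowers the summed utility, so an average-utility maximizer leaves the distribution alone.

```python
def utility_change(delta_status, loss_aversion=2.0):
    """Change in one person's utility for a change in status.

    Gains count at face value; losses are scaled up by a hypothetical
    loss-aversion factor greater than 1.
    """
    if delta_status >= 0:
        return delta_status
    return loss_aversion * delta_status


def total_change(transfer, loss_aversion=2.0):
    """Summed utility change when `transfer` units of status move from Alice to Bob.

    A negative `transfer` moves status from Bob to Alice instead.
    """
    alice_delta = utility_change(-transfer, loss_aversion)
    bob_delta = utility_change(transfer, loss_aversion)
    return alice_delta + bob_delta


# Lowering Alice to raise Bob, and the reverse, both reduce the total.
print(total_change(1.0), total_change(-1.0), total_change(0.0))
```

With a loss-aversion factor of exactly 1 the total change is zero and the maximizer is indifferent, so the conclusion depends on losses genuinely hurting more than gains help.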
I have not found this conversation rewarding. Unless there’s an obvious improvement in the quality of your arguments, I’ll drop out.
Edit: Fixed the math on the FAI-kills-Alice scenario. Vaniver continued to change the topic with every turn, so I won’t be continuing the conversation.
So far as I can tell, if the FAI is going to do what people want, it has to model people as though they want something, and that means ascribing utility functions to them. Better alternatives are welcome. Giving up because it’s a hard problem is not welcome.
What if wants did not exist a priori, but only in response to stimuli? Alice, for example, doesn’t care about cookies, she cares about getting her way. If the FAI tells Alice and Bob “look, I have a cookie; how shall I divide it between you?” Alice decides that the cookie is hers and she will throw the biggest tantrum if the FAI decides otherwise, whereas Bob just grumbles to himself. If the FAI tells Alice and Bob individually “look, I’m going to make a cookie just for you, what would you like in it?” both of them enjoy the sugar, the autonomy of choosing, and the feel of specialness, without realizing that they’re only eating half of the cookie dough.
Suppose Alice is just as happy in both situations, because she got her way in both situations, and that Bob is happier in the second situation, because he gets more cookie. In such a scenario, the FAI would never ask Alice and Bob to come up with a plan to split resources between the two of them, because Alice would turn it into a win/lose situation.
It seems to me that an FAI would engage in want curation rather than want satisfaction. As the saying goes, seek to want what you have, rather than seeking to have what you want. A FAI who engages in that behavior would be more interested in a stimuli-response model of human behavior and mental states than a consequentialist-utility model of human behavior and mental states.
Another problem with Alice’s utility is that it supports the FAI doing lotteries that Alice would apparently prefer but a normal person would not.
This is one of the reasons why utility monsters tend to seem self-destructive; they gamble farther and harder than most people would.
They are numbers. Add them.
How do we measure one person’s utility? Preferences revealed by actions? (That is, given a mapping from situations to actions to consequences, I can construct a utility function which takes situations and consequences as inputs and returns the decision taken.) If so, when we add two utilities together, does the resulting number still uniquely identify the actions taken by both parties?
So are the atmospheric pressure in my room and the price of silver. But you cannot add them together (unless you have a conversion factor from millibars to dollars per ounce).
So are the atmospheric pressure in my room and the price of silver. But you cannot add them together (unless you have a conversion factor from millibars to dollars per ounce).
Your analogy is invalid, and in general analogy is a poor substitute for a rational argument. In the thread you’re replying to, I proposed a scheme for getting Alice’s utility to be commensurate with Bob’s so they can be added. It makes sense to argue that the scheme doesn’t work, but it doesn’t make sense to pretend it does not exist.
I don’t see the connection, but I do care about the issue. Can you attempt to state an argument for that?
Human preferences are an imperfect abstraction. People talk about them all the time and reason usefully about them, so either an AI could do the same, or you found a counterexample to the Church-Turing thesis. “Human preferences” is a useful concept no matter where those preferences come from, so evo psych doesn’t matter.
Similarly, my left hand is an imperfect abstraction. Blood flows in, blood flows out, flakes of skin fall off, it gets randomly contaminated from the environment, and the boundaries aren’t exactly defined, but nevertheless it generally does make sense to think in terms of my left hand.
If you’re going to argue that FAI defined in terms of inferring human preferences can’t work, I hope that isn’t also going to be an argument that an AI can’t possibly use the concept of my left hand, since the latter conclusion would be absurd.
Sure. I think I should clarify first that I meant evo psych should have been sufficient to realize that human preferences are not rigorously coherent. If I tell a FAI to make me do what I want to do, its response is going to be “which you?”, as there is no Platonic me with a quickly identifiable utility function that it can optimize for me. There’s just a bunch of modules that won the evolutionary tournament of survival because they’re a good way to make grandchildren.
If I am conflicted between the emotional satisfaction of food and the emotional dissatisfaction of exercise combined with the social satisfaction of beauty, will a FAI be able to resolve that for me any more easily than I can resolve it?
If my far mode desires are rooted in my desire to have a good social identity, should the FAI choose those over my near mode desires which are rooted in my desire to survive and enjoy life?
In some sense, the problem of FAI is the problem of rigorously understanding humans, and evo psych suggests that will be a massively difficult problem. That’s what I was trying to suggest with my comment.
I think that bar is unreasonably high. If you have a conflict between enjoying eating a lot vs being skinny and beautiful, and the FAI helps you do one or the other, then you aren’t in a position to complain that it did the wrong thing. Its understanding of you doesn’t have to be more rigorous than your understanding of you.
It does if I want it to give me results any better than I can provide for myself. I also provided the trivial example of internal conflicts; external conflicts are much more problematic. Human desire for status is possibly the source of all human striving and accomplishment. How will a FAI deal with the status conflicts that develop?
No. For example, if it develops some diet drug that lets you safely enjoy eating and still stay skinny and beautiful, that might be a better result than you could provide for yourself, and it doesn’t need any special understanding of you to make that happen. It just makes the drug, makes sure you know the consequences of taking it, and offers it to you. If you choose to take it, that tells the AI more about your preferences, but there’s no profound understanding of psychology required.
Putting an inferior argument first is good if you want to try to get the last word, but it’s not a useful part of problem solving. You should try to find the clearest problem where solving that problem solves all the other ones.
If it can do a reasonable job of comparing utilities across people, then maximizing average utility seems to do the right thing here. Comparing utilities between arbitrary rational agents doesn’t work, but comparing utilities between humans seems to—there’s an approximate universal maximum (getting everything you want) and an approximate universal minimum (you and all your friends and relatives getting tortured to death). Status conflicts are not one of the interesting use cases. Do you have anything better?
It might not need special knowledge of my psychology, but it certainly needs special knowledge of my physiology.
But notice that the original point was about human preferences. Even if it provides new technologies that dissolve internal conflicts, the question of whether or not to use the technology becomes a conflict. Remember, we live in a world where some people have strong ethical objections to vaccines. An old psychological finding is that oftentimes, giving people more options makes them worse off. If the AI notices that one of my modules enjoys sensory pleasure, offers to wirehead me, and I reject it on philosophical grounds, I could easily become consumed by regret or struggles with temptation, and wish that I never had been offered wireheading in the first place.
I put the argument of internal conflicts first because it was the clearest example, and you’ll note it obliquely refers to the argument about status. Did you really think that, if a drug were available to make everyone have perfectly sculpted bodies, one would get the same social satisfaction from that variety of beauty?
I doubt it can measure utilities, as I argued two posts ago, and simple average utilitarianism is so wracked with problems I’m not even sure where to begin.
A common tactic in human interaction is to care about everything more than the other person does, and explode (or become depressed) when they don’t get their way. How should such real-life utility monsters be dealt with?
Why do you find status uninteresting?
I haven’t heard of people having strong ethical objections to vaccines. They have strong practical (if ill-founded) objections—they believe vaccines have dangers so extreme as to make the benefits not worth it, or they have strong heuristic objections—I think they believe health is an innate property of an undisturbed body or they believe that anyone who makes money from selling a drug can’t be trusted to tell the truth about its risks.
To my mind, an ethical objection would be a belief that people should tolerate the effects of infectious diseases for some reason such as that suffering is good in itself or that it’s better for selection to enable people to develop innate immunities.
That wasn’t precisely the objection of Christian conservatives to the HPV vaccine (perhaps more nearly that they wanted sex to lead to suffering?), but it is fairly close.
I am counting religious objections as ethical objections, and there are several groups out there that refuse all medical treatment.
If everyone’s inferred utility goes from 0 to 1, and the real-life utility monster cares more than the other people about one thing, the inferred utility will say he cares less than other people about something else. Let him play that game until the something else happens, then he loses, and that’s a fine outcome.
I think it can, in principle, estimate utilities from behavior. See http://www.fungible.com/respect.
The problems I’m aware of have to do with creating new people. If you assume a fixed population and humans who have comparable utilities as described above, are there any problems left? Creating new people is a more interesting use case than status conflicts.
As I said, because maximizing average utility seems to get a reasonable result in that case.
That’s not the situation I’m describing; if 0 is “you and all your friends and relatives getting tortured to death” and 1 is “getting everything you want,” the utility monster is someone who puts “not getting one thing I want” at, say, .1 whereas normal people put it at .9999.
And if humans turn out to be adaptation-executers, then utility is going to look really weird, because it’ll depend a lot on framing and behavior.
How do you add two utilities together? If you can’t add, how can you average?
If people dislike losses more than they like gains and status is zero-sum, does that mean the reasonable result of average utilitarianism when applied to status is that everyone must be exactly the same status?
Correct but irrelevant. Utility functions are families of mappings from futures to reals, which don’t live in an affine space, as you mention.
This looks more like a mention of an unrelated but cool mathematical concept than a nitpick.
My point is that “If you can’t add, how can you average?” is not a valid argument, even though in this particular case both the premise and the conclusion happen to be correct.
If I ask “If you can’t add, how can you average?” and TimFreeman responds with “by using utilities that live in affine spaces,” I then respond with “great, those utilities are useless for doing what you want to do.” When a rhetorical question has an answer, the answer needs to be material to invalidate its rhetorical function; where’s the invalidity?
I took the rhetorical question to implicitly be the syllogism ‘you can’t sum different people’s utilities, you can’t average what you can’t sum, therefore you can’t average different people’s utilities’. I just pointed out that the second premise isn’t generally true. (Both the first premise and the conclusion are true, which is why it’s a nitpick.) Did I over-interpret the rhetorical question?
The direction I took the rhetorical question was “utilities aren’t numbers, they’re mappings,” which does not require the second premise. I agree with you that the syllogism you presented is flawed.
Are you sure? The only thing one really wants from a utility function is ranking, which is even weaker a requirement than affine spaces. All monotonic remappings are in the same equivalency class.
It’s practically useful to have reals rather than rankings, because that lets one determine how the function will behave for different probabilistic combinations of futures. If you already have the function fully specified over uncertain futures, then only providing a ranking is sufficient for the output.
The reason why I mentioned that it was a mapping, though, is because the output of a single utility function can be seen as an affine space. The point I was making in the ancestral posts was that while it looks like the outputs of two different utility functions play nicely, careful consideration shows that their combination destroys the mapping, which is what makes utility functions useful.
Hence the ‘families’ comment.
I’m hearing an echo of praxeology here; specifically the notion that humans use something like stack-ranking rather than comparison of real-valued utilities to make decisions. This seems like it could be investigated neurologically ….
Huh, no. If army1987.U($1000) = shminux.U($1000) = 1, army1987.U($10,000) = 1.9, shminux.U($10,000) = 2.1, and army1987.U($100,000) = shminux.U($100,000) = 3, then I would prefer 50% probability of $1000 and 50% probability of $100,000 rather than 100% probability of $10,000, and you wouldn’t.
Using an interval scale? I don’t have anything to contribute to the question of interpersonal utility comparison, but the average of two values from the same agent’s utility function is easy enough, while addition is still undefined.
I presume the average in question is interpersonal, not intertemporal, as we are discussing status conflicts (between individuals).
You have failed to disagree with me. My proposal exactly fits your alleged counterexample.
Suppose Alice is a utility monster where:
U(Alice, torture of everybody) = 0
U(Alice, everything) = 1
U(Alice, no cookie) = 0.1
U(Alice, Alice dies) = 0.05
And Bob is normal, except he doesn’t like Alice:
U(Bob, torture of everybody) = 0
U(Bob, everything) = 1
U(Bob, Alice lives, no cookie) = 0.8
U(Bob, Alice dies, no cookie) = 0.9
If the FAI has a cookie it can give to Bob or Alice, it will give it to Alice, since U(cookie to Bob) = U(Bob, everything) + U(Alice, everything but a cookie) = 1 + 0.1 = 1.1 < U(cookie to Alice) = U(Bob, everything but a cookie) + U(Alice, everything) = 0.8 + 1 = 1.8. Thus Alice gets her intended reward for being a utility monster.
However, if there are no cookies available and the FAI can kill Alice, it will do so for the benefit of Bob, since U(Bob, Alice lives, no cookie) + U(Alice, Alice lives, no cookie) = 0.8 + 0.1 = 0.9 < U(Bob, Alice dies, no cookie) + U(Alice, Alice dies) = 0.9 + 0.05 = 0.95. The basic problem is that since Alice had the cookie fixation, that ate up so much of her utility range that her desire to live in the absence of the cookie was outweighed by Bob finding her irritating.
Another problem with Alice’s utility is that it supports the FAI doing lotteries that Alice would apparently prefer but a normal person would not. For example, assuming the outcome for Bob does not change, the FAI should prefer 50% Alice dies + 50% Alice gets a cookie (adds to 0.525) over 100% Alice lives without a cookie (which is 0.1). This is a different issue from interpersonal utility comparison.
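For anyone who wants to check the arithmetic in the Alice and Bob scenario, here is a minimal sketch. It assumes, as the comment does, that both utilities sit on a shared 0-to-1 scale and that the FAI simply adds them; the dictionary keys and variable names are made up for illustration.

```python
# Utilities on a shared 0-to-1 scale, copied from the comment above.
alice = {"everything": 1.0, "no cookie": 0.1, "dies": 0.05}
bob = {
    "everything": 1.0,
    "Alice lives, no cookie": 0.8,
    "Alice dies, no cookie": 0.9,
}

# One cookie to hand out: sum both utilities under each allocation.
cookie_to_bob = bob["everything"] + alice["no cookie"]
cookie_to_alice = bob["Alice lives, no cookie"] + alice["everything"]

# No cookie available: the FAI compares sparing Alice with killing her.
alice_lives = bob["Alice lives, no cookie"] + alice["no cookie"]
alice_dies = bob["Alice dies, no cookie"] + alice["dies"]

# Lottery over Alice's own utility, with Bob's outcome held fixed.
lottery = 0.5 * alice["dies"] + 0.5 * alice["everything"]
status_quo = alice["no cookie"]

print("cookie goes to Alice:", cookie_to_alice > cookie_to_bob)
print("FAI kills Alice:", alice_dies > alice_lives)
print("lottery beats the status quo for Alice:", lottery > status_quo)
```

All three comparisons come out the way the comment says: the summing FAI gives Alice the cookie, kills her when no cookie exists, and prefers the lottery on her behalf.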
They are numbers. Add them.
Yes. So far as I can tell, if the FAI is going to do what people want, it has to model people as though they want something, and that means ascribing utility functions to them. Better alternatives are welcome. Giving up because it’s a hard problem is not welcome.
No. If Alice has high status and Bob has low status, and the FAI takes action to lower Alice’s status and raise Bob’s, and people hate losing, then Alice’s utility decrease will exceed Bob’s utility increase, so the FAI will prefer to leave the status as it is. Similarly, the FAI isn’t going to want to increase Alice’s status at the expense of Bob. The FAI just won’t get involved in the status battles.
I have not found this conversation rewarding. Unless there’s an obvious improvement in the quality of your arguments, I’ll drop out.
Edit: Fixed the math on the FAI-kills-Alice scenario. Vaniver continued to change the topic with every turn, so I won’t be continuing the conversation.
What if wants did not exist a priori, but only in response to stimuli? Alice, for example, doesn’t care about cookies, she cares about getting her way. If the FAI tells Alice and Bob “look, I have a cookie; how shall I divide it between you?” Alice decides that the cookie is hers and she will throw the biggest tantrum if the FAI decides otherwise, whereas Bob just grumbles to himself. If the FAI tells Alice and Bob individually “look, I’m going to make a cookie just for you, what would you like in it?” both of them enjoy the sugar, the autonomy of choosing, and the feel of specialness, without realizing that they’re only eating half of the cookie dough.
Suppose Alice is just as happy in both situations, because she got her way in both situations, and that Bob is happier in the second situation, because he gets more cookie. In such a scenario, the FAI would never ask Alice and Bob to come up with a plan to split resources between the two of them, because Alice would turn it into a win/lose situation.
It seems to me that an FAI would engage in want curation rather than want satisfaction. As the saying goes, seek to want what you have, rather than seeking to have what you want. An FAI that engages in that behavior would be more interested in a stimulus-response model of human behavior and mental states than a consequentialist-utility model of human behavior and mental states.
This is one of the reasons why utility monsters tend to seem self-destructive; they gamble farther and harder than most people would.
How do we measure one person’s utility? Preferences revealed by actions? (That is, given a mapping from situations to actions to consequences, I can construct a utility function which takes situations and consequences as inputs and returns the decision taken.) If so, when we add two utilities together, does the resulting number still uniquely identify the actions taken by both parties?
So are the atmospheric pressure in my room and the price of silver. But you cannot add them together (unless you have a conversion factor from millibars to dollars per ounce).
Your analogy is invalid, and in general analogy is a poor substitute for a rational argument. In the thread you’re replying to, I proposed a scheme for getting Alice’s utility to be commensurate with Bob’s so they can be added. It makes sense to argue that the scheme doesn’t work, but it doesn’t make sense to pretend it does not exist.