As I just pointed out again, the vNM axioms merely imply that “rational” decisions can be represented as maximising the expectation of some function mapping world histories into the reals. This function is conventionally called a utility function. In this sense of “utility function”, your preferences over gambles determine your utility (up to an affine transform), so when Omega says “I’ll double your utility” this is just a very roundabout (and rather odd) way of saying something like “I will do something sufficiently good that it will induce you to accept my offer”.* Given standard assumptions about Omega, this pretty obviously means that you accept the offer.
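The "up to an affine transform" point is easy to check directly. Here's a quick sketch in Python (the gambles and payoff numbers are invented purely for illustration): rescaling a utility function to a·u + b with a > 0 never changes which of two gambles has the higher expected utility, so it represents exactly the same preferences.

```python
# Two arbitrary gambles over world histories, as (probability, utility) pairs.
# (Numbers invented purely for illustration.)
g1 = [(0.3, 5.0), (0.7, 1.0)]
g2 = [(0.6, 4.0), (0.4, 0.0)]

def eu(lottery, a=1.0, b=0.0):
    """Expected value of the affinely transformed utility a*u + b."""
    return sum(p * (a * u + b) for p, u in lottery)

preference = eu(g1) > eu(g2)  # False here: 2.2 < 2.4

# Any positive affine transform of the utility function represents
# exactly the same preference between the gambles:
for a, b in [(2.0, 0.0), (0.5, -3.0), (10.0, 100.0)]:
    assert (eu(g1, a, b) > eu(g2, a, b)) == preference
```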
The confusion seems to arise because there are other mappings from world histories into the reals that are also conventionally called utility functions, but which have nothing in particular to do with the vNM utility function. When we read “I’ll double your utility” I think we intuitively parse the phrase as referring to one of these other utility functions, which is when problems start to ensue.
Maximising expected vNM utility is the right thing to do. But “maximise expected vNM utility” is not especially useful advice, because we have no access to our vNM utility function unless we already know our preferences (or can reasonably extrapolate them from preferences we do have access to). Maximising expected utilons is not necessarily the right thing to do. You can maximize any (potentially bounded!) positive monotonic transform of utilons and you’ll still be “rational”.
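To make the monotonic-transform point concrete, here's a small Python sketch (the "utilon" payoffs are invented): two agents who rank every sure outcome identically, because one maximizes expected utilons and the other the expectation of a monotonic transform of utilons, can still disagree about a gamble, and both remain vNM-rational.

```python
import math

# Sure outcomes and a gamble, with "utilon" payoffs (invented numbers).
# A lottery is a list of (probability, utilons) pairs.
sure_small = [(1.0, 9.0)]
sure_big   = [(1.0, 16.0)]
gamble     = [(0.5, 36.0), (0.5, 0.0)]  # E[utilons] = 18

def expected(transform, lottery):
    """Expected value of transform(utilons) under the lottery."""
    return sum(p * transform(u) for p, u in lottery)

identity = lambda u: u  # risk-neutral over utilons
sqrt = math.sqrt        # a positive monotonic (concave) transform

# Both agents rank *sure* outcomes the same way...
assert expected(identity, sure_big) > expected(identity, sure_small)
assert expected(sqrt, sure_big) > expected(sqrt, sure_small)

# ...but disagree about the gamble, and each is still maximising the
# expectation of *some* increasing function, hence still vNM-rational.
print(expected(identity, gamble) > expected(identity, sure_big))  # True: 18 > 16
print(expected(sqrt, gamble) > expected(sqrt, sure_big))          # False: 3 < 4
```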
* There are sets of “rational” preferences for which such a statement could never be true (your preferences could be represented by a bounded utility function where doubling would go above the bound). If you had such preferences and Omega possessed the usual Omega-properties, then she would never claim to be able to double your utility: ergo the hypothetical implicitly rules out such preferences.
NB: I’m aware that I’m fudging a couple of things here, but they don’t affect the point, and unfudging them seemed likely to be more confusing than helpful.
so when Omega says “I’ll double your utility” this is just a very roundabout (and rather odd) way of saying something like “I will do something sufficiently good that it will induce you to accept my offer”
It’s not that easy. As humans are not formally rational, the problem is about whether to bite this particular bullet: it shows one form that following the decision procedure could take, and asks whether it’s a good idea to adopt a decision procedure that forces such decisions. If you already accept the decision procedure, then of course the problem becomes trivial.
Which decision procedure are you talking about? Maximising expected vNM utility and maximizing (e.g.) expected utilons are quite different procedures—which was basically my point.
The former doesn’t force such decisions at all. That’s precisely why I said that it’s not useful advice: all it says is that you should take the gamble if you prefer to take the gamble.* (Moreover, if you did not prefer to take the gamble, the hypothetical doubling of vNM utility could never happen, so the set-up already assumes you prefer the gamble. This seems to make the hypothetical not especially useful either.)
On the other hand “maximize expected utilons” does provide concrete advice. It’s just that (AFAIK) there’s no reason to listen to that advice unless you’re risk-neutral over utilons. If you were sufficiently risk averse over utilons then a 50% chance of doubling them might not induce you to take the gamble, and nothing in the vNM axioms would say that you’re behaving irrationally. The really interesting question then becomes whether there are other good reasons to have particular risk preferences with respect to utilons, but it’s a question I’ve never heard a particularly good answer to.
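Here's a stylised version in Python (the 50/50 double-or-halve gamble and the particular utility functions are my own invented illustration, not part of the original problem): whether a gamble over utilons is worth taking depends entirely on your risk preferences over utilons, and the vNM axioms are silent about which of these to have.

```python
import math

# You hold w = 100 utilons; the gamble doubles them with probability 0.5
# and halves them otherwise.  (Downside invented for illustration.)
w = 100.0
gamble = [(0.5, 2 * w), (0.5, w / 2)]

def expected(u, lottery):
    """Expected value of u(utilons) under the lottery."""
    return sum(p * u(x) for p, x in lottery)

risk_neutral = lambda x: x
log_utility  = math.log            # mildly risk averse over utilons
very_averse  = lambda x: -1.0 / x  # strongly risk averse over utilons

print(expected(risk_neutral, gamble) > risk_neutral(w))  # True: 125 > 100, accept
print(expected(log_utility, gamble) > log_utility(w))    # False: exactly indifferent
print(expected(very_averse, gamble) > very_averse(w))    # False: declines the gamble
```

All three agents maximise the expectation of *some* increasing function of utilons, so all three are vNM-rational; only the first is risk-neutral over utilons.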
* At least provided doing so would not result in an inconsistency in your preferences. [ETA: Actually, if your preferences are inconsistent, then they won’t have a vNM utility representation, and Omega’s claim that she will double your vNM utility can’t actually mean anything. The set-up therefore seems to imply that your preferences are necessarily consistent. There sure seem to be a lot of surreptitious assumptions built in here!]
Which decision procedure are you talking about? Maximising expected vNM utility and maximizing (e.g.) expected utilons are quite different procedures—which was basically my point.
[...] you should take the gamble if you prefer to take the gamble
The “prefer” here isn’t immediate. People have (internal) arguments about what should be done in which situations precisely because they don’t know what they really prefer. There is an easy answer (go with the whim), but that’s not the preference people care about, and so we deliberate.
When all confusion is defeated and the preference is laid out explicitly, as a decision procedure that just crunches numbers and produces a decision that is by construction exactly the most preferable action, there is nothing left to argue about. Argument is not a part of this form of decision procedure.
In real life, argument is an important part of any decision procedure, and it is the means by which we could select a decision procedure that doesn’t involve argument. You look at the possible solutions produced by many tools, and judge which of them to implement. This makes the decision procedure different from the first kind.
One of the tools you consider may be a “utility maximization” thingy. You can’t say that it’s by definition the right decision procedure, as first you have to accept it as such through argument. And this applies not only to the particular choice of prior and utility, but also to the algorithm itself, to the possibility of representing your true preference in this form.
The “utilons” of the post linked above look different from the vNM expected utility because their discussion involved argument and informal steps. This doesn’t preclude the topic the argument is about, the “utilons”, from being exactly the same (expected) utility values, approximated to suit a more informal discussion. The difference is that the informal part of decision-making is considered part of the decision procedure in that post, unlike what happens with the formal tool itself (which is discussed there informally).
The double-my-utility thought experiment lets us consider the following question: assuming that the best possible utility+prior are chosen within the expected utility maximization framework, do the decisions generated by the resulting procedure look satisfactory? That is, is this form of decision procedure adequate, as an ultimate solution, for all situations? The answer can be “no”, which would mean that expected utility maximization isn’t the way to go, or that you’d need to apply it differently to the problem.
I’m struggling to figure out whether we’re actually disagreeing about anything here, and if so, what it is. I agree with most of what you’ve said, but can’t quite see how it connects to the point I’m trying to make. It seems like we’re somehow managing to talk past each other, but unfortunately I can’t tell whether I’m missing your point, you’re missing mine, or something else entirely. Let’s try again… let me know if/when you think I’m going off the rails here.
If I understand you correctly, you want to evaluate a particular decision procedure “maximize expected utility” (MEU) by seeing whether the results it gives in this situation seem correct. (Is that right?)
My point was that the result given by MEU, and the evidence that this can provide, both depend crucially on what you mean by utility.
One possibility is that by utility, you mean vNM utility. In this case, MEU clearly says you should accept the offer. As a result, it’s tempting to say that if you think accepting the offer would be a bad idea, then this provides evidence against MEU (or equivalently, since the vNM axioms imply MEU, that you think it’s ok to violate the vNM axioms). The problem is that if you violate the vNM axioms, your choices will have no vNM utility representation, and Omega couldn’t possibly promise to double your vNM utility, because there’s no such thing. So for the hypothetical to make sense at all, we have to assume that your preferences conform to the vNM axioms. Moreover, because the vNM axioms necessarily imply MEU, the hypothetical also assumes MEU, and it therefore can’t provide evidence either for or against it.*
If the hypothetical is going to be useful, then utility needs to mean something other than vNM utility. It could mean hedons, it could mean valutilons,** it could mean something else. I do think that responses to the hypothetical in these cases can provide useful evidence about the value of decision procedures such as “maximize expected hedons” (MEH) or “maximize expected valutilons” (MEV). My point on this score was simply that there is no particular reason to think that either MEH or MEV is likely to be an optimal decision procedure to begin with. They’re certainly not implied by the vNM axioms, which require only that you should maximise the expectation of some (positive) monotonic transform of hedons or valutilons or whatever.*** [ETA: As a specific example, if you decide to maximize the expectation of a bounded concave function of hedons/valutilons, then even if hedons/valutilons are unbounded, you’ll at some point stop taking bets to double your hedons/valutilons, but still be an expected vNM utility maximizer.]
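The bracketed example can be sketched numerically. In this Python snippet the particular bounded function and the gamble's downside are both invented for illustration: with u(x) = 1 − exp(−x), an agent accepts a 50% double / 50% halve gamble over hedons while its stock of hedons is small, but starts declining once the stock is large enough that u is close to its bound, while remaining an expected vNM utility maximizer throughout.

```python
import math

def u(x):
    """A bounded, concave vNM utility over hedons: u(x) = 1 - e^(-x) < 1."""
    return 1.0 - math.exp(-x)

def accepts_double(h):
    """Accept a 50% double / 50% halve gamble over h hedons iff it raises
    expected bounded utility.  (The 'halve' downside is an invented
    stand-in for whatever happens when the bet is lost.)"""
    return 0.5 * u(2 * h) + 0.5 * u(h / 2) > u(h)

for h in [0.25, 0.5, 1.0, 2.0, 4.0]:
    print(h, accepts_double(h))
# Accepts at h = 0.25 and 0.5, declines at h = 1.0 and beyond: even with
# unbounded hedons, the doubling bets eventually stop being worth taking.
```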
Does that make sense?
* This also means that if you think MEU gives the “wrong” answer in this case, you’ve gotten confused somewhere—most likely about what it means to double vNM utility.
** I define these here as the output of a function that maps a specific, certain, world history (no gambles!) into the reals according to how well that particular world history measures up against my values. (Apologies for the proliferation of terminology—I’m trying to guard against the possibility that we’re using “utilons” to mean different things without inadvertently ending up in a messy definitional argument. ;))
*** A corollary of this is that rejecting MEH or MEV does not constitute evidence against the vNM axioms.
You are placing on a test the following well-defined tool: an expected utility maximizer with a prior and a “utility” function that evaluates events in the world. By “utility” function here I mean just some function, so you can drop the word “utility”. Even if people can’t represent their preference as expected some-function maximization, such a tool could still be constructed. The question is whether such a tool can be made that always agrees with human preference.
An easy question is what happens when you use “hedons” or something else equally inadequate in the role of utility function: the tool starts to make decisions with which we disagree. Case closed. But maybe there are other settings under which the tool is in perfect agreement with human judgment (after reflection).
The utility-doubling thought experiment compares what is better according to the judgment of the tool (take the card) with what is better according to the judgment of a person (maybe don’t take the card). As the tool’s decision in this thought experiment is made invariant to the tool’s settings (“utility” and prior), showing that the tool’s decision is wrong according to a person’s preference (after “careful” reflection) proves that there is no way to set up “utility” and prior so that the “utility”-maximization tool represents that person’s preference.
As the tool’s decision in this thought experiment is made invariant on the tool’s settings (“utility” and prior), showing that the tool’s decision is wrong according to a person’s preference (after “careful” reflection), proves that there is no way to set up “utility”
My argument is that, if Omega is offering to double vNM utility, the set-up of the thought experiment rules out the possibility that the decision could be wrong according to a person’s considered preference (because the claim to be doubling vNM utility embodies an assumption about what a person’s considered preference is). AFAICT, the thought experiment then amounts to asking: “If I should maximize expected utility, should I maximize expected utility?” Regardless of whether I should actually maximize expected utility or not, the correct answer to this question is still “yes”. But the thought experiment is completely uninformative.
Do you understand my argument for this conclusion? (Fourth para of my previous comment.) If you do, can you point out where you think it goes astray? If you don’t, could you tell me what part you don’t understand so I can try to clarify my thinking?
On the other hand, if Omega is offering to double something other than vNM utility (hedons/valutilons/whatever) then I don’t think we have any disagreement. (Do we? Do you disagree with anything I said in para 5 of my previous comment?)
My point is just that the thought experiment is underspecified unless we’re clear about what the doubling applies to, and that people sometimes seem to shift back and forth between different meanings.
What was originally at issue is whether we should act in ways that will eventually destroy ourselves.
I think the big-picture conclusion from what you just wrote is that, if we see that we’re acting in ways that will probably exterminate life in short order, that doesn’t necessarily mean it’s the wrong thing to do.
However, in our circumstances, time discounting and “identity discounting” encourage us to start enjoying and dooming ourselves now; whereas it would probably be better to spread life to a few other galaxies first, and then enjoy ourselves.
(I admit that my use of the word “better” is problematic.)
if we see that we’re acting in ways that will probably exterminate life in short order, that doesn’t necessarily mean it’s the wrong thing to do.
Well, I don’t disagree with this, but I would still agree with it if you substituted “right” for “wrong”, so it doesn’t seem like much of a conclusion. ;)
Moving back toward your ignorance prior on a topic can still increase your log-score if the hypothesis was concentrating probability mass in the wrong areas (failing to concentrate a substantial amount in a right area).
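A toy numerical illustration of this (the outcome space and probabilities are invented): over four outcomes, a hypothesis that concentrates 85% of its mass on the wrong outcome log-scores worse than the uniform ignorance prior, so retreating to the prior improves the score.

```python
import math

outcomes = ["A", "B", "C", "D"]
truth = "C"  # the outcome that actually obtains

ignorance_prior = {o: 0.25 for o in outcomes}
# A hypothesis concentrating probability mass in the wrong area:
confident_wrong = {"A": 0.85, "B": 0.05, "C": 0.05, "D": 0.05}

def log_score(dist, outcome):
    """Log-score: log of the probability assigned to what actually happened."""
    return math.log(dist[outcome])

print(log_score(ignorance_prior, truth))  # log(0.25) ~ -1.386
print(log_score(confident_wrong, truth))  # log(0.05) ~ -3.0

# Moving back to the ignorance prior increases the log-score here:
assert log_score(ignorance_prior, truth) > log_score(confident_wrong, truth)
```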
You argue that the thought experiment is trivial and doesn’t solve any problems. In my comments above I described a specific setup that shows how to use (interpret) the thought experiment to potentially obtain non-trivial results.
I argue that the thought experiment is ambiguous, and that for a certain definition of utility (vNM utility), it is trivial and doesn’t solve any problems. For this definition of utility I argue that your example doesn’t work. You do not appear to have engaged with this argument, despite repeated requests to point out either where it goes wrong, or where it is unclear. If it goes wrong, I want to know why, but this conversation isn’t really helping.
For other definitions of utility, I do not, and have never claimed that the thought experiment is trivial. In fact, I think it is very interesting.
I argue that the thought experiment is ambiguous, and that for a certain definition of utility (vNM utility), it is trivial and doesn’t solve any problems. For this definition of utility I argue that your example doesn’t work.
If by “your example” you refer to the setup described in this comment, I don’t understand what you are saying here. I don’t use any “definition of utility”, it’s just a parameter of the tool.
It’s also an entity in the problem set-up. When Omega says “I’ll double your utility”, what is she offering to double? Without defining this, the problem isn’t well-specified.
Sorry for coming late to this party. ;)
Much of this discussion seems to me to rest on a similar confusion to that evidenced in “Expectation maximization implies average utilitarianism”.
What you just said seems correct.
Certainly, you need to resolve any underspecification. There are ways to do this usefully (or not).
Agreed. My point is simply that one particular (tempting) way of resolving the underspecification is non-useful. ;)