If you accept that you’re maximizing expected utility, then you should draw the first card, and all future cards. It doesn’t matter what terms your utility function includes. The logic for the first step is the same as for any other step.
If you don’t accept this, then what precisely do you mean when you talk about your utility function?
The logic for the first step is the same as for any other step.
Actually, on rethinking, this depends entirely on what you mean by “utility”. Here’s a way of framing the problem such that the logic can change.
Assume that we have some function V(x) that maps world histories into (non-negative*) real-valued “valutilons”, and that, with no intervention from Omega, the world history that will play out is valued at V(status quo) = q.
Omega then turns up and offers you the card deal, with a deck as described above: 90% stars, 10% skulls. Stars give you double V(star)=2c, where c is the value of whatever history is currently slated to play out (so c=q when the deal is first offered, but could be higher than that if you’ve played and won before). Skulls give you death: V(skull)=d, and d < q.
If our choices obey the vNM axioms, there will be some function f(x), such that our choices correspond to maximising E[f(x)]. It seems reasonable to assume that f(x) must be (weakly) increasing in V(x). A few questions present themselves:
Is there a function, f(x), such that, for some values of q and d, we should take a card every time one is offered?
Yes. f(x)=V(x) gives this result for all d<q. This is the standard approach.
Is there a function, f(x), such that, for some values of q and d, we should never take a card?
Yes. Set d=0, q=1000, and f(x) = ln(V(x)+1). Drawing a card gives expected vNM utility of 0.9ln(2001)~6.8 (the skull outcome contributes 0.1ln(0+1) = 0), which is less than the ln(1001)~6.9 you get by declining.
Is there a function, f(x), such that, for some values of q and d, we should take some finite number of cards then stop?
Yes. Set d=0, q=1, and f(x) = ln(V(x)+1). The first time you get the offer, its expected vNM utility is 0.9ln(3)~1 which is greater than ln(2)~0.7. But at the 10th time you play (assuming you’re still alive), c=512, and the expected vNM utility of the offer is now 0.9ln(1025)~6.239, which is less than ln(513)~6.240.
So you take 9 cards, then stop. (You can verify for yourself that the 9th card is still a good bet.)
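A quick numerical check of that last case, as a sketch (under the same assumed values, d=0, q=1, f(x) = ln(V(x)+1); note the skull outcome contributes 0.1ln(0+1) = 0 to each expectation):

```python
import math

# Compare drawing vs declining at each offer, assuming d = 0, q = 1,
# and f(x) = ln(V(x) + 1): drawing is worth it iff 0.9*ln(2c + 1) > ln(c + 1).
c = 1.0          # value of the history currently slated to play out (starts at q)
cards_taken = 0
while 0.9 * math.log(2 * c + 1) > math.log(c + 1):
    cards_taken += 1
    c *= 2       # suppose we keep winning; a skull would end the game anyway

print(cards_taken)  # 9: the 9th card is still a good bet, the 10th is not
```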
* This is just to ensure that doubling your valutilons cannot make you worse off, as would happen if they were negative. It should be possible to reframe the problem to avoid this, but let’s stick with it for now.
Redefining “utility” like this doesn’t help us with the actual problem at hand: what do we do if Omega offers to double the f(x) which we’re actually maximizing?
In your restatement of the problem, the only thing we assume about Omega’s offer is that it would change the universe in a desirable way (f is increasing in V(x)). Of course we can find an f such that a doubling in V translates to adding a constant to f, or if we like, even an infinitesimal increase in f. But all this means is that Omega is offering us the wrong thing, which we don’t really value.
Redefining “utility” like this doesn’t help us with the actual problem at hand: what do we do if Omega offers to double the f(x) which we’re actually maximizing?
It wasn’t intended to help with the problem specified in terms of f(x). For the reasons set out in the thread beginning here, I don’t find the problem specified in terms of f(x) very interesting.
In your restatement of the problem, the only thing we assume about Omega’s offer is that it would change the universe in a desirable way
You’re assuming the output of V(x) is ordinal. It could be cardinal.
all this means is that Omega is offering us the wrong thing
I’m afraid I don’t understand what you mean here. “Wrong” relative to what?
which we don’t really value.
Eh? Valutilons were defined to be something we value (ETA: each of us individually, rather than collectively).
Redefining “utility” like this doesn’t help us with the actual problem at hand:
I guess what I’m suggesting, in part, is that the actual problem at hand isn’t well-defined, unless you specify what you mean by utility in advance.
what do we do if Omega offers to double the f(x) which we’re actually maximizing?
You take cards every time, obviously. But then the result is tautologically true and pretty uninteresting, AFAICT. (The thread beginning here has more on this.) It’s also worth noting that there are vNM-rational preferences for which Omega could not possibly make this offer: f(x) bounded above, with the status quo already worth more than half the bound in f terms.
In your restatement of the problem, the only thing we assume about Omega’s offer is that it would change the universe in a desirable way.
That’s only true given a particular assumption about what the output of V(x) means. If I say that V(x) is, say, a cardinally measurable and interpersonally comparable measure of my well-being, then Omega’s offer to double means rather more than that.
But all this means is that Omega is offering us the wrong thing,
“Wrong” relative to what? Omega offers whatever Omega offers. We can specify the thought experiment any way we like if it helps us answer questions we are interested in. My point is that you can’t learn anything interesting from the thought experiment if Omega is offering to double f(x), so we shouldn’t set it up that way.
which we don’t really value.
Eh? “Valutilons” are specifically defined to be a measure of what we value.
I guess what I’m suggesting, in part, is that the actual problem at hand isn’t well-defined, unless you specify what you mean by utility in advance.
Utility means “the function f, whose expectation I am in fact maximizing”. The discussion then indeed becomes whether f exists and whether it can be doubled.
My point is that you can’t learn anything interesting from the thought experiment if Omega is offering to double f(x), so we shouldn’t set it up that way.
That was the original point of the thread where the thought experiment was first discussed, though.
The interesting result is that if you’re maximizing something you may be vulnerable to a failure mode of taking risks that can be considered excessive. This is in view of the original goals you want to achieve, for which maximizing f is a proxy, whether a designed one (in AI) or an evolved strategy (in humans).
“Valutilons” are specifically defined to be a measure of what we value.
If “we” refers to humans, then “what we value” isn’t well defined.
Utility means “the function f, whose expectation I am in fact maximizing”.
There are many definitions of utility, of which that is one. Usage in general is pretty inconsistent. (Wasn’t that the point of this post?) Either way, definitional arguments aren’t very interesting. ;)
The interesting result is that if you’re maximizing something you may be vulnerable to a failure mode of taking risks that can be considered excessive.
Your maximand already embodies a particular view as to what sorts of risk are excessive. I tend to the view that if you consider the risks demanded by your maximand excessive, then you should either change your maximand, or change your view of what constitutes excessive risk.
There are many definitions of utility, of which that is one. Usage in general is pretty inconsistent. (Wasn’t that the point of this post?) Either way, definitional arguments aren’t very interesting. ;)
Yes, that was the point :-) On my reading of OP, this is the meaning of utility that was intended.
Your maximand already embodies a particular view as to what sorts of risk are excessive. I tend to the view that if you consider the risks demanded by your maximand excessive, then you should either change your maximand, or change your view of what constitutes excessive risk.
Yes. Here’s my current take:
The OP argument demonstrates the danger of using a function-maximizer as a proxy for some other goal. If there can always exist a chance to increase f by an amount proportional to its previous value (e.g. double it), then the maximizer will fall into the trap of taking ever-increasing risks for ever-increasing payoffs in the value of f, and will lose with probability approaching 1 in a finite (and short) timespan.
This qualifies as losing if the original goal (the goal of the AI’s designer, perhaps) does not itself have this quality. This can be the case when the designer sloppily specifies its goal (chooses f poorly), but perhaps more interesting/vivid examples can be found.
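A minimal sketch of that trap, under assumed numbers (the 90/10 deck from this thread, a positive starting value of f, and a skull that sets f to 0): the expected-f comparison favours drawing at every round, while the probability of surviving n rounds is 0.9^n.

```python
f_value = 1.0        # f of the status quo (assumed positive)
survival_prob = 1.0

for _ in range(30):
    expected_f_if_draw = 0.9 * (2 * f_value) + 0.1 * 0.0
    assert expected_f_if_draw > f_value   # 1.8f > f, so the maximizer always draws
    f_value *= 2                          # payoff being chased, conditional on winning
    survival_prob *= 0.9

print(round(survival_prob, 3))  # ~0.042 after 30 rounds, and heading to 0
```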
To expand on this slightly, it seems like it should be possible to separate goal achievement from risk preference (at least under certain conditions).
You first specify a goal function g(x) designating the degree to which your goals are met in a particular world history, x. You then specify another (monotonic) function, f(g) that embodies your risk-preference with respect to goal attainment (with concavity indicating risk-aversion, convexity risk-tolerance, and linearity risk-neutrality, in the usual way). Then you maximise E[f(g(x))].
If g(x) is only ordinal, this won’t be especially helpful, but if you had a reasonable way of establishing an origin and scale it would seem potentially useful. Note also that f could be unbounded even if g were bounded, and vice-versa. In theory, that seems to suggest that taking ever increasing risks to achieve a bounded goal could be rational, if one were sufficiently risk-loving (though it does seem unlikely that anyone would really be that “crazy”). Also, one could avoid ever taking such risks, even in the pursuit of an unbounded goal, if one were sufficiently risk-averse that one’s f function were bounded.
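As a toy sketch of this separation (entirely hypothetical choices of g and f, just to illustrate): let g double on a star and drop to 0 on a skull, and compare a bounded, risk-averse f(g) = 1 - exp(-g/100) with a risk-neutral (linear) f.

```python
import math

def bounded_risk_averse(g):
    return 1.0 - math.exp(-g / 100.0)    # concave and bounded above by 1

def risk_neutral(g):
    return g                             # linear: equivalent to maximizing E[g]

def offers_accepted(f, g_start=1.0, max_offers=50):
    g, accepted = g_start, 0
    # keep drawing while the expected f of drawing beats the f of declining
    while accepted < max_offers and 0.9 * f(2 * g) + 0.1 * f(0.0) > f(g):
        g *= 2
        accepted += 1
    return accepted

print(offers_accepted(bounded_risk_averse))  # stops after a handful of offers
print(offers_accepted(risk_neutral))         # hits max_offers: never stops by itself
```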
P.S.
On my reading of OP, this is the meaning of utility that was intended.
If you accept that you’re maximizing expected utility, then you should draw the first card, and all future cards. It doesn’t matter what terms your utility function includes.
You’re probably right.
Note, however, that there is no particular reason that one needs to maximise expected utilons.
The standard axioms for choice under uncertainty imply only that consistent choices over gambles can be represented as maximizing the expectation of some function that maps world histories into the reals. This function is conventionally called a utility function. However, if (as here) you already have another function that maps world histories into the reals, and happen to have called this a utility function as well, this does not imply that your two utility functions (which you’ve derived in completely different ways and for completely different purposes) need to be the same function. In general (and as I’ve tried, with varying degrees of success, to point out elsewhere) the utility function describing your choices over gambles can be any positive monotonic transform of that other (utilon) function, and you will still comply with the Savage-vNM-Marschak axioms.
All of which is to say that you don’t actually have to draw the first card if you are sufficiently risk averse over utilons (at least as I understand Psychohistorian to have defined the term).
Thanks! You’re the first person who’s started to explain to me what “utilons” are actually supposed to be under a rigorous definition and incidentally why people sometimes seem to be using slightly different definitions in these discussions.
How is consistency defined here?
You can learn more from e.g. the following lecture notes:
B. L. Slantchev (2008). `Game Theory: Preferences and Expected Utility’. (PDF)
Briefly, as requiring completeness, transitivity, continuity, and (more controversially) independence. Vladimir’s link looks good, so check that for the details.
I will when I have time tomorrow, thanks.
I see: I misparsed the terms of the argument. I thought it was doubling my current utilons; you’re positing that I have a 90% chance of doubling my currently expected utility over my entire life.
The reason I bring up the terms in my utility function is that they reference concrete objects, people, time passing, and so on. So measuring expected utility, for me, involves projecting the course of the world and my place in it.
So, assuming I follow the suggested course of action and keep drawing cards until I die, then to fulfill the terms Omega must either give me all the utilons before I die, or somehow compress the things I value into something that can be achieved in between drawing cards as fast as I can. This either involves massive changes to reality, which I can verify instantly, or some sort of orthogonal life I get to lead while simultaneously drawing cards, so I guess that’s fine.
Otherwise, given the certainty that I will die essentially immediately, I certainly don’t recognize that I’m getting a 90% chance of doubled expected utility, as my expectations certainly include whether or not I will draw a card.
I don’t think “current utilons” makes that much sense. Utilons should be for a utility function, which is equivalent to a decision function, and the purpose of decisions is probably to influence the future. So utility has to be about the whole future course of the world. “Currently expected utilons” means what you expect to happen, averaged over your uncertainty and actual randomness, and this is what the dilemma should be about.
“Current hedons” certainly does make sense, at least because hedons haven’t been specified as well.
Like Douglas_Knight, I don’t think current utilons are a useful unit.
Suppose your utility function behaves as you describe. If you play once (and win, with 90% probability), Omega will modify the universe in such a way that all the concrete things you derive utility from will bring you twice as much utility, over the course of the infinite future. You’ll live out your life with twice as much of all the things you value. So it makes sense to play this once, by the terms of your utility function.
You don’t know, when you play your first game, whether or not you’ll ever play again; your future includes both options. You can decide, for yourself, that you’ll play once but never again. It’s a free decision both now and later.
And now a second has passed and Omega is offering a second game. You remember your decision. But what place do decisions have in a utility function? You’re free to choose to play again if you wish, and the logic for playing is the same as the first time around...
Now, you could bind yourself to your promise (after the first game). Maybe you have a way to hardwire your own decision procedure to force something like this. But how do you decide (in advance) after how many games to stop? Why one and not, say, ten?
OTOH, if you decide not to play at all, would you really forgo a one-time 90% chance of doubling your lifelong future utility? How about a 99.999% chance? The probability of death in any one round of the game can be made as small as you like, as long as it’s nonzero and fixed for all future rounds. Is there no probability at which you’d take the risk for one round?
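For what it’s worth, the log-valuing agent sketched earlier in this thread (assuming d=0 and f(x) = ln(V(x)+1)) does have such a threshold, and it depends on how much is already at stake: a single round is worth taking once the win probability exceeds ln(c+1)/ln(2c+1).

```python
import math

# Threshold win probability for accepting exactly one round, assuming
# d = 0 and f(x) = ln(V(x) + 1): accept iff p*ln(2c + 1) > ln(c + 1).
for c in (1.0, 1000.0):
    print(c, round(math.log(c + 1) / math.log(2 * c + 1), 3))
# c = 1    -> ~0.631: the 90% deck is already worth one draw
# c = 1000 -> ~0.909: 90% is not quite enough, but 99.999% easily clears it
```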
Why on earth wouldn’t I consider whether or not I would play again? Am I barred from doing so?
If I know that the card game will continue to be available, and that Omega can truly double my expected utility with every draw, then one of two things is true. Either it’s a relatively insignificant increase in expected utility over the next few minutes it takes me to die, in which case it’s a foolish bet compared to my expected utility over the decades I conservatively have left, or Omega can somehow change the whole world in the radical fashion needed for my expected utility over those few minutes to dwarf my expected utility right now.
This paradox seems to depend on the idea that the card game is somehow excepted from the 90% likely doubling of expected utility. As I mentioned before, my expected utility certainly includes the decisions I’m likely to make, and it’s easy to see that continuing to draw cards will result in my death. So, it depends on what you mean. If it’s just doubling expected utility over my expected life IF I don’t die in the card game, then it’s a foolish decision to draw the first or any number of cards. If it’s doubling expected utility in all cases, then I draw cards until I die, happily forcing Omega to make verifiable changes to the universe and myself.
Now, there are terms at which I would take the one round, under the “IF I don’t die in the card game” version of the gamble, but it would probably depend on how it’s implemented. I don’t have a way of accessing my utility function directly, and my ability to appreciate maximizing it is indirect at best. So I would be very concerned about the way Omega plans to double my expected utility, and how I’m meant to experience it.
In practice, of course, any possible doubt about whether it’s really Omega giving you this gamble far outweighs any possibility of such lofty returns, but the thought experiment has some interesting complexities.
You’re free to choose to play again if you wish, and the logic for playing is the same as the first time around
This, again, depends on what you mean by “utility”. Here’s a way of framing the problem such that the logic can change.
Assume that we have some function V(x) that maps world histories into (non-negative*) real-valued “valutilons”, and that, with no intervention from Omega, the world history that will play out is valued at V(status quo) = q.
Then Omega turns up and offers you the card deal, with a deck as described above: 90% stars, 10% skulls. Stars give you double: V(star)=2c, where c is the value of whatever history is currently slated to play out (so c=q when the deal is first offered, but could be higher than that if you’ve played and won before). Skulls give you death: V(skull)=d, and d < q.
If our choices obey the vNM axioms, there will be some function f(x), such that our choices correspond to maximising E[f(x)]. It seems reasonable to assume that f(x) must be (weakly) increasing in V(x). A few questions present themselves:
Is there a function, f(x), such that, for some values of q and d, we should take cards every time this bet is offered?
Yes. f(x)=V(x) gives this result for all d<q.
Is there a function, f(x), such that, for some values of q and d, we should never take the bet?
Yes. Set d=0, q=1000, and f(x) = ln(V(x)+1). The offer gives expected vNM utility of 0.9ln(2001)~6.8, which is less than ln(1001)~6.9.
Is there a function, f(x), such that, for some values of q and d, we should take cards for some finite number of offers, and then stop?
Yes. Set d=0, q=1, and f(x) = ln(V(x)+1). The first time you get the offer, its expected vNM utility is 0.9ln(3)~1, which is greater than ln(2)~0.7. But at the 10th time you play (assuming you’re still alive), c=512, and the expected vNM utility of the offer is now 0.9ln(1025)~6.239, which is less than ln(513)~6.240. So you take the first 9 cards, then decline the 10th offer.
* This is just to ensure that doubling your valutilons cannot make you worse off, as would happen if they were negative. It should be possible to reframe the problem to avoid this, but let’s stick with this for now.