Actually, I’m not sure whether the strategy that I selected is optimal.
I’m pretty sure that it isn’t optimal, and for a much simpler reason than having infinitely many worlds. The strategy of
“Only buy the slip if it is reasonably priced, i.e. costs < P[3.1] utils, no matter what you observe” leads to a Dutch Book.
This takes a bit of explaining, so I’ll try to simplify.
First let’s suppose that the following two hypotheses are the only candidates, and that each has prior probability 1⁄2.
H3.1. Across all of space time, there are infinitely many civilizations of observers, but the mean number of observers per civilization (taking a suitable limit construction to define the mean) is 200 billion observers.
H3.2. Across all of space time, there are infinitely many civilizations of observers, but the mean number of observers per civilization (taking the same limit construction) is 200 billion trillion observers.
We’ll also suppose that both H3.1 and H3.2 imply the existence of self-aware observers who have reasoned their way to UDT (call such a being a “UDT agent”), and slightly simplify the evidence sets E0 and E1:
E0. A UDT agent is aware of its own existence, but doesn’t yet know anything much else about the world; it certainly doesn’t yet know how many observers there have been in its own civilization.
(If you’re reading Stuart Armstrong’s paper, this corresponds to the “ignorant rational baby stage”).
E1. A UDT agent discovers that it is among the first quadrillion (thousand trillion) observers of its civilization.
Again, we define P[X|Ei] as the amount, in utils, that the agent will pay for a betting slip which pays off 1 utile in the event that hypothesis X is true. You are proposing the following (no updating):
P[H3.1|E0] = P[H3.2|E0] = 1⁄2, P[H3.1|E1] = P[H3.2|E1] = 1⁄2
Now, what does the agent assign to P[E1 | E0]? Imagine that the agent is facing a bet as follows. “Omega is about to tell you how many observers there have been before you in your civilization. This betting slip will pay 1 utile if that number is less than 1 quadrillion.”
It seems clear that P[H3.1 & E1 | E0] is very close to P[H3.1 | E0]. If H3.1 is true, then essentially all observers will learn that their observer-rank is less than one quadrillion (forget about the tiny tail probability for now).
It also seems clear that P[H3.2 & E1 | E0] is very close to zero, since if H3.2 is true, only a minuscule fraction of observers will learn that their observer-rank is less than one quadrillion (again, forget the tiny tail probability).
So to good approximation, we have betting probabilities P[E1 & H3.1 | E0] = P[E1 | E0] = P[H3.1 | E0] = 1⁄2 and P[~E1 | E0] = 1⁄2. Thus the agent should pay 1⁄2 for a betting slip which pays out 1 utile in the event E1 & H3.1. The agent should also pay 1⁄4 for a betting slip which pays out 1⁄2 utile in the event ~E1.
Now, suppose the agent learns E1. According to your proposal, the agent still has P[H3.1 | E1] = 1⁄2, so the agent should now be prepared to sell the betting slip for E1 & H3.1 at the same price she paid for it, i.e. she sells it again for 1⁄2 a utile.
Oops: the agent is now guaranteed to lose 1⁄4 utile, regardless of whether she learns E1 or ~E1. If she learns ~E1, then she has paid 3⁄4 for her two bets and wins the ~E1 bet, for a net loss of 1⁄4. If she learns E1, then she loses the bet on ~E1, and her bet on H3.1 & E1 cancels out, since she has bought and sold the slip at the same price.
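Here is a minimal sketch (mine, not part of the original thread) that simply checks the arithmetic of the Dutch book above; the prices and payoffs are exactly the ones described:

```python
# Checking the Dutch-book arithmetic for the no-updating strategy described above.

def net_outcome(learns_E1: bool) -> float:
    """Net utility for an agent who buys both slips at the E0 prices and,
    if she learns E1, sells the (E1 & H3.1) slip back at the unchanged price."""
    total = 0.0
    total -= 0.5    # buys the slip paying 1 utile on (E1 & H3.1)
    total -= 0.25   # buys the slip paying 1/2 utile on ~E1
    if learns_E1:
        total += 0.5    # sells the (E1 & H3.1) slip back for 1/2, since P[H3.1|E1] is still 1/2
    else:
        total += 0.5    # the ~E1 slip pays out 1/2 utile; the (E1 & H3.1) slip pays nothing
    return total

for case in (True, False):
    print("learns E1:" if case else "learns ~E1:", net_outcome(case))
# Both branches print -0.25: a guaranteed loss of 1/4 utile either way.
```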
Incidentally, Stuart Armstrong discusses this issue in connection with the “Adam and Eve” problem, though he doesn’t give an explicit example of the Dutch book (I had to construct one). The resolution Stuart proposes is that an agent in the E0 (“ignorant rational baby”) stage should precommit not to sell the betting slip again if she learns E1 (or, strictly, not to sell it again unless the sale price is very close to 1 utile). Since we are discussing UDT agents, no such precommitment is needed; the agent will do whatever she should have precommitted to do.
In practice this means that on learning E1, the agent follows the commitment and sets her betting probability for H3.1 very close to 1. This is, of course, a Doomsday shift.
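(How close to 1? As a back-of-envelope figure of my own, assuming for illustration that every civilization has exactly the mean number of observers, so that under H3.2 only about $10^{15}/(2\times10^{23}) = 5\times10^{-9}$ of observers find themselves in E1, the shifted betting probability is roughly

$$P[\mathrm{H3.1}\mid\mathrm{E1}] \;\approx\; \frac{\tfrac12\cdot 1}{\tfrac12\cdot 1 + \tfrac12\cdot 5\times10^{-9}} \;\approx\; 1 - 5\times10^{-9},$$

i.e. near-certainty in the smaller hypothesis.)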
You’re still, y’know, updating. Consider each of these bets from the updateless perspective, as strategies of being willing to accept bets of that kind.
The first bet is to “pay 1⁄2 for a betting slip which pays out 1 utile in the event E1 & H3.1”. Adopting the strategy of accepting this kind of bet would result in 1⁄2 util for an infinite number of beings and −1/2 util for an infinite number of beings if H3.1 is true and would result in −1/2 util for an infinite number of beings if H3.2 is true.
If we could aggregate the utilities here, we could just take an expectation by weighting them according to the prior (equally in this case) and accept the bet iff the result was positive. This would give consistent, un-Dutch-bookable results: since expectations sum, the sum of three bets each with nonnegative expectation must itself have a nonnegative expectation. Unfortunately, we can’t do this: unless you come up with some weird aggregation method other than total or average for the utility function (though my language above basically presumed totalling), the utility is a divergent series, and reordering a divergent series changes its sum. There is no correct ordering of the people in this scenario, so there is no correct value of the expected utility.
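A toy illustration of that reordering point (my own sketch, not part of the original argument): the same infinite collection of +1/2 and −1/2 payoffs, enumerated in two different orders, has partial sums that behave completely differently.

```python
# The same infinite collection of +1/2 and -1/2 payoffs, enumerated in two orders.

def alternating(n):
    """Order A: +1/2, -1/2, +1/2, -1/2, ...  Partial sums stay bounded near 0."""
    return sum(0.5 if i % 2 == 0 else -0.5 for i in range(n))

def two_up_one_down(n):
    """Order B: +1/2, +1/2, -1/2, repeated.  Partial sums grow without bound."""
    return sum(0.5 if i % 3 != 2 else -0.5 for i in range(n))

for n in (10, 1_000, 100_000):
    print(n, alternating(n), two_up_one_down(n))
# Both orders draw on infinitely many gains and infinitely many losses of the same size,
# yet one sequence of partial sums stays bounded and the other diverges.
```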
Moving on to the second bet, “pay 1⁄4 for a betting slip which pays out 1⁄2 utile in the event ~E1”, we see that the strategy of accepting it gives a net −1/4 to infinitely many beings (those who learn E1) and a net +1/4 to infinitely many beings (those who learn ~E1), under either hypothesis, although the proportions differ enormously. Again, we can’t do the sums.
Finally, the third bet, rephrased as a component of a strategy, would be to sell the betting slip from the first bet back for 1⁄2 util if E1 is observed. Presumably this opportunity is not offered if ~E1, so there is no need for the agent to decide what to do in that case. This gives −1/2 infinitely many times if H3.1 and +1/2 infinitely many times if H3.2. The value of 1⁄2·(−1⁄2·∞) + 1⁄2·(+1⁄2·∞) is, of course, indeterminate, so we can neither recommend accepting nor declining this bet without a better treatment of infinities.
I’m being careful to define the expressions P[X|Ei] as the amount paid for a betting slip on X in an evidential state Ei. This is NOT the same as the agent’s credence in hypothesis X. I agree with you that credences don’t update in UDT (that’s sort of the point). However, I’m arguing that betting payments must change (“update” if you like) between the two evidential states, or else the agent will get Dutch booked.
You describe your strategy as having an infinite gain or loss in each case, so you don’t know whether it is correct (indeed, you don’t know which strategy is correct, for the same reason). However, earlier in the thread I explained that this problem only arises if an agent’s utility depends on bets won or lost by other agents. If instead each agent has a private utility function (with no addition or subtraction for other agents’ bets, only her own), then this “adding infinities” problem doesn’t arise. Under your proposed strategy (same betting payments in E0 and E1), each individual agent gets Dutch-booked and makes a guaranteed loss of 1⁄4 utile, so it can’t be the optimal strategy.
What is optimal then? In the private utility case (utility is a function only of the agent’s own bets), the optimal strategy looks to be to commit to SSA betting odds (which in the simplified example means an evens bet in the state E0, and a Doomsday betting shift in the state E1).
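Here is a rough sketch (my own, with illustrative assumptions not in the original setup) of why SSA odds drop out in the private-utility case. Suppose every civilization has exactly the mean number of observers with ranks uniformly distributed, so a given observer finds itself in E1 with probability ≈ 1 under H3.1 and ≈ 5×10⁻⁹ under H3.2. Evaluating the strategy “in state E1, buy a slip on H3.1 at price p” by its per-agent expected gain under the prior:

```python
# Per-agent expected gain of the strategy "in state E1, buy a slip on H3.1 at price p",
# evaluated against the 1/2 : 1/2 prior.  Illustrative assumptions: every civilization has
# exactly the mean number of observers, and observer ranks are uniform, so the probability
# that an agent is in E1 is ~1 under H3.1 and ~5e-9 under H3.2.

prior = {"H3.1": 0.5, "H3.2": 0.5}
p_E1  = {"H3.1": 1.0, "H3.2": 1e15 / 2e23}   # ~5e-9: first quadrillion out of 200 billion trillion
pays  = {"H3.1": 1.0, "H3.2": 0.0}           # the slip pays 1 utile iff H3.1 is true

def expected_gain(price):
    """Expected net gain for one agent following the strategy, before observing anything."""
    return sum(prior[h] * p_E1[h] * (pays[h] - price) for h in prior)

print(expected_gain(0.5))        # positive: buying at 1/2 in state E1 is fine
print(expected_gain(0.9999999))  # still positive: she would not sell back below ~1 - 5e-9

# The highest price with nonnegative expected gain:
break_even = prior["H3.1"] * p_E1["H3.1"] / sum(prior[h] * p_E1[h] for h in prior)
print(break_even)   # ~0.999999995, i.e. the Doomsday-shifted (SSA) betting odds, not 1/2
```

The break-even price is just the SSA posterior for H3.1 given E1, which is why the committed strategy ends up quoting those odds.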
If the agent’s utility function is an average over all bets actually made in a world (average utilitarianism), then, provided we define the average in a sensible way, such as taking the mean (betting gain − betting loss) over N Hubble volumes and then taking the limit as N goes to infinity, the optimal strategy is again SSA betting odds.
If the agent’s utility function is a sum over all bets made in a world, then it is not well-defined, for the reasons you discuss: we can’t decide how to bet without a properly-defined utility function. One approach to making it well-defined may be to use non-standard arithmetic (or surreals), but I haven’t worked that through. Another approach is to sum bets only within N Hubble volumes of the agent (assume the agent doesn’t really care about far, far away bets), and then only later take the limit as N tends to infinity. This leads to SIA betting odds.
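To make that cutoff construction concrete, here is a sketch of my own, assuming for illustration that civilizations are equally dense per Hubble volume under both hypotheses (so H3.2 packs about 10¹² times more observers into the same N volumes). Evaluating, in state E0, the strategy “buy a slip that pays 1 utile if H3.2 at price p” by its expected total gain within the cutoff:

```python
# Expected *total* betting gain within a cutoff of N Hubble volumes, for the E0 strategy
# "buy a slip paying 1 utile if H3.2 at price p".  Illustrative assumptions: equal
# civilization density under both hypotheses, with 2e11 (H3.1) vs 2e23 (H3.2) observers
# per civilization, every one of whom follows the same strategy.

civs_in_cutoff = 1000.0                        # arbitrary; it cancels out of the break-even price
observers = {"H3.1": 2e11, "H3.2": 2e23}       # mean observers per civilization
prior = {"H3.1": 0.5, "H3.2": 0.5}
pays  = {"H3.1": 0.0, "H3.2": 1.0}             # slip pays 1 utile iff H3.2 is true

def expected_total_gain(price):
    return sum(prior[h] * civs_in_cutoff * observers[h] * (pays[h] - price) for h in prior)

print(expected_total_gain(0.5))   # hugely positive: at even odds this aggregation wants the H3.2 bet

break_even = sum(prior[h] * observers[h] * pays[h] for h in prior) / \
             sum(prior[h] * observers[h] for h in prior)
print(break_even)   # ~0.999999999999: in state E0 these odds favour H3.2 by roughly 1e12 to 1
```

Weighting each hypothesis by how many bets it hosts within the cutoff is exactly the SIA-style weighting, which is why this aggregation bets heavily on H3.2 at the E0 stage.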
Until recently, I thought that SIA odds meant betting heavily on H3.2 in the state E0, and then reverting to an evens bet in the state E1 (so it counters the Doomsday argument). However, the more recent analysis of SIA indicates that there is still a Doomsday shift because of “great filter” arguments (a variant of Fermi’s paradox), so the betting odds in state E1 should still be weighted towards H3.1.
Basically it doesn’t look good, since every combination of utility function, whether it leads to SSA or to SIA betting odds, now produces a Doomsday shift. The only remaining let-out I’ve been considering is a specially-constructed reference class (as used in SSA), but it looks like that won’t work either: in Armstrong’s analysis, we don’t get to define the reference class arbitrarily, since it consists of all linked decisions (in the UDT case, all decisions made by any agents anywhere applying UDT).