A probability is an attribute of your model of the world, not an attribute of the world itself. In this case, we’re looking for a world model which cares about the outcome of this particular bet. Specifically, on Sunday, before the experiment starts, the participant wants to choose a policy which produces a good outcome on Wednesday.
On Sunday, the participant determines that the probabilities are:
0.5 they will be woken on Monday and offered the bet (value -$300 if they take it)
0.5 they will be woken on Monday and offered the bet, and then woken again on Tuesday and offered the bet (value +$200 if they take it)
If this were the entire situation, and the participant had to have a single deterministic policy, they should obviously not take the bet (0.5 to accept the bet once and lose $300, 0.5 to accept the bet twice and win $200, so total EV is -$50).
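As a quick check of that arithmetic, here is a minimal sketch. The constant and function names are just illustrative, and the payoffs are the ones listed above (lose $300 in the one-waking branch, win $200 in the two-waking branch).

```python
# Branch probabilities and payoffs as listed above (names are illustrative).
P_ONE_WAKE = 0.5
P_TWO_WAKES = 0.5
LOSS_ONE_WAKE = -300   # bet accepted in the one-waking branch
WIN_TWO_WAKES = 200    # bet accepted in the two-waking branch (counts once)


def ev_deterministic(take_bet: bool) -> float:
    """Expected value of a single fixed policy: always take or never take."""
    if not take_bet:
        return 0.0
    return P_ONE_WAKE * LOSS_ONE_WAKE + P_TWO_WAKES * WIN_TWO_WAKES


print(ev_deterministic(True))   # -50.0
print(ev_deterministic(False))  # 0.0
```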
If the participant had access to a source of randomness which would return a random number r between 0 and 1 on demand, with the results of that randomness independent between times of getting a random number, they could do better, by saying that they will take the bet if r <= p on that roll, for some p they determined on Sunday. Now there are 6 cases:
0.5*(1-p) they will be woken on Monday and see r > p (don’t take bet, $0)
0.5*p they will be woken on Monday and see r <= p (take bet, -$300)
0.5*(1-p)*(1-p) they will be woken on Monday and see r > p, and then woken again on Tuesday and see r > p again (don’t take bet either time, $0)
0.5*(1-p)*p they will be woken on Monday and see r > p, and then woken again on Tuesday and see r <= p (take bet on Tuesday, +$200)
0.5*p*(1-p) they will be woken on Monday and see r <= p, and then woken again on Tuesday and see r > p (take bet on Monday, +$200)
0.5*p*p they will be woken on Monday and see r <= p, and then woken again on Tuesday and see r <= p again (take bet on both Monday and Tuesday, only one of which counts, +$200)
The degenerate case of p=1 is the “always take the bet” case above, EV=-$50. p=0 is “never take the bet”, EV=$0. But in between those is a curve.
It turns out the maximal EV is obtained when p=0.25, which gives an EV of +$6.25.
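The curve can be reproduced with a rough sketch like the one below. It assumes each branch has total probability 0.5, that the two rolls are independent, and that in the two-waking branch the bet pays +$200 once if it is accepted at least once. Under those assumptions the EV simplifies to 50p - 100p^2. The function name and the grid resolution are arbitrary choices for illustration.

```python
def ev_randomized(p: float) -> float:
    """EV when the bet is accepted independently with probability p at each waking."""
    # One waking (probability 0.5): accepting loses $300.
    one_wake = 0.5 * p * (-300)
    # Two wakings (probability 0.5): +$200 if the bet is accepted at least once.
    p_at_least_once = 1 - (1 - p) ** 2
    two_wakes = 0.5 * p_at_least_once * 200
    return one_wake + two_wakes


# Coarse grid search over p in [0, 1]; the step size of 0.001 is arbitrary.
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=ev_randomized)
print(best_p, ev_randomized(best_p))  # 0.25 6.25
```

Setting the derivative 50 - 200p to zero gives the same p=0.25 without the grid search.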
If we want to do better than that, we need a way for the participant to ensure that their decision on Tuesday is not only uncorrelated with their decision on Monday, but anticorrelated with their decision on Monday. The only possible source of anticorrelation would be the color of the walls. So the participant can now make a decision on Sunday “with what probability do I take the bet if the walls are red” (p_r) and separately “with what probability do I take the bet if the walls are blue” (p_b). There are now 12 cases to work through (would be 20, but “two wakes, room red Mon, room red Tue” is not possible, nor is “two wakes, room blue Mon, room blue Tue”).
Computing the EV for every possible p_r and p_b, we see that there are two maxima: one where the participant always bets if in a red room and never bets in a blue room, and the other where the participant always bets if in a blue room and never bets in a red room. Both of these maxima have an EV of +$25.
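Rather than enumerating all 12 cases by hand, the same result can be checked with a short sketch. It rests on two assumptions that are not spelled out above: in the one-waking branch the room is equally likely to be red or blue, and in the two-waking branch the two rooms are always one red and one blue (in either order). The function name and the 0.05 grid step are illustrative only.

```python
def ev_color_policy(p_r: float, p_b: float) -> float:
    """EV when the bet is taken with probability p_r in a red room, p_b in a blue room."""
    # One waking (probability 0.5): room ASSUMED red or blue with equal chance.
    one_wake = 0.5 * (0.5 * p_r + 0.5 * p_b) * (-300)
    # Two wakings (probability 0.5): exactly one red and one blue room, so the
    # bet is accepted at least once unless it is declined in both colors.
    p_at_least_once = 1 - (1 - p_r) * (1 - p_b)
    two_wakes = 0.5 * p_at_least_once * 200
    return one_wake + two_wakes


# Evaluate the EV on a grid of (p_r, p_b) pairs.
steps = [i / 20 for i in range(21)]
results = {(p_r, p_b): ev_color_policy(p_r, p_b) for p_r in steps for p_b in steps}
best_ev = max(results.values())
maxima = sorted(k for k, v in results.items() if abs(v - best_ev) < 1e-9)
print(best_ev, maxima)  # 25.0 [(0.0, 1.0), (1.0, 0.0)]
```

Under these assumptions the EV works out to 25*(p_r + p_b) - 100*p_r*p_b, which also recovers the earlier numbers: p_r = p_b = 0.25 gives +$6.25 and p_r = p_b = 1 gives -$50.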
… what was the philosophical question, again?