So, I’m still working on this in my plodding, newbie-at-probability-math fashion.
What I took away from my exchanges with AlephNeil is that I get the clearest picture if I think in terms of a joint probability distribution, and attempt to justify mathematically each step of my building the table, as well as the operations of conditioning and marginalizing.
In the original Sleeping Beauty problem, we have three variables: x is how the coin came up {heads, tails}, y is the day of the week {Monday, Tuesday}, and z is whether I am asked for my credence (i.e. woken) {wake, sleep}.
P(x,y,z)=P(x)P(y|x)P(z|x,y), and unlike in the “revival” case, x and y aren’t clearly independent. In fact the answer very much seems to hinge on what we take to be the probability of it being Tuesday, given that the coin came up heads.
The relevant possible outcomes are: (H,M,W) (H,T,W) (T,M,W) (T,T,W) (H,M,S) (H,T,S) (T,M,S) (T,T,S) - eight in all.
Conditioning on z=W consists of deleting the part of the table that has z=S, summing up all the remaining values, and renormalizing by dividing every remaining cell in the table by that total.
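Written as a formula, the delete-sum-renormalize recipe followed by marginalizing over the day looks like this. It is only a restatement of the paragraph above; the primed symbols x' and y' are dummy summation variables introduced here for illustration.

```latex
P(x \mid z = W)
  = \frac{\sum_{y} P(x, y, W)}
         {\sum_{x'} \sum_{y'} P(x', y', W)}
```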
The rules for filling the table are: the values must add up to 1; the “heads” and “tails” branches must receive equal probability mass from P(x); and P(z|x,y) must reflect the experimental rules. So we must have the following:
P(H,M,W) - see below
P(H,T,W)=0
P(T,M,W)=1/4
P(T,T,W)=1/4
P(H,M,S)=0
P(H,T,S) - see below
P(T,M,S)=0
P(T,T,S)=0
The ambiguity seems to arise in allocating probability mass to the outcomes: “the coin comes up heads; it is Monday; I get woken up”, and “the coin comes up heads; it is Tuesday; I do not get woken up”. That is, I’m not sure what the correct conditional distribution P(y|x) should be.
The 1⁄2 answer corresponds to allocating all of the available 1⁄2 probability mass to the first of these outcomes in the joint table, saying P(y=M|x=H)=1 and P(y=T|x=H)=0. Or verbally, “it’s certain that I get woken up on Monday if the coin comes up heads, and after that the experiment is over”. The “not woken up” half of the table receives no probability mass at all.
The 1⁄3 answer corresponds to distributing that probability mass among the two outcomes, saying P(y=M|x)=P(y=T|x)=1/2. Verbally: “however the coin comes up, it could be either Monday or Tuesday”. Here 1⁄4 of the total probability mass is in the “not woken up” half of the table and gets deleted when we condition on being woken.
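To check the arithmetic for both allocations, here is a minimal sketch in Python. The function names (`joint_table`, `p_heads_given_awake`) and the dictionary layout are my own illustrative choices, not anything from the spreadsheet; the only free parameter is P(y=M|x=H), which is exactly the quantity in dispute.

```python
from fractions import Fraction

def joint_table(p_monday_given_heads):
    """Joint P(x, y, z) for the two-day problem, keyed by (coin, day, state).

    p_monday_given_heads is the disputed P(y=M | x=H); the tails cells are
    fixed at 1/4 each, as in the table above.
    """
    half = Fraction(1, 2)
    p_mon = Fraction(p_monday_given_heads)
    return {
        ("H", "Mon", "W"): half * p_mon,        # heads, Monday, woken
        ("H", "Tue", "W"): Fraction(0),
        ("T", "Mon", "W"): Fraction(1, 4),
        ("T", "Tue", "W"): Fraction(1, 4),
        ("H", "Mon", "S"): Fraction(0),
        ("H", "Tue", "S"): half * (1 - p_mon),  # heads, Tuesday, asleep
        ("T", "Mon", "S"): Fraction(0),
        ("T", "Tue", "S"): Fraction(0),
    }

def p_heads_given_awake(table):
    """Condition on z=W (drop the z=S cells, renormalize), then sum out the day."""
    awake = {k: v for k, v in table.items() if k[2] == "W"}
    total = sum(awake.values())
    return sum(v for k, v in awake.items() if k[0] == "H") / total

halfer = joint_table(1)                # P(y=M | x=H) = 1
thirder = joint_table(Fraction(1, 2))  # P(y=M | x=H) = 1/2

print(p_heads_given_awake(halfer))   # 1/2
print(p_heads_given_awake(thirder))  # 1/3
```

Both tables sum to 1 and give heads and tails equal mass; they differ only in how much of the heads mass sits in the (H,T,S) cell that conditioning deletes.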
(ETA: Where does the amnesia appear in this formalization? It doesn’t, but neither does it need to. Its only practical consequence is to outlaw conditioning on the day, so working out the distribution P(x|z) conforms to the amnesia.)
OK, this seems quite helpful.
I think the question we now have to ask to resolve the remaining confusion is—what, exactly, is it that Beauty is uncertain about, and at what time?
The variables we are considering only seem to make sense if Beauty has just woken up as part of the experiment. That is, assume x means “the coin came up heads or tails”, y means “it is Monday or Tuesday”, and z means “I am awake or asleep”; i.e., we’re dealing with uncertainty about facts that are already fixed, just unknown. On that reading, the variables do not make sense outside that context.
Using that interpretation, then, and sticking to that context, we get the answer of 1⁄2: if Beauty has just been woken up, she cannot allocate any probability mass to the possibility that she is asleep.
What other interpretations could there be? Perhaps the coin has not yet been flipped, and x is “the coin will come up heads (tails)”, y is “it will be Monday (Tuesday) when I wake up”, z is “I will be awake (asleep) when I wake up” (!). Of course, if the coin has not yet been flipped, I think we can agree 1⁄2 has to be the right answer. (Which then leads to the argument that it has to be 1⁄2 as she hasn’t gained any information, but I guess that’s been gone over before.) But the problem is that this y doesn’t seem well-defined, as she might be woken up more than once. (Hm, this is sounding familiar as well...) We could perhaps introduce separate variables for being woken up on each day; from the pre-flip point of view, that makes more sense. But it still gets you an answer of 1⁄2.
This is all I can come up with; I’m not seeing what other interpretations there could be. Could someone explain just what ‘x’, ‘y’, and ‘z’ correspond to—if they do correspond to anything well-defined rather than having to be thrown out—in the interpretations that get you 1/3? I don’t see any way for the probabilities to represent her uncertainty at the time of waking, while still having her assign nonzero probability to the possibility that she’s asleep.
“At what time” doesn’t matter in this formalism. You can be uncertain about future events or about past events; all that matters is how you update your uncertainty upon receiving new information.
So a triplet (x,y,z) represents, in the abstract, a conceivable configuration of the component uncertainties in the experimental setup. The coin could have come up heads or tails; it could be Monday or Tuesday; Beauty can be woken up on that day, or left asleep.
The joint probability P(x,y,z) is the plausibility we assign—in a timeless manner—to the corresponding propositions. Strictly speaking, it should be P(x,y,z|B) where B is our background information about the experiment: the rules, the fact that the coin is unbiased (or not known to be biased), and so on.
Our background information directs how we allocate probability mass to the various points in the sample space: P(T,T,S) corresponds to “the coin comes up tails, the day is Tuesday, Beauty is asleep”. The rules of the experiment require that this be zero.
On the other hand, P(H,T,S) corresponds to “the coin comes up heads, the day is Tuesday, Beauty is asleep”, and this can be non-zero.
When you learn (“condition on”) some new information, the probability distribution is altered: you only keep the points which correspond to this particular variable having the value(s) you learned, and you renormalize so that the total probability is 1. So, on learning “heads” you keep only the points having x=H. On learning what day it is you keep only the points having that value for y.
When Beauty wakes up, she learns the value of z, so she can condition on z. That means she throws away the part of the joint distribution where she was supposed to be asleep. If that part of the joint distribution did contain some probability mass (as I’ve argued above it can), then that can make P(x|z=W) something other than 1⁄2.
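The keep-and-renormalize operation can be sketched generically in Python. This assumes the table that puts 1/4 on (H,T,S), i.e. P(y|x)=1/2 for both days; the helper names `condition` and `marginal` are hypothetical, chosen just for this illustration.

```python
from fractions import Fraction

# Joint table with mass 1/4 on (heads, Tuesday, asleep); zero cells omitted.
# Keys are (coin, day, state). This allocation is an assumption, not a given.
joint = {
    ("H", "Mon", "W"): Fraction(1, 4),
    ("H", "Tue", "S"): Fraction(1, 4),
    ("T", "Mon", "W"): Fraction(1, 4),
    ("T", "Tue", "W"): Fraction(1, 4),
}

def condition(table, index, value):
    """Keep only points whose variable at position `index` equals `value`,
    then renormalize so the kept mass sums to 1."""
    kept = {k: v for k, v in table.items() if k[index] == value}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("cannot condition on a zero-probability event")
    return {k: v / total for k, v in kept.items()}

def marginal(table, index):
    """Sum out every variable except the one at position `index`."""
    out = {}
    for k, v in table.items():
        out[k[index]] = out.get(k[index], Fraction(0)) + v
    return out

coin_before = marginal(joint, 0)                      # before learning anything
coin_awake = marginal(condition(joint, 2, "W"), 0)    # after learning z=W
coin_monday = marginal(condition(joint, 1, "Mon"), 0) # after learning y=Monday

print(coin_before["H"], coin_awake["H"], coin_monday["H"])  # 1/2 1/3 1/2
```

Learning z=W moves the heads probability only because the discarded z=S region held mass; with all of the heads mass on (H,M,W), the same call would return 1/2.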
Hm. Should “S” be representing “Beauty is asleep or the experiment is over”? Seeing as how the experiment ends after one day if heads comes up. But then, we can just modify the problem to say she’s put back to sleep for the rest of Tuesday in the case of heads; that shouldn’t change anything.
It seems to me that if we make the experiment last three days instead of two, that ambiguity goes away: then it becomes clear that Beauty must assign non-zero probability mass to (H,T,S). (Or does it?)
However, that means I’d have to change my mind once again, and decide that the correct answer is in fact 1⁄3.
Here is a Google spreadsheet showing my reasoning. Any feedback welcome.
Can you explain what the three day version means in English, I’m having a little trouble parsing the spreadsheet.
See here and its grandparent.
The three day version goes: “The rules are explained to Beauty on Sunday and she is put to sleep, then a coin is flipped. If it comes up heads, Beauty is awakened on Monday and sleeps through Tuesday and Wednesday. If it comes up tails, Beauty is awakened on Monday, Tuesday and Wednesday. On all awakenings (with the previous day’s memories erased by the sleeping drug) she is asked for her credence in Heads.”
This differs from the original which says “the experiment ends on Monday if the coin comes up heads”. But Beauty would have the same uncertainty if you decided, in the original version, to wake Beauty on Tuesday in the event of heads, rather than Monday.
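For anyone who wants to check numbers without the spreadsheet, here is a rough sketch of the same table-based calculation for an n-day experiment. It does not reproduce the spreadsheet. It assumes every day is equally plausible whichever way the coin lands (P(y|x)=1/n), that heads wakes Beauty on the first day only, and that tails wakes her on every day. The function name is made up for this example.

```python
from fractions import Fraction

def p_heads_given_awake(n_days):
    """Heads share of the 'awake' mass in an n-day experiment.

    Assumes a fair coin and a uniform P(day | coin) = 1/n_days, so every
    (coin, day) pair carries the same mass before the wake/sleep rule applies.
    """
    cell = Fraction(1, 2) / n_days   # P(coin) * P(day | coin)
    heads_awake = cell               # heads: awake on the first day only
    tails_awake = cell * n_days      # tails: awake on every day
    return heads_awake / (heads_awake + tails_awake)

print(p_heads_given_awake(2))  # 1/3
print(p_heads_given_awake(3))  # 1/4
```

Under these assumptions the two-day case gives 1/3, and the three-day wording above, with three tails awakenings, gives 1/4; in both cases the heads branch clearly has non-zero mass on its "asleep" days.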
BTW the Google spreadsheet has a chat area, if you’d like to discuss this live.