It comes from taking a ratio of expected counts. First, a motivating example.
Suppose people can fall into one of three categories. For example, we might create a categorical age variable, catage, where catage is 1, 2, or 3 according to which of three age ranges the person falls in (say, under 30, 30 to 50, and over 50; the exact cutoffs don't matter here).
Suppose we randomly select N people from the population. Let n1 be the number of people with catage=1, with n2 and n3 defined similarly. Given the sample size N, the random variables n1, n2 and n3 follow a multinomial distribution, with parameters (probabilities) p1, p2 and p3, respectively, where p1+p2+p3=1 and n1+n2+n3=N (i.e., 2 degrees of freedom).
The probability that catage=1, p1, is lim N->infinity n1/(n1+n2+n3).
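As a quick sketch of that limit (the category probabilities and seed below are arbitrary placeholders, not anything from the example):

```python
import random

random.seed(0)
# Hypothetical category probabilities; any values with p1+p2+p3=1 work.
p1, p2, p3 = 0.3, 0.45, 0.25
N = 100_000

n1 = n2 = n3 = 0
for _ in range(N):
    u = random.random()
    if u < p1:
        n1 += 1          # catage = 1
    elif u < p1 + p2:
        n2 += 1          # catage = 2
    else:
        n3 += 1          # catage = 3

print(n1 / (n1 + n2 + n3))  # approaches p1 = 0.3 as N grows
```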
That concept applied to Sleeping Beauty
With the sleeping beauty problem, what we see is something similar. Imagine we ran the experiment N times. Let n1 be the number of times it was Monday&heads, n2 the number of times it was Monday&tails, and n3 the number of times it was Tuesday&tails.
The 1⁄3 solution makes the assumption that the probability of heads given an awakening is:
lim N-> infinity n1/(n1+n2+n3)
But, we have a problem here. N does not equal n1+n2+n3, it is equal to n1+n2. Also, the random variables n2 and n3 are identical. Thus, we could substitute:
lim N-> infinity n1/(n1+2*n2)
There are really just two random variables (n1 and n2) and 1 degree of freedom. In that case, we can think of n1 as coming from a Binomial distribution with sample size N=n1+n2 and probability p1. The probability of heads&Monday is then
lim N-> infinity n1/(n1+n2)=1/2.
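A quick simulation of N runs makes the distinction concrete (N and the seed are arbitrary):

```python
import random

random.seed(1)
N = 100_000  # number of independent runs of the experiment (coin tosses)
n1 = n2 = n3 = 0  # Monday&heads, Monday&tails, Tuesday&tails

for _ in range(N):
    if random.random() < 0.5:  # heads: a single Monday awakening
        n1 += 1
    else:                      # tails: Monday and Tuesday awakenings
        n2 += 1
        n3 += 1

print(n2 == n3)             # True: the two counts never differ
print(n1 / (n1 + n2))       # ~1/2: heads per coin toss
print(n1 / (n1 + n2 + n3))  # ~1/3: heads-awakenings per awakening
```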
Another example
If you don’t believe the above reasoning, consider another example.
Suppose half of the population are male and the other half are female. Also, suppose that only females have ovaries.
Suppose I record 3 variables: indicator that the person is male, indicator that the person is female, and indicator that the person has ovaries.
I sample N people, and get counts for those 3 variables of n1, n2 and n3. Given that we recorded a variable for a randomly selected person, is the probability that they are male equal to
lim N->infinity n1/(n1+n2+n3) ?
Of course not. It’s lim N->infinity n1/(n1+n2).
Even though n2 and n3 are counts of something different, in a sense, they are really the same variable. Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.
When you do treat them as separate variables, you end up with nonsense (such as probabilities greater than 1).
The 1⁄3 solution makes the assumption that the probability of heads given an awakening is:
lim N-> infinity n1/(n1+n2+n3)
I’d quibble about calling it an assumption. The 1⁄3 solution notes that
this is the ratio of observations upon awakening of heads to the total
number of observations, which is one of the problematic facts about the
experimental setup. The 1⁄3 solution assumes that this is relevant to what
we should mean by “credence”, and makes an argument that this is a
justification for the claim that Sleeping Beauty’s credence should be 1⁄3.
Your argument is, I take it, that these counts of observations are
irrelevant, or at best biased. Something else should be counted, or should
be counted differently. The disagreement seems to center on the
denominator; it should count not awakenings, but coin-tosses. Then there is
a difference in the definition of the relevant events and the probabilities
that get calculated from them.
Thirders: An event is an awakening.
The question asks about # awakenings with heads / total awakenings.
This ratio is an estimate of a fraction that can be used to predict frequencies of something of interest.
Halfers: An event is a coin-toss.
The question asks about # tosses with heads / total tosses.
This ratio is an estimate of a fraction which is universally agreed to be a probability, and can be used to predict frequencies of something of interest.
Did I get that right? Is this a fair description?
I think a key difference between halfers and thirders is that for thirders,
the occurrence of an awakening constitutes evidence of the current state of
the system that’s being asked about—whether the coin shows heads or tails,
because the frequency with which the state of the system is asked about (or,
equivalently, an observation is made) is influenced by the current state of
the system. To ward off certain objections: it is of no consequence whether
this influence is deterministic, probabilistic, or mixed in nature; the mere
fact that it exists can and should be exploited. I don’t think there’s
disagreement that it exists, but there is over how it’s relevant.
Halfers deny that any new evidence becomes available on awakening, because
the operation of the process is completely known ahead of time.
(Alternatively, if any new evidence could be said to become available, it
cannot be exploited.) From what I can tell, and my understanding is surely
imperfect, there is some kind of cognitive dissonance about what kinds of
things can constitute evidence in some epistemological theory, such that
drawing a distinction between the actual occurrence of an event and the
knowledge that at least one such event will surely occur is illegitimate for
halfers. Is this a fair description?
Suppose half of the population are male and the other half are female. Also, suppose that only females have ovaries.
Suppose I record 3 variables: indicator that the person is male, indicator that the person is female, and indicator that the person has ovaries.
I sample N people, and get counts for those 3 variables of n1, n2 and n3.
Given that we recorded a variable for a randomly selected person, is the probability that they are male equal to
lim N->infinity n1/(n1+n2+n3) ?
That’s as may be, but it doesn’t help Sleeping Beauty in her quandary. If
you think this example helps to prove your point, I think it helps to prove
the opposite. Although she knows, in this variation, that a randomly
selected person will be tested, the random person selection process is not
accessible to her, only the opportunity to know that one of three possible
test results has been collected. She knows very well, given a randomly
selected person (resp. a coin toss), what the probability they are male is
(resp. the given coin toss came Heads). She isn’t being asked about that
conditional probability. (Or maybe you think she is? Please clarify.) To
follow your analogy, upon being awakened, she’s informed that a test result
has been collected from an unknown person, and now, given that a test
result has been collected, what are the chances it cames from a male?
Clearly the selection process for asking Sleeping Beauty questions is
biased. If bias had not been introduced by an extra awakening on Tuesday,
the problem would collapse into triviality. The puzzle asks how this
sampling bias should affect Sleeping Beauty’s calculations of what to answer
on awakening, if at all. One of the reasons for doing statistical analysis
of sampling schemes is to quantify how the mechanism that’s introducing bias
changes the expected values of observations. In the SB case, the biased
selection process is a mixture of random and deterministic mechanisms.
Untangling the random from the deterministic parts is difficult enough for
the participants in this discussion—they can’t even agree on a forking
path diagram! Untangling it for Sleeping Beauty while she’s in the
experiment is epistemically impossible. She has no basis whatsoever
inside the game for saying, “this one is randomly different from the last
one” versus “this one is deterministically identical to the last one,
therefore this one doesn’t count.”
The same considerations apply to the case of the cancer test. Let me
elaborate on your scenario to see if I understand it, and let me know if I’m
mischaracterizing the test protocol in any material way. There is a test
for a disease condition. Every person knows they have a 50% chance going in
of testing positive for the disease. We’ll stipulate that the repeatability
of the test is perfect, though in real life this is achieved only within
epsilon of certainty. (Btw, here’s where the continuity argument enters in:
how crucial is the assumption of absolute certainty versus near certainty?
What hinges on that?) In this protocol, if the initial test result is
positive, then the test is repeated k times (k=2 or 10, or whatever you deem
necessary), either with a new sample or from an aliquot of the original
sample, I don’t think it matters which. Here the repetition is because of the obstinacy
of the head of the test lab and their predilection for amnesia drugs; in
real life the reasons would be something like the very high cost in anguish
and/or money of a false positive, however unlikely. You, as a recorder of
test results, see a certain number of test samples come through the lab.
The identities of the samples are encrypted, so your epistemic state with
regard to any particular test result is identical to that for any other test
sample and its result.
So now the question comes down to this: upon any particular awakening, how
is the test subject’s epistemic state
significantly different from the lab tech’s epistemic state regarding any
particular test sample? There is a one-to-one correspondence between test
samples being evaluated and questions to the patient about their prognosis.
Should they give the same answer, or is there a reason why they should give
different answers? Just as with the patient, the lab tech knows that any
randomly chosen individual has a 50% chance of giving a positive test
result, but does she give the same answer to that question as to a
different question: given that she has a particular sample in her hands,
what is the probability that the person it belongs to will test positive?
She knows that she has k times as many samples in her lab that will test
positive as ones that will not, but she has no way of knowing whether the sample in
her hands is an initial sample or a replicate. It seems to me that halfers
might be claiming these two questions are the same question, while thirders
claim that they are different questions with different answers. Is this a
fair description? If not, please clarify.
Of course not. It’s lim N->infinity n1/(n1+n2).
Even though n2 and n3 are counts of something different, in a sense, they are really the same variable. Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.
What you say is true for any outside observers, and for Sleeping Beauty
after the experiment is over and the logbooks analyzed. But while Sleeping
Beauty is in the experiment, this option is simply not available to her.
The scenario has been carefully constructed to make this so, that’s what
makes it an interesting problem. The whole point of the amnesia drug in the
SB setup (or downloadable avatars, or forking universes, random passersby,
whatever) is that she has NO justification nor even a method for NOT
treating any of her awakenings as separate variables, because the
information that could allow her to do this is unavailable to her. By
construction—and this is the defining feature of Sleeping Beauty—all
Sleeping Beauty’s awakenings are epistemically indistinguishable. She has
no choice but to treat them all identically.
This phenomenon is a common occurrence in queueing systems where there’s a
very definite and well-understood difference between omniscient “outside
observers” and epistemically indistinguishable “arriving customers”, who can
have different values for the probability of observing the system in state
X, where the system is executing a well-defined random process, or even a
combination random-deterministic process.
Thanks for your detailed response. I’ll make a few comments now, and address more of it later (short on time).
Your argument is, I take it, that these counts of observations are irrelevant, or at best biased.
No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.
The disagreement seems to center on the denominator; it should count not awakenings, but coin-tosses.
No, I wouldn’t say that. My argument is that you should use probability laws to get the answer. If you take ratios of expected counts, well, you have to show that what you get is actually a probability.
I definitely disagree with your bullet points about what halfers think
I said: “Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.”
You disagreed, and said:
What you say is true for any outside observers, and for Sleeping Beauty after the experiment is over and the logbooks analyzed. But while Sleeping Beauty is in the experiment, this option is simply not available to her. The scenario has been carefully constructed to make this so, that’s what makes it an interesting problem. The whole point of the amnesia drug in the SB setup (or downloadable avatars, or forking universes, random passersby, whatever) is that she has NO justification nor even a method for NOT treating any of her awakenings as separate variables, because the information that could allow her to do this is unavailable to her. By construction—and this is the defining feature of Sleeping Beauty—all Sleeping Beauty’s awakenings are epistemically indistinguishable. She has no choice but to treat them all identically.
Hm, I think that is what I’m saying. She does have to treat them all identically. They are the same variable. That’s why she has to say the same thing on Monday and Tuesday. That’s why an awakening contains no new info. If she had new evidence at an awakening, she’d give different answers under heads and tails.
Your argument is, I take it, that these counts of observations are irrelevant, or at best biased.
No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.
I maintain that it is. I can guarantee you that it is. What obstacle do
you see to accepting that? You’ve made noises that this is because the
counts are correlated, but I haven’t seen any argument for this beyond bare
assertion. Do you want to claim it is impossible for some reason, or are
you just saying you haven’t seen a persuasive argument yet?
The disagreement seems to center on the denominator; it should count not awakenings, but coin-tosses.
No, I wouldn’t say that. My argument is that you should use probability laws to get the answer. If you take ratios of expected counts, well, you have to show that what you get is actually a probability.
What would you require for proof? If I could show you a Markov chain whose
behavior is isomorphic to iterated Sleeping Beauty, would that convince you?
I also am not sure what you mean when you say “use probability laws”. Is
there a failure to comport with the Kolmogorov axioms? Is there a problem
with the definition of the events? Do you mean Bayes’ Theorem, or some other
law(s)? I also am deeply suspicious of the phrase “get the answer”. I will
have no idea what this could mean until we can eliminate ambiguity about
what the question is (there seems to be a lot of that going around), or what
class of questions you’ll admit as legitimate.
defining feature of Sleeping Beauty—all Sleeping Beauty’s awakenings are epistemically indistinguishable. She has no choice but to treat them all identically.
Hm, I think that is what I’m saying. She does have to treat them all identically. They are the same variable. That’s why she has to say the same thing on Monday and Tuesday.
Up to this point, I see we are actually in strenuous agreement on this
aspect, so I can stop belaboring it.
That’s why an awakening contains no new info.
If she had new evidence at an awakening, she’d give different answers under heads and tails.
I don’t mean to claim that as soon as Beauty awakes, new evidence comes to
light that she can add to her store of bits in additive fashion, and thereby
update her credence from 1⁄2 to 1⁄3 along the way. If this is the only kind
of evidence that your theory of Bayesian updating will acknowledge, then it
is too restrictive. Since Beauty is apprised of all the relevant details of
the experimental process on Sunday evening, she can (and should) use the
fact that the predicted frequency of awakenings into a reset epistemic state
is dependent on the state of the coin toss to change the credence she
reports on such awakenings from 1⁄2 to 1⁄3. She can tell you this on Sunday
night, just as I can tell you now, before any of us enter into any such
experimental procedure. So her prediction about what she should answer on
an awakening does not change from Sunday evening to Monday morning.
The key pieces of information she uses to arrive at this revised estimate are:
That the questions will be asked in a reset epistemic state. This requires her to give the same answer on all awakenings.
That the frequency of awakenings is dependent in a specific way on the result of the coin toss. This requires her to update the credence she’ll report on awakenings from 1⁄2 to 1⁄3.
At this point, it is just assertion that it’s not a probability. I have reasons for believing it’s not one, at least, not the probability that people think it is. I’ve explained some of that reasoning.
I think it’s reasonable to look at a large sample ratio of counts (or ratio of expected counts). The best way to do that, in my opinion, is with independent replications of awakenings (that reflect all possibilities at an awakening). I probably haven’t worded this well, but consider the following two approaches. For simplicity, let’s say we wanted to do this (I’m being vague here) 1000 times.
Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings. But… whatever the total number of awakenings is, they are not independent. For example, on the first awakening it could be either heads or tails. On the second awakening, it could only be heads if it was heads on the first awakening. So, Beauty’s options on awakening #2 are (possibly) different from her options on awakening #1. We do not have 2 replicates of the same situation. This approach will give you the correct ratio of counts in the long run (for example, we do expect the # of heads&Monday to equal the # of tails&Monday and the # of tails&Tuesday).
Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails. She knows that it was a fair coin. She knows that if she’s awake it’s definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads. She knows that 50% of coin tosses would end up tails, so we assign 0.5 to tails, which implies 0.25 to tails&Monday and 0.25 to tails&Tuesday. If we generate observations from this 1000 times, we’ll get 1000 awakenings. We’ll end up with heads 50% of the time.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she should have on an awakening. In 1, we are replicating the entire experiment, with the double counting on tails.
In 1, people are using these ratios of expected counts to get the 1⁄3 answer. 1⁄3 is the correct answer to the question about the long-run frequencies of awakenings preceded by heads to awakenings preceded by tails. But I do not think it is the answer to the question about her credence of heads on an awakening.
In 2, the joint probabilities are determined ahead of time based on what we know about the experiment.
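The contrast between the two approaches can be sketched in a few lines (I use a larger replication count than 1000 for stability; the seed is arbitrary):

```python
import random

random.seed(3)
R = 100_000  # replications (1000 in the text; larger here for stability)

# Approach 1: replicate the whole experiment; each tails toss is counted twice.
heads_awakenings = tails_awakenings = 0
for _ in range(R):
    if random.random() < 0.5:
        heads_awakenings += 1   # heads: one awakening
    else:
        tails_awakenings += 2   # tails: two awakenings

print(heads_awakenings / (heads_awakenings + tails_awakenings))  # ~1/3

# Approach 2: sample one awakening-state per trial from the joint
# distribution heads&Monday 0.5, tails&Monday 0.25, tails&Tuesday 0.25.
heads_states = 0
for _ in range(R):
    u = random.random()
    if u < 0.5:
        heads_states += 1  # heads&Monday
    # elif u < 0.75: tails&Monday; else: tails&Tuesday (both non-heads)

print(heads_states / R)  # ~1/2
```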
Let n2 and n3 be the counts, in repeated trials, of tails&Monday and tails&Tuesday, respectively. You will of course see that n2=n3. They are the same random variable. tails&Monday and tails&Tuesday are the same. It’s like what Jack said about types and tokens. It’s like what Vladimir_Nesov said:
Two subsequent states of a given dynamical system make for poor distinct elements of a sample space: when we’ve observed that the first moment of a given dynamical trajectory is not the second, what are we going to do when we encounter the second one? It’s already ruled “impossible”! Thus, Monday and Tuesday under the same circumstances shouldn’t be modeled as two different elements of a sample space.
You said:
I don’t mean to claim that as soon as Beauty awakes, new evidence comes to light that she can add to her store of bits in additive fashion, and thereby update her credence from 1⁄2 to 1⁄3 along the way. If this is the only kind of evidence that your theory of Bayesian updating will acknowledge, then it is too restrictive.
I don’t think it matters if she has the knowledge before the experiment or not. What matters is if she has new information about the likelihood of heads to update on. If she did, we would expect her accuracy to improve. So, for example, if she starts out believing that heads has probability 1⁄2, but learns something about the coin toss, her probability might go up a little if heads and down a little if tails. Suppose, for example, she is informed of a variable X. If P(heads|X)=P(tails|X), then why is she updating at all? Meaning, why is P(heads)=/=P(heads|X)? This would be unusual. It seems to me that the only reason she changes is because she knows she’d be essentially ‘betting’ twice on tails, but that really is distinct from credence for tails.
Yet one more variant. On my view it’s structurally and hence statistically equivalent to Iterated Sleeping Beauty, and I present an argument that it is. This one has the advantage that it does not rely on any science fictional technology. I’m interested to see if anyone can find good reasons why it’s not equivalent.
The Iterated Sleeping Beauty problem (ISB) is the original Standard
Sleeping Beauty (SSB) problem repeated a large number N of times. People always seem to want to do this anyway with all the variations, to use the Law of Large Numbers to gain insight into what they should do in the single-shot case.
The Setup
As before, Sleeping Beauty is fully apprised of all the details ahead of time.
The experiment is run for N consecutive days (N is a large number).
At midnight 24 hours prior to the start of the experiment, a fair coin is tossed.
On every subsequent night, if the coin shows Heads, it is tossed again; if it shows Tails, it is turned over to show Heads.
(This process is illustrated by a discrete-time Markov chain with transition matrix

P = [ 1/2  1/2 ]
    [  1    0  ]

and the state vector is the row

x = [ Heads  Tails ],

with consecutive state transitions computed as x * P^k.)
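A short power-iteration sketch (pure Python; the iteration count is arbitrary) confirms where this chain settles:

```python
# The transition matrix from the setup: from Heads the coin is tossed again
# (1/2 each way); from Tails it is turned over to Heads.
P = [[0.5, 0.5],
     [1.0, 0.0]]
x = [1.0, 0.0]  # start showing Heads; the limit is the same either way

for _ in range(50):  # iterate x * P until it stabilizes
    x = [x[0] * P[0][0] + x[1] * P[1][0],
         x[0] * P[0][1] + x[1] * P[1][1]]

print(x)  # approximately [2/3, 1/3]: the coin shows Heads 2/3 of mornings
```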
Each morning when Sleeping Beauty awakes, she is asked each of the following questions:
“What is your credence that the most recent coin toss landed Heads?”
“What is your credence that the coin was tossed last night?”
“What is your credence that the coin is showing Heads now?”
The first question is the equivalent of the question that is asked in the Standard Sleeping Beauty problem. The second question corresponds to the question “what is your credence that today is Monday?” (which should also be asked and analyzed in any treatment of the Standard Sleeping Beauty problem.)
Note: in this setup, 3) differs from 1) only because of the operation of turning the coin over instead of tossing it. This is just a perhaps too clever mechanism to count down the days (awakenings, actually) to the point when the coin should be tossed again. It may very well make a better example if we never touch the coin except to toss it, and use some other deterministic countdown mechanism to count repeated awakenings per coin toss. That allows easier generalization to the case where the number of days to awaken when Tails is greater than 2. It also makes 3) directly equivalent to the standard SB question, so that 1) and 3) have the same answers. You decide which mechanism is easier to grasp from a didactic point of view, and analyze that one.
After that, Beauty goes on about her daily routine, takes no amnesia drugs, sedulously avoids all matter duplicators and transhuman uploaders, and otherwise lives a normal life, on one condition: she is not allowed to examine the coin or discover its state (or the countdown timer) until the experiment is over.
Analysis
Q1: How should Beauty answer?
Q2: How is this scenario similar in key respects to the SSB/ISB scenario?
Q3: How does this scenario differ in key respects from the SSB/ISB scenario?
Q4: How would those differences if any make a difference to how Beauty should answer?
My answers:
Q1: Her credence that the most recent coin toss landed Heads should be 1⁄3. Her credence that the coin was tossed last night should be 1⁄3. Her credence that the coin shows Heads should be 2⁄3. (Her credence that the coin shows Heads should be 1⁄3 if we never turn it over, only toss, and 1/K if the countdown timer counts K awakenings per Tail toss.)
Q2: Note that Beauty’s epistemic state regarding the state of the coin, or whether it was tossed the previous midnight, is exactly the same on every morning, but without the use of drugs or other alien technology. She awakens and is asked the questions once every time the coin toss lands Heads, and twice every time it lands tails. In Standard Sleeping Beauty, her epistemic state is reset by the amnesia drugs. In this setup, her epistemic state never needs to be reset because it never changes, simply because she never receives any new information that could change it, including the knowledge of when the coin has been tossed to start a new cycle.
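For what it’s worth, the answers to questions 1) and 3) can be checked with a quick simulation of the nightly process (run length and seed are arbitrary):

```python
import random

random.seed(4)
mornings = 300_000
recent_heads = showing_heads = 0

face = 'H' if random.random() < 0.5 else 'T'  # the initial midnight toss
recent = face  # outcome of the most recent toss

for _ in range(mornings):
    # Morning: tally the states Beauty is asked about.
    recent_heads += (recent == 'H')
    showing_heads += (face == 'H')
    # Night: toss if the coin shows Heads, otherwise turn it over to Heads.
    if face == 'H':
        face = 'H' if random.random() < 0.5 else 'T'
        recent = face
    else:
        face = 'H'

print(recent_heads / mornings)   # ~1/3: most recent toss landed Heads
print(showing_heads / mornings)  # ~2/3: coin currently shows Heads
```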
Q3: In ISB, a new experimental cycle is initiated at fixed times—Monday (or Sunday midnight). Here the start of a new “cycle” occurs with random timing. The question arises, does the difference in the speed of time passing make any difference to the moments of awakening when the question is asked? Changing labels from “Monday” and “Tuesday” to “First Day After Coin Toss” and “Second Day After Coin Toss” respectively makes no structural change to the operation of the process. Discrete-time Markov chains have no timing, they have only sequence.
In the standard ISB, there seems to be a natural unit of replication: the coin toss on Sunday night followed by whatever happens through the rest of the week. Here, that unit doesn’t seem so prominent, though it still exists as a renewal point of the chain. In a recurrent Markov chain, the natural unit of replication seems to be the state transition. Picking a renewal point is also an option, but only as a matter of convenience of calculation; it doesn’t change the analysis.
Q4: I don’t see how. The events, and the processes which drive their occurrence, haven’t changed that I can see, just our perspective in looking at them. What am I overlooking?
Iteration
I didn’t tell you yet how N is determined and how the experiment is terminated. Frankly, I don’t think it matters all that much as N gets large, but let’s remove all ambiguity.
Case A: N is a fixed large number. The experiment is terminated on the first night on which the coin shows Heads, after the Nth night.
Case B: N is not fixed in advance, but is guaranteed to be larger than some other large fixed number N’, such that the coin has been tossed at least N’ times. Once N’ tosses have been counted, the experiment is terminated on any following night on which the coin shows Heads, at the whim of the Lab Director.
Q5: If N (or N’) is large enough, does the difference between Case A and B make a difference to Beauty’s credence? (To help sharpen your answer, consider Case C: Beauty dies of natural causes before the experiment terminates.)
Note that in view of the discussion under Q3 above, we are picking some particular state in the transition diagram and thinking about recurrence to and from that state. We could pick any other state too, and the analysis wouldn’t change in any significant way. It seems more informative (to me at any rate) to think of this as an ongoing process that converges to stable behavior at equilibrium.
Extra Credit:
This gets right to the heart of what a probability could mean, what things can count as probabilities, and why we care about Sleeping Beauty’s credence.
Suppose Beauty is sent daily reports showing cumulative counts of the nightly heads/tails observations. The reports are sufficiently old as not to give any information about the current state of the coin or when it was last tossed. (E.g., the data in the report are from at least two coin tosses ago.) Therefore Beauty’s epistemic state about the current state of the coin always remains in its initial/reset state, with the following exception. Discuss how Beauty could use this data to--
corroborate that the coin is in fact fair as she has been told.
update her credences, in case she accrues evidence that shows the coin is not fair.
For me this is the main attraction of this particular model of the Sleeping Beauty setup, so I’m very interested in any possible reasons why it’s not equivalent.
Sorry I was slow to respond… busy with other things.
My answers:
Q1: I agree with you: 1⁄3, 1⁄3, 2⁄3
Q2. ISB is similar to SSB as follows: fair coin; woken up twice if tails, once if heads; epistemic state reset each day
Q3. ISB is different from SSB as follows: more than one coin toss; same number of interviews regardless of result of coin toss
Q4. It makes a big difference. She has different information to condition on. On a given coin flip, the probability of heads is 1⁄2. But, if it is tails we skip a day before flipping again. Once she has been woken up a large number of times, Beauty can easily calculate how likely it is that heads was the most recent result of a coin flip. In SSB, she cannot use the same reasoning. In SSB, Tuesday&heads doesn’t exist, for example.
Consider 3 variations of SSB:
Same as SSB except: if heads, she is interviewed on Monday, and then the coin is turned over to tails and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence has a 50% probability, and she should think of the days within a sequence as being equally likely. She’s asked about the current state of the coin. She should answer P(H)=1/4.
Same as SSB except: if heads, she is interviewed on Monday, and then the coin is flipped again and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday), (heads on Monday, heads on Tuesday) or (tails on Monday, tails on Tuesday). The first 2 sequences have a 25% chance each and the last one has a 50% chance. When asked about the current state of the coin, she should say P(H)=3/8
The 1⁄2 solution to SSB results from similar reasoning. 50% chance for the sequence (Monday and heads). 50% chance for the sequence (Monday and tails, Tuesday and tails). P(H)=1/2
If you apply this kind of reasoning to ISB, where we are thinking of a randomly selected day after a lot of time has passed, you’ll get P(H)=1/3.
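The three lines of reasoning above can be reproduced by direct enumeration. This is just a sketch; the helper `p_heads` and the sequence encodings are illustrative, weighting the days within a sequence equally as described:

```python
def p_heads(sequences):
    """sequences: list of (probability, [coin state per day]) pairs."""
    total = 0.0
    for prob, days in sequences:
        for state in days:
            total += prob / len(days) * (state == 'H')
    return total

variant1 = [(0.5, ['H', 'T']), (0.5, ['T', 'T'])]   # heads turned over Tuesday
variant2 = [(0.25, ['H', 'T']), (0.25, ['H', 'H']),  # re-toss on Tuesday
            (0.5, ['T', 'T'])]
ssb_half = [(0.5, ['H']), (0.5, ['T', 'T'])]         # the 1/2 reasoning for SSB

print(p_heads(variant1))  # 0.25
print(p_heads(variant2))  # 0.375
print(p_heads(ssb_half))  # 0.5
```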
I’m struggling to see how ISB isn’t different from SSB in meaningful ways.
Perhaps this is beating a dead horse, but here goes.
Regarding your two variants:
1 Same as SSB except If heads, she is interviewed on Monday, and then the
coin is turned over to tails and she is interviewed on Tuesday. There is
amnesia and all of that. So, it’s either the sequence (heads on Monday,
tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence
has a 50% probability, and she should think of the days within a sequence
as being equally likely. She’s asked about the current state of the
coin. She should answer P(H)=1/4.
I agree. When iterated indefinitely, the Markov chain transition matrix is:
[0, 1, 0, 0]
[1/2, 0, 1/2, 0]
[0, 0, 0, 1]
[1/2, 0, 1/2, 0]
acting on state vector [ H1 H2 T1 T2 ], where H,T are coin toss outcomes and 1,2 label Monday,Tuesday. This has probability eigenvector [ 1/4 1/4 1/4 1/4 ]; 3 out of 4 states show Tails (as opposed to the coin having been tossed Tails). By the way, we have unbiased sampling of the coin toss outcomes here.
If the Markov chain model isn’t persuasive, the alternative calculation is to look at the branching probability diagram
and compute the expected frequencies of letters in the result strings at each leaf on Wednesdays. This is
0.5 * ( H + T ) + 0.5 * ( T + T ) = 0.5 * H + 1.5 * T.
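For what it's worth, the stationary behavior of the iterated turn-over variant can be checked numerically. The sketch below assumes the transition structure described in the setup: day 1 advances deterministically to day 2, and after day 2 a fresh fair toss starts the next week. Because the chain alternates between day-1 and day-2 states (period 2), it takes a time average of the iterates rather than a bare limit:

```python
# Transition matrix, state order [H1, H2, T1, T2]; structure assumed
# from the variant's description.
P = [
    [0.0, 1.0, 0.0, 0.0],   # H1 -> H2
    [0.5, 0.0, 0.5, 0.0],   # H2 -> new toss: H1 or T1
    [0.0, 0.0, 0.0, 1.0],   # T1 -> T2
    [0.5, 0.0, 0.5, 0.0],   # T2 -> new toss: H1 or T1
]

pi = [1.0, 0.0, 0.0, 0.0]   # arbitrary starting distribution
avg = [0.0] * 4             # time average (the chain has period 2)
steps = 10_000
for _ in range(steps):
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]
    avg = [avg[j] + pi[j] / steps for j in range(4)]

print([round(a, 2) for a in avg])   # -> [0.25, 0.25, 0.25, 0.25]
```

The long-run occupancy is uniform over the four states, matching the [ 1/4 1/4 1/4 1/4 ] eigenvector.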
2 Same as SSB except If heads, she is interviewed on Monday, and then the
coin is flipped again and she is interviewed on Tuesday. There is amnesia
and all of that. So, it’s either the sequence (heads on Monday, tails on
Tuesday), (heads on Monday, heads on Tuesday) or (tails on Monday, tails
on Tuesday). The first 2 sequences have a 25% chance each and the last one
has a 50% chance. When asked about the current state of the coin, she
should say P(H)=3/8
I agree. Monday-Tuesday sequences occur with the following probabilities:
HH: 1/4
HT: 1/4
TT: 1/2
Also, the Markov chain model for the iterated process agrees:
to compute expected frequencies of letters in the result strings,
0.25 * ( H + H ) + 0.25 * ( H + T ) + 0.5 * ( T + T ) = 0.75 * H + 1.25 * T
Because of the extra coin toss on Tuesday after Monday Heads, these are biased observations of coin tosses. (Are these credences?) But neither of these two variants is equivalent to Standard Sleeping Beauty or its iterated variants ISB and ICSB.
The 1⁄2 solution to SSB results from similar reasoning. 50% chance for the sequence (Monday and heads). 50% chance for the sequence (Monday and tails, Tuesday and tails). P(H)=1/2
(Sigh). I don’t think your branching probability diagram is correct. I don’t know what other reasoning you are using. This is the diagram I have for Standard Sleeping Beauty
And this is how I use it, using exactly the same method as in the two examples above. With probability 1⁄2 the process accumulates 2 Tails observations per week, and with probability 1⁄2 accumulates 1 Heads observation. The expected number of observations per week is 1.5, the expected number of Heads observations per week is 0.5, the expected number of Tails observations is 1 per week.
0.5 * ( H ) + 0.5 * ( T + T ) = 0.5 * H + 1.0 * T
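That per-week bookkeeping is easy to verify numerically; a minimal sketch of the standard setup:

```python
import random

random.seed(1)

heads_obs = tails_obs = weeks = 0
for _ in range(100_000):
    weeks += 1
    if random.random() < 0.5:   # heads: one Monday awakening
        heads_obs += 1
    else:                       # tails: Monday and Tuesday awakenings
        tails_obs += 2

print(heads_obs / weeks)                     # ~0.5 Heads observations per week
print(tails_obs / weeks)                     # ~1.0 Tails observations per week
print(heads_obs / (heads_obs + tails_obs))   # ~1/3 of awakenings show Heads
```

The expected 1.5 observations per week split 0.5 Heads / 1.0 Tails, i.e. one third of awakenings show Heads.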
Likewise when we record Monday/Tuesday observations per week instead of Heads/Tails, the expected number of Monday observations is 1, expected Tuesday observations 0.5, for a total of 1.5. But in both of your variants above, the expected number of Monday observations = expected number of Tuesday observations = 1.
Thanks for your response. I should have been clearer in my terminology. By “Iterated Sleeping Beauty” (ISB) I meant to name the variant that we here have been discussing for some time, which repeats the Standard Sleeping Beauty problem some number of times, say 1000. In 1000 coin tosses over 1000 weeks, the expected number of Heads awakenings is 500 and the expected number of Tails awakenings is 1000. I have no catchy name for the variant I proposed, but I can make up an ugly one if nothing better comes to mind; it could be called Iterated Condensed Sleeping Beauty (ICSB). But I’ll assume you meant this particular variant of mine when you mention ISB.
You say
Q3. ISB is different from SSB as follows: more than one coin toss; same number of interviews regardless of result of coin toss
“More than one coin toss” is the iterated part. As far as I can see,
and I’ve argued it a couple times now, there’s no essential difference between SSB and ISB, so I meant to draw a comparison between my variant and ISB.
“Same number of interviews regardless of result of coin toss” isn’t correct. Sorry if I was unclear in my description. Beauty is interviewed once per toss when Heads, twice when Tails. This is the same in ICSB as in Standard and Iterated Sleeping Beauty. Is there an important difference between Standard Sleeping Beauty and Iterated Sleeping Beauty, or is there an important difference between Iterated Sleeping Beauty and Iterated Condensed Sleeping Beauty?
Q4. It makes a big difference. She has different information to condition
on. On a given coin flip, the probability of heads is 1⁄2. But, if it is
tails we skip a day before flipping again. Once she has been woken up a
large number of times, Beauty can easily calculate how likely it is that
heads was the most recent result of a coin flip.
We not only skip a day before tossing again, we interview on that day too!
I see how over time Beauty gains evidence corroborating the fairness of the
coin (that’s exactly my later rhetorical question), but assuming it’s a fair coin, and barring Type I errors, she’ll never see evidence to change her initial credence in that proposition. In view of this, can you explain how she can use this information to predict with better than initial accuracy the likelihood that Heads was the most recent outcome of the toss? I don’t see how.
In SSB, Tuesday&heads doesn’t exist, for example.
After relabeling Monday and Tuesday to Day 1 and Day 2 following the coin toss, Tuesday&Heads (H2) exists in none of these variants. So what difference is there?
Q1: I agree with you: 1⁄3, 1⁄3, 2⁄3
Good and well, but—are these legitimate credences? If not, why not? And
if so, why aren’t they also in the following:
Standard Iterated Sleeping Beauty is isomorphic to the following Markov
chain, which just subdivides the Tails state in my condensed variant into
Day 1 and Day 2:
[1/2, 1/2, 0]
[0, 0, 1]
[1/2, 1/2, 0]
operating on row vector of states [ Heads&Day1 Tails&Day1 Tails&Day2 ],
abbreviated to [ H1 T1 T2 ]
When I say isomorphic, I mean the distinct observable states of affairs are
the same, and the possible histories of transitions from awakening to next awakening are governed by the same transition probabilities.
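A quick power iteration on this three-state chain confirms the uniform stationary distribution (a sketch; state order [ H1 T1 T2 ] as above):

```python
# Three-state ISB chain, matrix rows taken from the text.
P = [
    [0.5, 0.5, 0.0],   # H1 -> fresh toss: H1 or T1
    [0.0, 0.0, 1.0],   # T1 -> T2
    [0.5, 0.5, 0.0],   # T2 -> fresh toss: H1 or T1
]
pi = [1.0, 0.0, 0.0]
for _ in range(200):   # chain is aperiodic: plain iteration converges
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

print([round(p, 4) for p in pi])   # -> [0.3333, 0.3333, 0.3333]
```

Each of the three awakening states gets weight 1/3 in the long run, which is where the chain's 1/3 answer comes from.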
So either there’s a reason why my 2-state Markov chain correctly models my
condensed variant that allows you to accept the 1⁄3 answers it computes,
that doesn’t apply to the three-state Markov chain and its 1⁄3 answers
(perhaps you came to those answers independently of my model), or else
there’s some reason why the three-state Markov chain doesn’t correctly model
the Iterated Sleeping Beauty process. Can you help me see where the difficulty may lie?
I’m struggling to see how ISB isn’t different from SSB in meaningful ways.
I assume you are referring to my variant, not what I’m calling Iterated Sleeping Beauty. If so, I’m kind of baffled by this statement, because under similarities, you just listed
fair coin
woken twice if Tails, once if Heads
epistemic state reset each day
With the emendation that 2) is per coin toss, and in 3) “each day” = “each awakening”, you have just listed three essential features that SSB, ISB and ICSB all have in common. It’s exactly those three things that define the SSB problem. I’m claiming that there aren’t any others. If you disagree, then please tell me what they are. Or if parts of my argument remain unclear, I can try to go into more detail.
Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings.
and
Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she should have on an awakening. In 1, we are replicating the entire experiment, with the double counting on tails.
This seems a distinction without a difference. The longer the iterated SB process continues, the less important is the distinction between counting tosses versus counting awakenings. This distinction is only about a stopping criterion, not about the convergent behavior of observations or coin tosses to expected values as it’s ongoing. Considered as an ongoing process of indefinite duration, the expected number of tosses and of observations of each type are well-defined, easily computed, and well-behaved with respect to each other. Over the long run, #awakenings accumulates 1.5 times more frequently than #tosses. Beauty is never more than two awakenings away from starting a new coin toss, so whether you choose to stop as soon as an awakening has completed or until you finish a coin-toss cycle, the relative perturbation in the statistics collected so far goes to zero. Briefly, there is no “natural” unit of replication independent of observer interest.
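A small simulation illustrates the point about stopping criteria: whether we stop after a fixed number of complete tosses or after a fixed number of awakenings (possibly mid-cycle), the fraction of awakenings showing Heads converges to the same value. A sketch under the ISB setup:

```python
import random

random.seed(4)

def frac_heads(stop_on_awakening):
    heads = total = tosses = 0
    target_awakenings, target_tosses = 150_000, 100_000
    while True:
        tosses += 1
        if random.random() < 0.5:
            awakenings = [1]       # one Heads awakening this week
        else:
            awakenings = [0, 0]    # two Tails awakenings this week
        for a in awakenings:
            heads += a
            total += 1
            if stop_on_awakening and total >= target_awakenings:
                return heads / total   # may stop mid-cycle
        if not stop_on_awakening and tosses >= target_tosses:
            return heads / total       # stops only at toss boundaries

print(round(frac_heads(True), 3))    # ~0.333
print(round(frac_heads(False), 3))   # ~0.333
```

The choice of stopping rule perturbs the tallies by at most one cycle, which vanishes in the limit.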
She knows that it was a fair coin. She knows that if she’s awake it’s definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads.
This would be an error. You are assigning a 50% probability to an observation (that it is Heads&Monday) without taking into account the bias that’s built in to the process for Beauty to make observations. Alternatively, if you are uncertain whether Monday is true or not—you know it might be Tuesday—then you should be uncertain that P(Heads)=P(Heads&Monday).
You the outside observer know the chance of observing that the coin lands Heads is 50%. You presumably know this because you have corroborated it through an unbiased observation process: look at the coin exactly once per toss. Once Beauty is put to sleep and awoken, she is no longer an outside observer; she is a participant in a biased observation process, so she should update her expectation about what her observation process will show.
Different observation process, different observations, different likelihoods of what she can expect to see.
Of course, as a card-carrying thirder, I’m assuming that the question about credence is about what Beauty is likely to see upon awakening. That’s what the carefully constructed wording of the question suggests to me.
She knows that 50% of coin tosses would end up tails,
except that as we agreed, she’s not observing coin tosses, she’s observing biased samples of coin tosses. The connection between what she observes and the objective behavior of the coin is just what’s at issue here, so you can’t beg the question.
In 1, people are using these ratios of expected counts to get the 1⁄3 answer. 1⁄3 is the correct answer to the question about the long-run frequencies of awakenings preceded by heads to awakenings preceded by tails. But I do not think it is the answer to the question about her credence of heads on an awakening.
Agreed, but for this: it all depends on what you want credence to mean, and what it’s good for; see discussion below.
In 2, the joint probabilities are determined ahead of time based on what we know about the experiment.
Let n2 and n3 be the counts, in repeated trials, of tails&Monday and tails&Tuesday, respectively. You will of course see that n2=n3. They are the same random variable. tails&Monday and tails&Tuesday are the same.
Let me uphold a distinction that’s continually skated over, but which is a crucial point of disagreement here. I think you’re confusing your evidence for the thing evidenced. And you are selectively filtering your evidence, which amounts to throwing away information. Tails&Monday and Tails&Tuesday are not the same; they are distinct observations of the same state of the coin, and thus are perfectly correlated in that regard. Aside from the coin, they observe distinct days of the week, and thus different states of affairs. By a state of affairs I mean the conjunction of all the observable properties of interest at the moment of observation.
It’s like what Jack said about types and tokens. It’s like Vladimir_Nesov said:
The distinction between types and tokens is only relevant when you want to interpret your tokens as being about something else, their types, rather than about themselves. But types are carved out of observers’ interests in their significance, which are non-objective, observer-dependent if anything is. Their variety and fineness of distinction is potentially infinite. As I mentioned above, a state of affairs is a conjunction of observable properties of interest. This Boolean lattice has exactly one top: Everything, and unknown atoms if any at bottom. Where you choose to carve out a distinction between type and token is a matter of observer interest.
Two subsequent states of a given dynamical system make for poor distinct elements of a sample space: when we’ve observed that the first moment of a given dynamical trajectory is not the second, what are we going to do when we encounter the second one? It’s already ruled “impossible”! Thus, Monday and Tuesday under the same circumstances shouldn’t be modeled as two different elements of a sample space.
I’ll certainly agree it isn’t desirable, but oughtn’t isn’t the same as isn’t, and in the Sleeping Beauty problem we have no choice. Monday and Tuesday just are different elements in a sample space, by construction.
if she starts out believing that heads has probability 1⁄2, but learns something about the coin toss, her probability might go up a little if heads and down a little if tails.
What you seem to be talking about is using evidence that observations provide to corroborate or update Beauty’s belief that the coin is in fact fair. Is that a reasonable take? But due to the epistemic reset between awakenings, there is never any usable input to this updating procedure. I’ve already stipulated this is impossible. This is precisely what the epistemic reset assumption is for. I thought we were getting off this merry-go-round.
Suppose, for example, she is informed of a variable X. If P(heads|X)=P(tails|X), then why is she updating at all? Meaning, why is P(heads)=/=P(heads|X)? This would be unusual. It seems to me that the only reason she changes is because she knows she’d essentially be ‘betting’ twice if tails, but that really is distinct from credence for tails.
Ok, I guess it depends on what you want the word “credence” to mean, and what you’re going to use it for. If you’re only interested in some updating process that digests incoming information-theoretic quanta, like you would get if you were trying to corroborate that the coin was indeed a fair one to within a certain standard error, you don’t have it here. That’s not Sleeping Beauty, that’s her faithful but silent, non-memory-impaired lab partner with the log book. If Beauty herself is to have any meaningful notion of credence in Heads, it’s pointless for it to be about whether the coin is indeed fair. That’s a separate question, which in this context is a boring thing to ask her about, because it’s trivially obvious: she’s already accepted the information going in that it is fair and she will never get new information from anywhere regarding that belief. And, while she’s undergoing the process of being awoken inside the experimental setup, a value of credence that’s not connected to her observations is not useful for any purpose that I can see, other than perhaps to maintain her membership in good standing in the Guild of Rational Bayesian Epistemologists. It doesn’t connect to her experience, it doesn’t predict frequencies of anything she has any access to, it’s gone completely metaphysical. Ok, what else is there to talk about? On my view, the only thing left is Sleeping Beauty’s phenomenology when awakened. On Bishop Berkeley’s view, that’s all you ever have.
Beauty gets usable, useful information (I guess it depends on what you want “information” to mean, too) once, on Sunday evening, and she never forgets it thereafter. This information is separate from, in addition to the information that the coin itself is fair. This other information allows her to make a more accurate prediction about the likelihood that, each time she is awoken, the coin is showing heads. Or whether it’s Monday or Tuesday. The information she receives is the details of the sampling process, which has been specifically constructed to give results that are biased with respect to the coin toss itself, and the day of the week. Directly after being informed of the structure of the sampling process, she knows it is biased and therefore ought to update her prediction about what relative frequencies per observation will be of each observable aspect of the possible state of affairs she’s awoken into—Heads vs. Tails, Monday vs. Tuesday.
I think I might understand the interpretation that a halfer puts on the question. I’m just doubtful of its interest or relevance. Do you see any validity (I mean logical coherence, as opposed to wrong-headedness) to this interpretation? Is this just a turf war over who gets to define a coveted word for their purposes?
Consider the case of Sleeping Beauty with an absent-minded experimenter.
If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.
If the coin comes up Tails, there is a tiny but non-zero chance that the experimenter mixes up Tails and Heads.
The resulting scenario is represented in a new sheet, Fuzzy two-day, of my spreadsheet document.
Under these assumptions, Beauty may no longer rule out Tuesday & Heads. She has no justification to assign all of the Heads probability mass to Monday & Heads. She is therefore constrained to conditioning on being woken in the way that the usual two-day variant suggests she should, and ends up with a credence arbitrarily close to 1⁄3 if we make the “absent-minded” probability tiny enough.
Why should we get a discontinuous jump to 1⁄2 as this becomes zero?
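One simple way to formalize this continuity argument (my own sketch, not necessarily the spreadsheet's exact model): put a uniform 1/4 prior on day x coin, let Beauty be woken with probability 1 in every cell except Tuesday & Heads, where the absent-minded experimenter wakes her with small probability e, and condition on being awake. Then P(H | awake) = (1+e)/(3+e), which tends continuously to 1/3 as e goes to 0:

```python
def p_heads_given_awake(e):
    # Uniform 1/4 prior over {Mon, Tue} x {H, T} (assumption).
    # Awakening probability is 1 in every cell except Tue&H, where it
    # is e (assumption); e = 0 removes the Tue&H awakening entirely.
    num = 0.25 * 1 + 0.25 * e          # Mon&H + Tue&H
    den = num + 0.25 * 1 + 0.25 * 1    # + Mon&T + Tue&T
    return num / den

for e in (0.5, 0.1, 0.01, 0.001, 0.0):
    print(e, round(p_heads_given_awake(e), 4))
# P(H | awake) = (1+e)/(3+e): 0.4286, 0.3548, 0.3355, 0.3336, 0.3333
```

Under this conditioning there is no discontinuous jump at e = 0; the answer stays at 1/3.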
This sounds like the continuity argument, but I’m not quite clear on how the embedding is supposed to work, can you clarify? Instead of telling me what the experimenter rightly or wrongly believes to be the case, spell out for me how he behaves.
If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.
What does this mean operationally? Is there a nonzero chance, let’s call it epsilon or e, that the experimenter will incorrectly behave as if it’s Tuesday when it’s Monday? I.e., with probability e, Beauty is not awoken on Monday, the experiment ends, or is awoken and sent home, and we go on to next Sunday evening without any awakenings that week? Then Heads&Tuesday still with certainty does not occur. So maybe you meant that on Monday, he doesn’t awaken Beauty at all, but awakens her on Tuesday instead? Is this confusion persistent across days, or is it a random confusion that happens each time he needs to examine the state of the coin to know what he should do?
And on Tuesday
If the coin comes up Tails, there is a tiny but non-zero chance that the experimenter mixes up Tails and Heads.
So when the coin comes up Tails, there is a nonzero probability, let’s call it delta or d, that the experimenter will incorrectly behave as if it’s Heads? I.e., on Tuesday morning, he will not awaken Beauty or will wake her and send her home until next Sunday? Then Tails&Tuesday is a possible nonoccurrence.
On reflection, my verbal description doesn’t match the reply I wanted to give, which was: the experimenter behaves such that the probability mass is allocated as in the spreadsheet.
Make it “on any day when Beauty is scheduled to remain asleep, the experimenter has some probability of mistakenly waking her, and vice-versa”.
This is interesting. We shouldn’t get a discontinuous jump.
Consider 2 related situations:
if Heads she is woken up on Monday, and the experiment ends on Tuesday. If tails, she is woken up on Monday and Tuesday, and the experiment ends on Wed. In this case, there is no ‘not awake’ option.
If heads she is woken up on Monday and Tuesday. On Monday she is asked her credence for heads. On Tuesday she is told “it’s Tuesday and heads” (but she is not asked about her credence; that is, she is not interviewed). If tails, it’s the usual woken up both days and asked about her credence. The experiment ends on Wed.
In both of these scenarios, 50% of coin flips will end up heads. In both cases, if she’s interviewed she knows it’s either Monday&heads, Monday&tails or Tuesday&tails. She has no way of telling these three options apart, due to the amnesia.
I don’t think we should be getting different answers in these 2 situations. Yet, I think if we use your probability distributions we do.
I think there are two basic problems. One is that Monday&tails is really not different from Tuesday&tails. They are the same variable. It’s the same experience. If she could time travel and repeat the Monday waking, it would feel the same to her as the Tuesday waking. The other issue is this: in my scenario 2 above, when she is woken but before she knows whether she will be interviewed, it would look like there is a 25% chance it’s heads&Monday and a 25% chance it’s heads&Tuesday. And that’s probably a reasonable way to look at it. But that doesn’t imply that, once she finds out it’s an interview day, the probability of heads&Monday shifts to 1⁄3. That’s because on 50% of coin flips she will experience heads&Monday. That’s what makes this different from a usual joint probability table representing independent events.
My reasoning has been to consider scenario 1 from the perspective of an outside observer, who is uncertain about each variable: a) whether it is Monday or Tuesday, b) how the coin came up, c) what happened to Beauty on that day.
To that observer, “Tuesday and heads” is definitely a possibility, and it doesn’t really matter how we label the third variable: “woken”, “interviewed”, whatever. If the experiment has ended, then that’s a day where she hasn’t been interviewed.
If the outside observer learns that Beauty hasn’t been interviewed today, then they may conclude that it’s Tuesday and that the coin came up heads, thus a) they have something to update on and b) that observer must assign probability mass to “Tuesday & Heads & not interviewed”.
If the outside observer learns that Beauty has been interviewed, it seems to me that they would infer that it’s more likely, given their prior state of knowledge, that the coin came up heads.
To the outside observer, scenario 2 isn’t really distinct from scenario 1. The difference only makes a difference to Beauty herself.
However, I see no reason to treat Beauty herself differently than an outside observer, including the possibility of updating on being interviewed or on not being interviewed.
So, if my probability tables are correct for an outside observer, I’m pretty sure they’re correct for Beauty.
(My confidence in the tables themselves, however, has been eroded a little by my not being able to calculate Beauty—or an observer—updating on a new piece of information in the “fuzzy” variant, e.g. using P(heads|woken) as a prior probability and updating on learning that it is in fact Tuesday. It seems to me that for the math to check out requires that this operation should recover the “absent-minded experimenter” probability for “tuesday & heads & woken”. But I’m having a busy week so far and haven’t had much time to think about it.)
The 1⁄3 solution makes the assumption that the probability of heads given an awakening is:
lim N-> infinity n1/(n1+n2+n3)
But, we have a problem here. N does not equal n1+n2+n3, it is equal to n1+n2.
Why is that a problem? Why would N have to be equal to n1+n2+n3? Only because it does in your other example?
(ETA) I’m not sure where your formula “lim N-> infinity n1/(n1+n2+n3)” comes from—as the third example shows, it just doesn’t work in all cases. That doesn’t mean that your alternative formula is better in the sleeping beauty case.
Because this, lim N-> infinity n1/(n1+n2+n3), is p1 if the counts are from independent draws of a multinomial distribution.
We have outcome-dependent sampling here. Is lim N-> infinity n1/(n1+n2+n3) equal to p1 in that case? I’d like to see the statistical theory to back up the claim. It’s pretty clear to me that people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.
(ETA) The formula “lim N-> infinity n1/(n1+n2+n3)” is what people who claim the answer is 1⁄3 are using to justify it. The 1⁄2 solution just uses probability laws. That is, P(H)=1/2. P(W)=1, where W is the event that Beauty has been awakened. Therefore, P(H|W)=1/2.
It’s pretty clear to me that people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.
I’ll have to disagree with that—there is a pretty clear interpretation in which 1⁄3 is a “correct” answer: if Sleeping Beauty is asked to bet X dollars that heads came up, and wins $60 if she’s right, up to which value of X should she accept the bet? (If the coin comes up tails, she gets the possibility to bet twice.)
In that scenario, X=$20 is the right answer, which corresponds to a probability of 1⁄3. Do you agree with that? (I haven’t read all the threads; you probably addressed this somewhere.)
See, here I’m not using any “lim N-> infinity n1/(n1+n2+n3)” , so I feel you’re being unfair to 1/3rders.
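The break-even arithmetic can be checked by simulation. This sketch reads “wins $60” as a net gain of 60 minus the stake on a correct call, and a loss of the stake on a wrong one (an assumed payout convention, which is the one that makes $20 the break-even point):

```python
import random

random.seed(2)

def expected_profit(stake, n_tosses=200_000):
    # Beauty stakes `stake` on Heads at every awakening.  A correct call
    # nets (60 - stake); a wrong call loses the stake.  Tails means two
    # awakenings, hence two losing bets per Tails toss.
    profit = 0.0
    for _ in range(n_tosses):
        if random.random() < 0.5:
            profit += 60 - stake    # one winning bet per Heads toss
        else:
            profit -= 2 * stake     # two losing bets per Tails toss
    return profit / n_tosses

for stake in (10, 20, 30):
    print(stake, round(expected_profit(stake), 2))
# stakes below $20 are profitable, above $20 losing; $20 is break-even
```

Per toss the expectation is 0.5*(60 - X) - 0.5*(2X) = 30 - 1.5X, which is zero exactly at X = 20.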
I don’t agree, because the question is about her subjective probability at an awakening. The betting question you described is a different one.
For example, suppose I flip a coin and tell you you will win $60 if heads came up, but I require that you make the bet twice if tails came up? You’d be willing to bet up to $30, but that doesn’t mean you think heads has probability 1⁄3. If Beauty really thinks heads has probability 1⁄3, she’d be willing to accept the bet up to $30 even if we told her that we’d only accept one bet (of course, we wouldn’t tell her that she’s already made a bet on Tuesday. Payout would be on Wed).
The wikipedia page for the sleeping beauty problem says:
Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1⁄3.
That’s why I think people are picturing counts in a contingency table when they come up with the 1⁄3 answer.
I don’t agree, because the question is about her subjective probability at an awakening. The betting question you described is a different one.
She would also bet at an awakening. If you ask her to bet when she has just woken up, it would seem weird that she would say “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to $20 (1⁄3 of the winnings) if it’s heads.”
It seems even weirder in the Xtreme Sleeping Beauty, where she’s awakened a thousand times : “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to 6 cents”.
Yes, you get a different result if you change the betting rules where only one bet per “branch” counts, but I don’t see why that’s closer to the problem as originally stated.
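For the record, the break-even stake under the net-payout reading (win $60 minus the stake if right, lose the stake if wrong, one bet per awakening) is easy to compute for any number of Tails awakenings; the roughly-6-cent figure for the thousand-awakening variant drops out directly:

```python
def break_even_stake(tails_awakenings, payout=60.0):
    # Stake X is fair when 0.5 * (payout - X) = 0.5 * tails_awakenings * X,
    # i.e. one winning bet on Heads balances `tails_awakenings` losing
    # bets on Tails.  Solving gives X = payout / (1 + tails_awakenings).
    return payout / (1 + tails_awakenings)

print(round(break_even_stake(2), 2))      # standard SSB: 20.0
print(round(break_even_stake(1000), 2))   # Xtreme variant: 0.06
```

So the same formula that gives $20 in the standard problem gives about 6 cents when Tails brings a thousand awakenings.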
It seems even weirder in the Xtreme Sleeping Beauty, where she’s awakened a thousand times : “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to 6 cents”.
I guess I don’t see why it’s weird. The number of times she will bet is dependent on the outcome. So, even though at each awakening she thinks probability of heads is 1⁄2, she knows if it’s tails she’ll have to bet many more times than if heads. We’re essentially just making her bet more money on a loss than on a win.
In that case, what does it even mean to say “my subjective probability for heads is 1/2”? Subjective probability is often described in terms of betting—see here.
Seems to me this is mostly a quarrel of definitions, and that when you say “people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.”, you’re being unfair. They’re just using a different definition of “subjective probability”
“Subjective probability” is a basic term in decision theory and economics, though. If you want to roll your own metric, surely you should call it something else—to avoid much confusion.
Maybe this will help:
Where does the 1⁄3 solution come from?
It comes from taking a ratio of expected counts. First, a motivating example.
Suppose people can fall into one of three categories. For example, we might create a categorical age variable, catage, where catage is 1, 2, or 3 according to which of three age ranges the person falls in (say, under 30, 30 to 50, over 50).
Suppose we randomly select N people from the population. Let n1 be the number of people with catage=1, with n2 and n3 defined similarly. Given the sample size N, the random variables n1, n2 and n3 follow a multinomial distribution, with parameters (probabilities) p1, p2 and p3, respectively, where p1+p2+p3=1 and n1+n2+n3=N (i.e., 2 degrees of freedom).
The probability that catage=1, p1, is lim N-> infinity n1/(n1+n2+n3).
That concept applied to Sleeping Beauty
With the sleeping beauty problem, what we see is something similar. Imagine we ran the experiment N times. Let n1 be the number of times it was Monday&Heads, n2 the number of times it was Monday&tails, n3 the number of times it was Tuesday&tails.
The 1⁄3 solution makes the assumption that the probability of heads given an awakening is:
lim N-> infinity n1/(n1+n2+n3)
But, we have a problem here. N does not equal n1+n2+n3, it is equal to n1+n2. Also, the random variables n2 and n3 are identical. Thus, we could substitute:
lim N-> infinity n1/(n1+2*n2)
There are really just two random variables (n1 and n2) and 1 degree of freedom. In that case, we can think of n1 as coming from a Binomial distribution with sample size N=n1+n2 and probability p1. The probability of heads&Monday is then
lim N-> infinity n1/(n1+n2)=1/2.
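Both limits can be exhibited from the very same simulated counts; the disagreement is over which ratio deserves the name "credence". A sketch:

```python
import random

random.seed(3)

n1 = n2 = n3 = 0   # Monday&Heads, Monday&Tails, Tuesday&Tails
N = 100_000        # number of runs of the experiment
for _ in range(N):
    if random.random() < 0.5:
        n1 += 1    # heads: one awakening, Monday&Heads
    else:
        n2 += 1    # tails: Monday&Tails ...
        n3 += 1    # ... and Tuesday&Tails, always together

print(n1 / (n1 + n2 + n3))   # ~1/3: fraction of awakenings preceded by heads
print(n1 / (n1 + n2))        # ~1/2: fraction of tosses that landed heads
print(n2 == n3)              # True: n2 and n3 are perfectly correlated
```

The counts also make the degrees-of-freedom point concrete: n3 carries no information beyond n2.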
Another example
If you don’t believe the above reasoning, consider another example.
Suppose half of the population are male and the other half are female. Also, suppose that only females have ovaries.
Suppose I record 3 variables: indicator that the person is male, indicator that the person is female, and indicator that the person has ovaries.
I sample N people, and get counts for those 3 variables of n1, n2 and n3.
Given that we recorded a variable for a randomly selected person, is the probability that they are male equal to
lim N->infinity n1/(n1+n2+n3) ?
Of course not. It’s lim N->infinity n1/(n1+n2).
Even though n2 and n3 are counts of something different, in a sense, they are really the same variable. Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.
When you do treat them as separate variables, you end up with nonsense (such as probabilities greater than 1 link ).
I’d quibble about calling it an assumption. The 1⁄3 solution notes that this is the ratio of observations upon awakening of heads to the total number of observations, which is one of the problematic facts about the experimental setup. The 1⁄3 solution assumes that this is relevant to what we should mean by “credence”, and makes an argument that this is a justification for the claim that Sleeping Beauty’s credence should be 1⁄3.
Your argument is, I take it, that these counts of observations are irrelevant, or at best biased. Something else should be counted, or should be counted differently. The disagreement seems to center on the denominator; it should count not awakenings, but coin-tosses. Then there is a difference in the definition of the relevant events and the probabilities that get calculated from them.
Thirders: An event is an awakening.
The question asks about # awakenings with heads / total awakenings.
This ratio is an estimate of a fraction that can be used to predict frequencies of something of interest.
Halfers: An event is a coin-toss.
The question asks about # tosses with heads / total tosses.
This ratio is an estimate of a fraction which is universally agreed to be a probability, and can be used to predict frequencies of something of interest.
Did I get that right? Is this a fair description?
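For what it's worth, both ratios are easy to estimate by simulation; a sketch of the two counting schemes (my own construction, not from either camp's canonical write-up):

```python
import random

random.seed(1)
TOSSES = 100_000
heads_awakenings = 0
total_awakenings = 0
for _ in range(TOSSES):
    if random.random() < 0.5:    # heads: one awakening (Monday)
        heads_awakenings += 1
        total_awakenings += 1
    else:                        # tails: two awakenings (Monday and Tuesday)
        total_awakenings += 2

thirder_ratio = heads_awakenings / total_awakenings  # awakenings with heads / all awakenings: ~1/3
halfer_ratio = heads_awakenings / TOSSES             # tosses with heads / all tosses: ~1/2
```

(Here the count of heads awakenings equals the count of heads tosses, since heads produces exactly one awakening; only the denominators differ.)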
I think a key difference between halfers and thirders is that for thirders, the occurrence of an awakening constitutes evidence about the current state of the system that’s being asked about—whether the coin shows heads or tails—because the frequency with which the state of the system is asked about (or, equivalently, an observation is made) is influenced by the current state of the system. To ward off certain objections: it is of no consequence whether this influence is deterministic, probabilistic, or mixed in nature; the mere fact that it exists can and should be exploited. I don’t think there’s disagreement that it exists, but there is over how it’s relevant.
Halfers deny that any new evidence becomes available on awakening, because the operation of the process is completely known ahead of time. (Alternatively, if any new evidence could be said to become available, it cannot be exploited.) From what I can tell, and my understanding is surely imperfect, there is some kind of cognitive dissonance about what kinds of things can constitute evidence in some epistemological theory, such that drawing a distinction between the actual occurrence of an event and the knowledge that at least one such event will surely occur is illegitimate for halfers. Is this a fair description?
That’s as may be, but it doesn’t help Sleeping Beauty in her quandary. If you think this example helps to prove your point, I think it helps to prove the opposite. Although she knows, in this variation, that a randomly selected person will be tested, the random person selection process is not accessible to her, only the opportunity to know that one of three possible test results has been collected. She knows very well, given a randomly selected person (resp. a coin toss), what the probability they are male is (resp. the given coin toss came Heads). She isn’t being asked about that conditional probability. (Or maybe you think she is? Please clarify.) To follow your analogy, upon being awakened, she’s informed that a test result has been collected from an unknown person, and now, given that a test result has been collected, what are the chances it came from a male?
Clearly the selection process for asking Sleeping Beauty questions is biased. If bias had not been introduced by an extra awakening on Tuesday, the problem would collapse into triviality. The puzzle asks how this sampling bias should affect Sleeping Beauty’s calculations of what to answer on awakening, if at all. One of the reasons for doing statistical analysis of sampling schemes is to quantify how the mechanism that’s introducing bias changes the expected values of observations. In the SB case, the biased selection process is a mixture of random and deterministic mechanisms. Untangling the random from the deterministic parts is difficult enough for the participants in this discussion—they can’t even agree on a forking path diagram! Untangling it for Sleeping Beauty while she’s in the experiment is epistemically impossible. She has no basis whatsoever inside the game for saying, “this one is randomly different from the last one” versus “this one is deterministically identical to the last one, therefore this one doesn’t count.”
The same considerations apply to the case of the cancer test. Let me elaborate on your scenario to see if I understand it, and let me know if I’m mischaracterizing the test protocol in any material way. There is a test for a disease condition. Every person knows they have a 50% chance going in of testing positive for the disease. We’ll stipulate that the repeatability of the test is perfect, though in real life this is achieved only within epsilon of certainty. (Btw, here’s where the continuity argument enters in: how crucial is the assumption of absolute certainty versus near certainty? What hinges on that?) In this protocol, if the initial test result is positive, then the test is repeated k times (k=2 or 10, or whatever you deem necessary), either with a new sample or from an aliquot of the original sample, I don’t think it matters which. Here the repetition is because of the obstinacy of the head of the test lab and their predilection for amnesia drugs; in real life the reasons would be something like the very high cost in anguish and/or money of a false positive, however unlikely. You, as a recorder of test results, see a certain number of test samples come through the lab. The identities of the samples are encrypted, so your epistemic state with regard to any particular test result is identical to that for any other test sample and its result.
So now the question comes down to this: upon any particular awakening, how is the test subject’s epistemic state at any particular awakening significantly different from the lab tech’s epistemic state regarding any particular test sample? There is a one-to-one correspondence between test samples being evaluated and questions to the patient about their prognosis. Should they give the same answer, or is there a reason why they should give different answers? Just as with the patient, the lab tech knows that any randomly chosen individual has a 50% chance of giving a positive test result, but does she give the same answer to that question as to a different question: given that she has a particular sample in her hands, what is the probability that the person it belongs to will test positive? She knows that she has k times as many samples in her lab that will test positive than otherwise, but she has no way of knowing whether the sample in her hands is an initial sample or a replicate. It seems to me that halfers might be claiming these two questions are the same question, while thirders claim that they are different questions with different answers. Is this a fair description? If not, please clarify.
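If I've understood the protocol, the lab tech's situation can be sketched like this (the value of k and the sample counts are assumed, purely for illustration):

```python
import random

random.seed(2)
k = 2                  # replicates run after a positive initial test (assumed value)
PEOPLE = 100_000
positive_samples = 0
negative_samples = 0
for _ in range(PEOPLE):
    if random.random() < 0.5:      # person tests positive (50% base rate)
        positive_samples += 1 + k  # initial sample plus k replicates, all positive
    else:
        negative_samples += 1      # a single negative sample

# Fraction of samples on the bench that are positive: (1+k)/(2+k), not 1/2.
frac_positive = positive_samples / (positive_samples + negative_samples)
```

So "a random person tests positive" stays at 1/2, while "this sample in my hands is positive" is (1+k)/(2+k); with k=2 that is 3/4. Whether the second number deserves the name "credence" is the point under dispute.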
What you say is true for any outside observers, and for Sleeping Beauty after the experiment is over and the logbooks analyzed. But while Sleeping Beauty is in the experiment, this option is simply not available to her. The scenario has been carefully constructed to make this so, that’s what makes it an interesting problem. The whole point of the amnesia drug in the SB setup (or downloadable avatars, or forking universes, random passersby, whatever) is that she has NO justification nor even a method for NOT treating any of her awakenings as separate variables, because the information that could allow her to do this is unavailable to her. By construction—and this is the defining feature of Sleeping Beauty—all Sleeping Beauty’s awakenings are epistemically indistinguishable. She has no choice but to treat them all identically.
This phenomenon is a common occurrence in queueing systems where there’s a very definite and well-understood difference between omniscient “outside observers” and epistemically indistinguishable “arriving customers”, who can have different values for the probability of observing the system in state X, where the system is executing a well-defined random process, or even a combination random-deterministic process.
Thanks for your detailed response. I’ll make a few comments now, and address more of it later (short on time).
No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.
No, I wouldn’t say that. My argument is that you should use probability laws to get the answer. If you take ratios of expected counts, well, you have to show that what you get is actually a probability.
I definitely disagree with your bullet points about what halfers think
I said: “Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.”
You disagreed, and said:
Hm, I think that is what I’m saying. She does have to treat them all identically. They are the same variable. That’s why she has to say the same thing on Monday and Tuesday. That’s why an awakening contains no new info. If she had new evidence at an awakening, she’d give different answers under heads and tails.
I maintain that it is. I can guarantee you that it is. What obstacle do you see to accepting that? You’ve made noises that this is because the counts are correlated, but I haven’t seen any argument for this beyond bare assertion. Do you want to claim it is impossible for some reason, or are you just saying you haven’t seen a persuasive argument yet?
What would you require for proof? If I could show you a Markov chain whose behavior is isomorphic to iterated Sleeping Beauty, would that convince you?
I also am not sure what you mean when you say “use probability laws”. Is there a failure to comport with the Kolmogorov axioms? Is there a problem with the definition of the events? Do you mean Bayes’ Theorem, or some other law(s)? I also am deeply suspicious of the phrase “get the answer”. I will have no idea what this could mean until we can eliminate ambiguity about what the question is (there seems to be a lot of that going around), or what class of questions you’ll admit as legitimate.
Up to this point, I see we are actually in strenuous agreement on this aspect, so I can stop belaboring it.
I don’t mean to claim that as soon as Beauty awakes, new evidence comes to light that she can add to her store of bits in additive fashion, and thereby update her credence from 1⁄2 to 1⁄3 along the way. If this is the only kind of evidence that your theory of Bayesian updating will acknowledge, then it is too restrictive. Since Beauty is apprised of all the relevant details of the experimental process on Sunday evening, she can (and should) use the fact that the predicted frequency of awakenings into a reset epistemic state is dependent on the state of the coin toss to change the credence she reports on such awakenings from 1⁄2 to 1⁄3. She can tell you this on Sunday night, just as I can tell you now, before any of us enter into any such experimental procedure. So her prediction about what she should answer on an awakening does not change from Sunday evening to Monday morning.
The key pieces of information she uses to arrive at this revised estimate are:
That the questions will be asked in a reset epistemic state. This requires her to give the same answer on all awakenings.
That the frequency of awakenings is dependent in a specific way on the result of the coin toss. This requires her to update the credence she’ll report on awakenings from 1⁄2 to 1⁄3.
At this point, it is just assertion that it’s not a probability. I have reasons for believing it’s not one, at least, not the probability that people think it is. I’ve explained some of that reasoning.
I think it’s reasonable to look at a large sample ratio of counts (or ratio of expected counts). The best way to do that, in my opinion, is with independent replications of awakenings (that reflect all possibilities at an awakening). I probably haven’t worded this well, but consider the following two approaches. For simplicity, let’s say we wanted to do this (I’m being vague here) 1000 times.
Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings. But… whatever the total number of awakenings is, they are not independent. For example, on the first awakening it could be either heads or tails. On the second awakening, it could only be heads if it was heads on the first awakening. So, Beauty’s options on awakening #2 are (possibly) different from her options on awakening #1. We do not have 2 replicates of the same situation. This approach will give you the correct ratio of counts in the long run (for example, we do expect the # of heads & Monday to equal the # of tails & Monday and the # of tails & Tuesday).
Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails. She knows that it was a fair coin. She knows that if she’s awake it’s definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads. She knows that 50% of coin tosses would end up tails, so we assign 0.5 to tails, which implies 0.25 to tails&Monday and 0.25 to tails&Tuesday. If we generate observations from this 1000 times, we’ll get 1000 awakenings. We’ll end up with heads 50% of the time.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she should have on an awakening. In 1, we are replicating the entire experiment, with the double counting on tails.
In 1, people are using these ratios of expected counts to get the 1⁄3 answer. 1⁄3 is the correct answer to the question about the long-run frequencies of awakenings preceded by heads to awakenings preceded by tails. But I do not think it is the answer to the question about her credence of heads on an awakening.
In 2, the joint probabilities are determined ahead of time based on what we know about the experiment.
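The two replication schemes can be put side by side in code; a sketch (my own construction of the two approaches described above):

```python
import random

random.seed(3)

def replicate_experiment(n_tosses):
    """Approach 1: replicate the whole experiment; count heads over awakenings."""
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(n_tosses):
        if random.random() < 0.5:
            heads_awakenings += 1   # heads: one awakening
            total_awakenings += 1
        else:
            total_awakenings += 2   # tails: two awakenings
    return heads_awakenings / total_awakenings

def replicate_awakening_state(n_awakenings):
    """Approach 2: draw from the joint distribution assigned on an awakening:
    P(heads&Mon)=0.5, P(tails&Mon)=0.25, P(tails&Tue)=0.25."""
    heads = sum(random.random() < 0.5 for _ in range(n_awakenings))
    return heads / n_awakenings

ratio1 = replicate_experiment(100_000)       # ~1/3
ratio2 = replicate_awakening_state(100_000)  # ~1/2
```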
Let n2 and n3 be the counts, in repeated trials, of tails&Monday and tails&Tuesday, respectively. You will of course see that n2=n3. They are the same random variable. tails&Monday and tails&Tuesday are the same. It’s like what Jack said about types and tokens. It’s like Vladimir_Nesov said:
You said:
I don’t think it matters if she has the knowledge before the experiment or not. What matters is if she has new information about the likelihood of heads to update on. If she did, we would expect her accuracy to improve. So, for example, if she starts out believing that heads has probability 1⁄2, but learns something about the coin toss, her probability might go up a little if heads and down a little if tails. Suppose, for example, she is informed of a variable X. If P(heads|X)=P(tails|X), then why is she updating at all? Meaning, why is P(heads)=/=P(heads|X)? This would be unusual. It seems to me that the only reason she changes is because she knows she’d be essentially ‘betting’ twice on tails, but that really is distinct from credence for tails.
Yet one more variant. On my view it’s structurally and hence statistically equivalent to Iterated Sleeping Beauty, and I present an argument that it is. This one has the advantage that it does not rely on any science fictional technology. I’m interested to see if anyone can find good reasons why it’s not equivalent.
The Iterated Sleeping Beauty problem (ISB) is the original Standard Sleeping Beauty (SSB) problem repeated a large number N of times. People always seem to want to do this anyway with all the variations, to use the Law of Large Numbers to gain insight into what they should do in the single-shot case.
The Setup
As before, Sleeping Beauty is fully apprised of all the details ahead of time.
The experiment is run for N consecutive days (N is a large number).
At midnight 24 hours prior to the start of the experiment, a fair coin is tossed.
On every subsequent night, if the coin shows Heads, it is tossed again; if it shows Tails, it is turned over to show Heads.
(This process is illustrated by a discrete-time Markov chain with transition matrix

P = [ 1/2 1/2 ]
    [  1   0  ]

over the states [ Heads Tails ], the face the coin shows on a given morning, and the state vector is the row

x = [ P(Heads) P(Tails) ]

with consecutive state transitions computed as x * P^k.)
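A quick numerical check of that chain, with the matrix written out under my reading of the rules (Heads is tossed again; Tails is turned over to Heads):

```python
# States: [Heads, Tails] = what the coin shows on a given morning.
P = [[0.5, 0.5],   # Heads -> tossed again: Heads or Tails with prob 1/2 each
     [1.0, 0.0]]   # Tails -> turned over: Heads with prob 1

x = [0.5, 0.5]     # distribution after the initial midnight toss
for _ in range(50):  # compute x * P repeatedly until it converges
    x = [sum(x[i] * P[i][j] for i in range(2)) for j in range(2)]

# x converges to [2/3, 1/3]: the coin shows Heads on 2/3 of mornings.
```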
Each morning when Sleeping Beauty awakes, she is asked each of the following questions:
“What is your credence that the most recent coin toss landed Heads?”
“What is your credence that the coin was tossed last night?”
“What is your credence that the coin is showing Heads now?”
The first question is the equivalent of the question that is asked in the Standard Sleeping Beauty problem. The second question corresponds to the question “what is your credence that today is Monday?” (which should also be asked and analyzed in any treatment of the Standard Sleeping Beauty problem.)
Note: in this setup, 3) is different than 1) only because of the operation of turning the coin over instead of tossing it. This is just a perhaps too clever mechanism to count down the days (awakenings, actually) to the point when the coin should be tossed again. It may very well make a better example if we never touch the coin except to toss it, and use some other deterministic countdown mechanism to count repeated awakenings per coin toss. That allows easier generalization to the case where the number of days to awaken when Tails is greater than 2. It also makes 3) directly equivalent to the standard SB question, and also 1) and 3) have the same answers. You decide which mechanism is easier to grasp from a didactic point of view, and analyze that one.
After that, Beauty goes on about her daily routine, takes no amnesia drugs, sedulously avoids all matter duplicators and transhuman uploaders, and otherwise lives a normal life, on one condition: she is not allowed to examine the coin or discover its state (or the countdown timer) until the experiment is over.
Analysis
Q1: How should Beauty answer?
Q2: How is this scenario similar in key respects to the SSB/ISB scenario?
Q3: How does this scenario differ in key respects from the SSB/ISB scenario?
Q4: How would those differences if any make a difference to how Beauty should answer?
My answers:
Q1: Her credence that the most recent coin toss landed Heads should be 1⁄3. Her credence that the coin was tossed last night should be 1⁄3. Her credence that the coin shows Heads should be 2⁄3. (Her credence that the coin shows Heads should be 1⁄3 if we never turn it over, only toss, and 1/K if the countdown timer counts K awakenings per Tail toss.)
Q2: Note that Beauty’s epistemic state regarding the state of the coin, or whether it was tossed the previous midnight, is exactly the same on every morning, but without the use of drugs or other alien technology. She awakens and is asked the questions once every time the coin toss lands Heads, and twice every time it lands tails. In Standard Sleeping Beauty, her epistemic state is reset by the amnesia drugs. In this setup, her epistemic state never needs to be reset because it never changes, simply because she never receives any new information that could change it, including the knowledge of when the coin has been tossed to start a new cycle.
Q3: In ISB, a new experimental cycle is initiated at fixed times—Monday (or Sunday midnight). Here the start of a new “cycle” occurs with random timing. The question arises, does the difference in the speed of time passing make any difference to the moments of awakening when the question is asked? Changing labels from “Monday” and “Tuesday” to “First Day After Coin Toss” and “Second Day After Coin Toss” respectively makes no structural change to the operation of the process. Discrete-time Markov chains have no timing, they have only sequence.
In the standard ISB, there seems to be a natural unit of replication: the coin toss on Sunday night followed by whatever happens through the rest of the week. Here, that unit doesn’t seem so prominent, though it still exists as a renewal point of the chain. In a recurrent Markov chain, the natural unit of replication seems to be the state transition. Picking a renewal point is also an option, but only as a matter of convenience of calculation; it doesn’t change the analysis.
Q4: I don’t see how. The events, and the processes which drive their occurrence, haven’t changed that I can see, just our perspective in looking at them. What am I overlooking?
Iteration
I didn’t tell you yet how N is determined and how the experiment is terminated. Frankly, I don’t think it matters all that much as N gets large, but let’s remove all ambiguity.
Case A: N is a fixed large number. The experiment is terminated on the first night on which the coin shows Heads, after the Nth night.
Case B: N is not fixed in advance, but is guaranteed to be larger than some other large fixed number N’, such that the coin has been tossed at least N’ times. Once N’ tosses have been counted, the experiment is terminated on any following night on which the coin shows Heads, at the whim of the Lab Director.
Q5: If N (or N’) is large enough, does the difference between Case A and B make a difference to Beauty’s credence? (To help sharpen your answer, consider Case C: Beauty dies of natural causes before the experiment terminates.)
Note that in view of the discussion under Q3 above, we are picking some particular state in the transition diagram and thinking about recurrence to and from that state. We could pick any other state too, and the analysis wouldn’t change in any significant way. It seems more informative (to me at any rate) to think of this as an ongoing process that converges to stable behavior at equilibrium.
Extra Credit:
This gets right to the heart of what a probability could mean, what things can count as probabilities, and why we care about Sleeping Beauty’s credence.
Suppose Beauty is sent daily reports showing cumulative counts of the nightly heads/tails observations. The reports are sufficiently old as not to give any information about the current state of the coin or when it was last tossed. (E.g., the data in the report are from at least two coin tosses ago.) Therefore Beauty’s epistemic state about the current state of the coin always remains in its initial/reset state, with the following exception. Discuss how Beauty could use this data to--
corroborate that the coin is in fact fair as she has been told.
update her credences, in case she accrues evidence that shows the coin is not fair.
For me this is the main attraction of this particular model of the Sleeping Beauty setup, so I’m very interested in any possible reasons why it’s not equivalent.
Sorry I was slow to respond… busy with other things.
My answers:
Q1: I agree with you: 1⁄3, 1⁄3, 2⁄3
Q2. ISB is similar to SSB as follows: fair coin; woken up twice if tails, once if heads; epistemic state reset each day
Q3. ISB is different from SSB as follows: more than one coin toss; same number of interviews regardless of result of coin toss
Q4. It makes a big difference. She has different information to condition on. On a given coin flip, the probability of heads is 1⁄2. But, if it is tails we skip a day before flipping again. Once she has been woken up a large number of times, Beauty can easily calculate how likely it is that heads was the most recent result of a coin flip. In SSB, she cannot use the same reasoning. In SSB, Tuesday&heads doesn’t exist, for example.
Consider 3 variations of SSB:
Same as SSB, except if heads, she is interviewed on Monday, and then the coin is turned over to tails and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence has a 50% probability, and she should think of the days within a sequence as being equally likely. She’s asked about the current state of the coin. She should answer P(H)=1/4.
Same as SSB, except if heads, she is interviewed on Monday, and then the coin is flipped again and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday), (heads on Monday, heads on Tuesday) or (tails on Monday, tails on Tuesday). The first 2 sequences have a 25% chance each and the last one has a 50% chance. When asked about the current state of the coin, she should say P(H)=3/8.
The 1⁄2 solution to SSB results from similar reasoning. 50% chance for the sequence (Monday and heads). 50% chance for the sequence (Monday and tails, Tuesday and tails). P(H)=1/2
If you apply this kind of reasoning to ISB, where we are thinking of randomly selected day after a lot of time has passed, you’ll get P(H)=1/3.
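The three variants' answers can be checked by enumerating the Monday-Tuesday sequences and averaging over equally likely days within each sequence; a sketch (the dictionary keys are my own labels):

```python
# Each variant: list of (probability, sequence of coin faces shown) pairs.
# Within a sequence, each awakening is treated as equally likely.
variants = {
    "turned over after heads": [(0.5, "HT"), (0.5, "TT")],                 # P(H)=1/4
    "reflipped after heads":   [(0.25, "HH"), (0.25, "HT"), (0.5, "TT")],  # P(H)=3/8
    "standard SB (halfer)":    [(0.5, "H"), (0.5, "TT")],                  # P(H)=1/2
}

def p_heads(sequences):
    """P(coin currently shows heads) on a day chosen uniformly within the sequence."""
    return sum(p * seq.count("H") / len(seq) for p, seq in sequences)

answers = {name: p_heads(seqs) for name, seqs in variants.items()}
```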
I’m struggling to see how ISB isn’t different from SSB in meaningful ways.
Perhaps this is beating a dead horse, but here goes. Regarding your two variants:
I agree. When iterated indefinitely, the Markov chain transition matrix is:

P = [  0   1   0   0 ]
    [ 1/2  0  1/2  0 ]
    [  0   0   0   1 ]
    [ 1/2  0  1/2  0 ]

acting on state vector [ H1 H2 T1 T2 ], where H,T are coin toss outcomes and 1,2 label Monday,Tuesday. This has probability eigenvector [ 1⁄4 1⁄4 1⁄4 1⁄4 ]; 3 out of 4 states show Tails (as opposed to the coin having been tossed Tails). By the way, we have unbiased sampling of the coin toss outcomes here.
If the Markov chain model isn’t persuasive, the alternative calculation is to look at the branching probability diagram
[http://entity.users.sonic.net/img/lesswrong/sbv1tree.png (SB variant 1)]
and compute the expected frequencies of letters in the result strings at each leaf on Wednesdays.
I agree. Monday-Tuesday sequences occur with the following probabilities:
Also, the Markov chain model for the iterated process agrees:

P = [  0  1/2  0  1/2 ]
    [ 1/2  0  1/2  0  ]
    [  0   0   0   1  ]
    [ 1/2  0  1/2  0  ]

acting on state vector [ H1 H2 T1 T2 ] (here H,T label what the coin currently shows) gives probability eigenvector [ 1⁄4 1⁄8 1⁄4 3⁄8 ]
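Both eigenvectors are easy to verify numerically. The transition matrices below are my reconstruction of the chains as described, over states [ H1 H2 T1 T2 ]:

```python
def stationary(P, iters=200):
    """Iterate a distribution under P; for these chains the uniform start
    settles onto the stationary distribution."""
    k = len(P)
    x = [1.0 / k] * k
    for _ in range(iters):
        x = [sum(x[i] * P[i][j] for i in range(k)) for j in range(k)]
    return x

# Variant 1 (coin turned over after Monday heads); H,T = toss outcomes:
P1 = [[0.0, 1.0, 0.0, 0.0],    # H1 -> H2 (coin turned over, same toss)
      [0.5, 0.0, 0.5, 0.0],    # H2 -> new toss: H1 or T1
      [0.0, 0.0, 0.0, 1.0],    # T1 -> T2
      [0.5, 0.0, 0.5, 0.0]]    # T2 -> new toss: H1 or T1

# Variant 2 (coin reflipped after Monday heads); H,T = face currently shown:
P2 = [[0.0, 0.5, 0.0, 0.5],    # H1 -> reflip: H2 or T2
      [0.5, 0.0, 0.5, 0.0],    # H2 -> new toss: H1 or T1
      [0.0, 0.0, 0.0, 1.0],    # T1 -> T2
      [0.5, 0.0, 0.5, 0.0]]    # T2 -> new toss: H1 or T1

pi1 = stationary(P1)  # ~[1/4, 1/4, 1/4, 1/4]
pi2 = stationary(P2)  # ~[1/4, 1/8, 1/4, 3/8]
```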
Alternatively, use the branching probability diagram
[http://entity.users.sonic.net/img/lesswrong/sbv2tree.png (SB variant 2)]
to compute expected frequencies of letters in the result strings.
Because of the extra coin toss on Tuesday after Monday Heads, these are biased observations of coin tosses. (Are these credences?) But neither of these two variants is equivalent to Standard Sleeping Beauty or its iterated variants ISB and ICSB.
(Sigh). I don’t think your branching probability diagram is correct. I don’t know what other reasoning you are using. This is the diagram I have for Standard Sleeping Beauty
[http://entity.users.sonic.net/img/lesswrong/ssbtree.png (Standard SB)]
And this is how I use it, using exactly the same method as in the two examples above. With probability 1⁄2 the process accumulates 2 Tails observations per week, and with probability 1⁄2 accumulates 1 Heads observation. The expected number of observations per week is 1.5, the expected number of Heads observations per week is 0.5, the expected number of Tails observations is 1 per week.
Likewise when we record Monday/Tuesday observations per week instead of Heads/Tails, the expected number of Monday observations is 1, expected Tuesday observations 0.5, for a total of 1.5. But in both of your variants above, the expected number of Monday observations = expected number of Tuesday observations = 1.
Thanks for your response. I should have been clearer in my terminology. By “Iterated Sleeping Beauty” (ISB) I meant to name the variant that we here have been discussing for some time, which repeats the Standard Sleeping Beauty problem some number of times, say 1000. In 1000 coin tosses over 1000 weeks, the expected number of Heads awakenings is 500 and the expected number of Tails awakenings is 1000. I have no catchy name for the variant I proposed, but I can make up an ugly one if nothing better comes to mind; it could be called Iterated Condensed Sleeping Beauty (ICSB). But I’ll assume you meant this particular variant of mine when you mention ISB.
You say
“More than one coin toss” is the iterated part. As far as I can see, and I’ve argued it a couple times now, there’s no essential difference between SSB and ISB, so I meant to draw a comparison between my variant and ISB.
“Same number of interviews regardless of result of coin toss” isn’t correct. Sorry if I was unclear in my description. Beauty is interviewed once per toss when Heads, twice when Tails. This is the same in ICSB as in Standard and Iterated Sleeping Beauty. Is there an important difference between Standard Sleeping Beauty and Iterated Sleeping Beauty, or is there an important difference between Iterated Sleeping Beauty and Iterated Condensed Sleeping Beauty?
We not only skip a day before tossing again, we interview on that day too! I see how over time Beauty gains evidence corroborating the fairness of the coin (that’s exactly my later rhetorical question), but assuming it’s a fair coin, and barring Type I errors, she’ll never see evidence to change her initial credence in that proposition. In view of this, can you explain how she can use this information to predict with better than initial accuracy the likelihood that Heads was the most recent outcome of the toss? I don’t see how.
After relabeling Monday and Tuesday to Day 1 and Day 2 following the coin toss, Tuesday&Heads (H2) exists in none of these variants. So what difference is there?
Good and well, but—are these legitimate credences? If not, why not? And if so, why aren’t they also in the following:
Standard Iterated Sleeping Beauty is isomorphic to the following Markov chain, which just subdivides the Tails state in my condensed variant into Day 1 and Day 2:

P = [ 1/2 1/2  0 ]
    [  0   0   1 ]
    [ 1/2 1/2  0 ]

operating on row vector of states [ Heads&Day1 Tails&Day1 Tails&Day2 ], abbreviated to [ H1 T1 T2 ]
When I say isomorphic, I mean the distinct observable states of affairs are the same, and the possible histories of transitions from awakening to next awakening are governed by the same transition probabilities.
So either there’s a reason why my 2-state Markov chain correctly models my condensed variant that allows you to accept the 1⁄3 answers it computes, that doesn’t apply to the three-state Markov chain and its 1⁄3 answers (perhaps you came to those answers independently of my model), or else there’s some reason why the three-state Markov chain doesn’t correctly model the Iterated Sleeping Beauty process. Can you help me see where the difficulty may lie?
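For concreteness, here is a numerical check of the three-state chain as I've described it (a sketch, not a proof of the isomorphism claim):

```python
# States [H1 T1 T2] = [Heads&Day1 Tails&Day1 Tails&Day2].
P = [[0.5, 0.5, 0.0],   # H1 -> new toss: H1 or T1
     [0.0, 0.0, 1.0],   # T1 -> T2 (second awakening of the tails cycle)
     [0.5, 0.5, 0.0]]   # T2 -> new toss: H1 or T1

x = [1.0, 0.0, 0.0]     # start anywhere, e.g. Heads&Day1
for _ in range(200):
    x = [sum(x[i] * P[i][j] for i in range(3)) for j in range(3)]

# x converges to [1/3, 1/3, 1/3]: Heads on exactly 1/3 of awakenings.
```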
I assume you are referring to my variant, not what I’m calling Iterated Sleeping Beauty. If so, I’m kind of baffled by this statement, because under similarities, you just listed
fair coin
woken twice if Tails, once if Heads
epistemic state reset each day
With the emendation that 2) is per coin toss, and in 3) “each day” = “each awakening”, you have just listed three essential features that SSB, ISB and ICSB all have in common. It’s exactly those three things that define the SSB problem. I’m claiming that there aren’t any others. If you disagree, then please tell me what they are. Or if parts of my argument remain unclear, I can try to go into more detail.
Two ways to iterate the experiment:
and
This seems a distinction without a difference. The longer the iterated SB process continues, the less important is the distinction between counting tosses versus counting awakenings. This distinction is only about a stopping criterion, not about the convergent behavior of observations or coin tosses to expected values as it’s ongoing. Considered as an ongoing process of indefinite duration, the expected number of tosses and of observations of each type are well-defined, easily computed, and well-behaved with respect to each other. Over the long run, #awakenings accumulates 1.5 times more frequently than #tosses. Beauty is never more than two awakenings away from starting a new coin toss, so whether you choose to stop as soon as an awakening has completed or until you finish a coin-toss cycle, the relative perturbation in the statistics collected so far goes to zero. Briefly, there is no “natural” unit of replication independent of observer interest.
This would be an error. You are assigning a 50% probability to an observation (that it is Heads&Monday) without taking into account the bias that’s built in to the process for Beauty to make observations. Alternatively, if you are uncertain whether Monday is true or not—you know it might be Tuesday—then you should be uncertain that P(Heads)=P(Heads&Monday).
You the outside observer know the chance of observing that the coin lands Heads is 50%. You presumably know this because you have corroborated it through an unbiased observation process: look at the coin exactly once per toss. Once Beauty is put to sleep and awoken, she is no longer an outside observer; she is a participant in a biased observation process, so she should update her expectation about what her observation process will show. Different observation process, different observations, different likelihoods of what she can expect to see.
Of course, as a card-carrying thirder, I’m assuming that the question about credence is about what Beauty is likely to see upon awakening. That’s what the carefully constructed wording of the question suggests to me.
except that as we agreed, she’s not observing coin tosses, she’s observing biased samples of coin tosses. The connection between what she observes and the objective behavior of the coin is just what’s at issue here, so you can’t beg the question.
Agreed, but for this: it all depends on what you want credence to mean, and what it’s good for; see discussion below.
Let me uphold a distinction that’s continually skated over, but which is a crucial point of disagreement here. I think you’re confusing your evidence with the thing evidenced. And you are selectively filtering your evidence, which amounts to throwing away information. Tails&Monday and Tails&Tuesday are not the same; they are distinct observations of the same state of the coin, and thus perfectly correlated in that regard. Aside from the coin, they observe distinct days of the week, and thus different states of affairs. By a state of affairs I mean the conjunction of all the observable properties of interest at the moment of observation.
The distinction between types and tokens is only relevant when you want to interpret your tokens as being about something else, their types, rather than about themselves. But types are carved out of observers’ interests in their significance, which are non-objective, observer-dependent if anything is. Their variety and fineness of distinction is potentially infinite. As I mentioned above, a state of affairs is a conjunction of observable properties of interest. This Boolean lattice has exactly one top: Everything, and unknown atoms if any at bottom. Where you choose to carve out a distinction between type and token is a matter of observer interest.
I’ll certainly agree it isn’t desirable, but oughtn’t isn’t the same as isn’t, and in the Sleeping Beauty problem we have no choice. Monday and Tuesday just are different elements in a sample space, by construction.
What you seem to be talking about is using evidence that observations provide to corroborate or update Beauty’s belief that the coin is in fact fair. Is that a reasonable take? But due to the epistemic reset between awakenings, there is never any usable input to this updating procedure. I’ve already stipulated this is impossible. This is precisely what the epistemic reset assumption is for. I thought we were getting off this merry-go-round.
Ok, I guess it depends on what you want the word “credence” to mean, and what you’re going to use it for. If you’re only interested in some updating process that digests incoming information-theoretic quanta, like you would get if you were trying to corroborate that the coin was indeed a fair one to within a certain standard error, you don’t have it here. That’s not Sleeping Beauty; that’s her faithful but silent, non-memory-impaired lab partner with the log book. If Beauty herself is to have any meaningful notion of credence in Heads, it’s pointless for it to be about whether the coin is indeed fair. That’s a separate question, which in this context is a boring thing to ask her about, because it’s trivially obvious: she’s already accepted the information going in that it is fair, and she will never get new information from anywhere regarding that belief. And, while she’s undergoing the process of being awoken inside the experimental setup, a value of credence that’s not connected to her observations is not useful for any purpose that I can see, other than perhaps to maintain her membership in good standing in the Guild of Rational Bayesian Epistemologists. It doesn’t connect to her experience, it doesn’t predict frequencies of anything she has any access to; it’s gone completely metaphysical. Ok, what else is there to talk about? On my view, the only thing left is Sleeping Beauty’s phenomenology when awakened. On Bishop Berkeley’s view, that’s all you ever have.
Beauty gets usable, useful information (I guess it depends on what you want “information” to mean, too) once, on Sunday evening, and she never forgets it thereafter. This information is separate from, in addition to the information that the coin itself is fair. This other information allows her to make a more accurate prediction about the likelihood that, each time she is awoken, the coin is showing heads. Or whether it’s Monday or Tuesday. The information she receives is the details of the sampling process, which has been specifically constructed to give results that are biased with respect to the coin toss itself, and the day of the week. Directly after being informed of the structure of the sampling process, she knows it is biased and therefore ought to update her prediction about what relative frequencies per observation will be of each observable aspect of the possible state of affairs she’s awoken into—Heads vs. Tails, Monday vs. Tuesday.
I think I might understand the interpretation that a halfer puts on the question. I’m just doubtful of its interest or relevance. Do you see any validity (I mean logical coherence, as opposed to wrong-headedness) to this interpretation? Is this just a turf war over who gets to define a coveted word for their purposes?
Consider the case of Sleeping Beauty with an absent-minded experimenter.
If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.
If the coin comes up Tails, there is a tiny but non-zero chance that the experimenter mixes up Tails and Heads.
The resulting scenario is represented in a new sheet, Fuzzy two-day, of my spreadsheet document.
Under these assumptions, Beauty may no longer rule out Tuesday & Heads. She has no justification to assign all of the Heads probability mass to Monday & Heads. She is therefore constrained to conditioning on being woken in the way that the usual two-day variant suggests she should, and ends up with a credence arbitrarily close to 1⁄3 if we make the “absent-minded” probability tiny enough.
Why should we get a discontinuous jump to 1⁄2 as this becomes zero?
This sounds like the continuity argument, but I’m not quite clear on how the embedding is supposed to work; can you clarify? Instead of telling me what the experimenter rightly or wrongly believes to be the case, spell out for me how he behaves.
What does this mean operationally? Is there a nonzero chance, let’s call it epsilon or e, that the experimenter will incorrectly behave as if it’s Tuesday when it’s Monday? I.e., with probability e, Beauty is not awoken on Monday and the experiment ends, or she is awoken and sent home, and we go on to next Sunday evening without any awakenings that week? Then Heads&Tuesday still does not occur, with certainty. So maybe you meant that on Monday he doesn’t awaken Beauty at all, but awakens her on Tuesday instead? Is this confusion persistent across days, or is it a random confusion that happens each time he needs to examine the state of the coin to know what he should do?
And on Tuesday?
So when the coin comes up Tails, there is a nonzero probability, let’s call it delta or d, that the experimenter will incorrectly behave as if it’s Heads? I.e., on Tuesday morning, he will not awaken Beauty or will wake her and send her home until next Sunday? Then Tails&Tuesday is a possible nonoccurrence.
On reflection, my verbal description doesn’t match the reply I wanted to give, which was: the experimenter behaves such that the probability mass is allocated as in the spreadsheet.
Make it “on any day when Beauty is scheduled to remain asleep, the experimenter has some probability of mistakenly waking her, and vice-versa”.
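Taking that rule at face value, here is one possible formalization (my own guess, since the spreadsheet isn’t reproduced here, simplified to mistaken wakings only): an outside observer picks a toss and a day uniformly, and on Heads&Tuesday, the one day Beauty is scheduled to sleep, the experimenter mistakenly wakes her with probability e.

```python
# Hypothetical model (my assumption, not the spreadsheet itself):
# uniform measure over (toss, day); "woken" holds with probability 1 on
# Heads&Monday, Tails&Monday, and Tails&Tuesday, and with probability e
# on Heads&Tuesday.  Conditioning on being woken then gives
#   P(Heads | woken) = (1/4 + e/4) / (1/4 + e/4 + 1/2) = (1 + e) / (3 + e)
for e in (0.5, 0.1, 0.01, 0.001, 0.0):
    p_heads_given_woken = (1 + e) / (3 + e)
    print(e, p_heads_given_woken)
```

As e shrinks, the conditional probability tends smoothly to 1⁄3; nothing in this model produces a jump to 1⁄2 at e = 0.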
This is interesting. We shouldn’t get a discontinuous jump.
Consider 2 related situations:
Scenario 1: If heads, she is woken up on Monday, and the experiment ends on Tuesday. If tails, she is woken up on Monday and Tuesday, and the experiment ends on Wednesday. In this case, there is no ‘not awake’ option.
Scenario 2: If heads, she is woken up on Monday and Tuesday. On Monday she is asked her credence for heads. On Tuesday she is told “it’s Tuesday and heads” (but she is not asked about her credence; that is, she is not interviewed). If tails, it’s the usual: woken up both days and asked about her credence. The experiment ends on Wednesday.
In both of these scenarios, 50% of coin flips will end up heads. In both cases, if she’s interviewed she knows it’s either Monday&heads, Monday&tails or Tuesday&tails. She has no way of telling these three options apart, due to the amnesia.
I don’t think we should be getting different answers in these 2 situations. Yet, I think if we use your probability distributions we do.
I think there are two basic problems. One is that Monday&tails is really not different from Tuesday&tails. They are the same variable, the same experience: if she could time travel and repeat the Monday waking, it would feel the same to her as the Tuesday waking. The other issue is that in my scenario 2 above, when she is woken but before she knows whether she will be interviewed, it would look like there is a 25% chance it’s heads&Monday and a 25% chance it’s heads&Tuesday. And that’s probably a reasonable way to look at it. But that doesn’t imply that, once she finds out it’s an interview day, the probability of heads&Monday shifts to 1⁄3. That’s because on 50% of coin flips she will experience heads&Monday. That’s what makes this different from a usual joint probability table representing independent events.
My reasoning has been to consider scenario 1 from the perspective of an outside observer, who is uncertain about each variable: a) whether it is Monday or Tuesday, b) how the coin came up, c) what happened to Beauty on that day.
To that observer, “Tuesday and heads” is definitely a possibility, and it doesn’t really matter how we label the third variable: “woken”, “interviewed”, whatever. If the experiment has ended, then that’s a day where she hasn’t been interviewed.
If the outside observer learns that Beauty hasn’t been interviewed today, then they may conclude that it’s Tuesday and that the coin came up heads, thus a) they have something to update on and b) that observer must assign probability mass to “Tuesday & Heads & not interviewed”.
If the outside observer learns that Beauty has been interviewed, it seems to me that they would infer that it’s more likely, given their prior state of knowledge, that the coin came up tails.
To the outside observer, scenario 2 isn’t really distinct from scenario 1. The difference only makes a difference to Beauty herself.
However, I see no reason to treat Beauty herself differently than an outside observer, including the possibility of updating on being interviewed or on not being interviewed.
So, if my probability tables are correct for an outside observer, I’m pretty sure they’re correct for Beauty.
(My confidence in the tables themselves, however, has been eroded a little by my not being able to calculate Beauty—or an observer—updating on a new piece of information in the “fuzzy” variant, e.g. using P(heads|woken) as a prior probability and updating on learning that it is in fact Tuesday. It seems to me that for the math to check out, this operation should recover the “absent-minded experimenter” probability for “tuesday & heads & woken”. But I’m having a busy week so far and haven’t had much time to think about it.)
Why is that a problem? Why would N have to be equal to n1+n2+n3? Only because it does in your other example?
(ETA) I’m not sure where your formula “lim N-> infinity n1/(n1+n2+n3)” comes from—as the third example shows, it just doesn’t work in all cases. That doesn’t mean that your alternative formula is better in the sleeping beauty case.
Because this, lim N-> infinity n1/(n1+n2+n3), is p1 if the counts are from independent draws of a multinomial distribution.
We have outcome-dependent sampling here. Is lim N-> infinity n1/(n1+n2+n3) equal to p1 in that case? I’d like to see the statistical theory to back up the claim. It’s pretty clear to me that people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.
(ETA) The formula “lim N-> infinity n1/(n1+n2+n3)” is what people who claim the answer is 1⁄3 are using to justify it. The 1⁄2 solution just uses probability laws. That is, P(H)=1/2. P(W)=1, where W is the event that Beauty has been awakened. Therefore, P(H|W)=1/2.
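The two disputed limits can be estimated side by side (my own simulation, using the counts defined earlier in the thread: n1 for Monday&heads, n2 for Monday&tails, n3 for Tuesday&tails):

```python
import random

# My own simulation of the two disputed limits.  n1 counts Monday&heads,
# n2 counts Monday&tails, n3 counts Tuesday&tails; by construction n3
# occurs exactly when n2 does.
random.seed(2)
N = 100_000                    # number of coin tosses (experiments)
n1 = n2 = n3 = 0
for _ in range(N):
    if random.random() < 0.5:
        n1 += 1                # Monday & heads
    else:
        n2 += 1                # Monday & tails
        n3 += 1                # Tuesday & tails

print(n1 / (n1 + n2))          # per-toss limit: about 1/2
print(n1 / (n1 + n2 + n3))     # per-awakening limit: about 1/3
```

Both ratios are well-defined; the disagreement is over which one deserves to be called Beauty’s credence.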
I’ll have to disagree with that—there is a pretty clear interpretation in which 1⁄3 is a “correct” answer: if Sleeping Beauty is asked to bet X dollars that heads came up, and wins $60 if she’s right, up to which value of X should she accept the bet? (If the coin comes up tails, she ends up betting twice, once per awakening.)
In that scenario, X=$20 is the right answer, which corresponds to a probability of 1⁄3. Do you agree with that? (I haven’t read all the threads; you probably addressed this somewhere)
See, here I’m not using any “lim N-> infinity n1/(n1+n2+n3)” , so I feel you’re being unfair to 1/3rders.
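A sketch of the proposed bet (my own simulation, assuming “bet X to win $60” means Beauty stakes X at every awakening, receives $60 gross if the coin shows heads, and forfeits the stake otherwise):

```python
import random

# My own simulation of the proposed bet, assuming "bet X to win $60"
# means: stake X at every awakening, receive $60 gross on heads, forfeit
# the stake otherwise (so two losing bets on tails).
random.seed(3)

def expected_profit_per_toss(X, tosses=200_000):
    total = 0.0
    for _ in range(tosses):
        if random.random() < 0.5:
            total += 60 - X    # one winning bet on heads
        else:
            total -= 2 * X     # two losing bets on tails
    return total / tosses

# Analytically: 0.5*(60 - X) - 0.5*2*X = 30 - 1.5*X, which is zero at
# X = 20, i.e. betting odds of 20/60 = 1/3 per awakening.
for X in (10, 20, 30):
    print(X, expected_profit_per_toss(X))
```

Under that payout convention the break-even stake is $20, matching the 1⁄3 odds claimed above.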
I don’t agree, because the question is about her subjective probability at an awakening. The betting question you described is a different one.
For example, suppose I flip a coin and tell you you will win $60 if heads came up, but I require that you make the bet twice if tails came up? You’d be willing to bet up to $30, but that doesn’t mean you think heads has probability 1⁄3. If Beauty really thinks heads has probability 1⁄3, she’d be willing to accept the bet up to $30 even if we told her that we’d only accept one bet (of course, we wouldn’t tell her that she’s already made a bet on Tuesday. Payout would be on Wed).
The wikipedia page for the sleeping beauty problem says:
That’s why I think people are picturing counts in a contingency table when they come up with the 1⁄3 answer.
She would also bet at an awakening. If you ask her to bet when she has just woken up, it would seem weird for her to say “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to $20, i.e. 1⁄3 of the winnings if it’s heads.”
It seems even weirder in the Xtreme Sleeping Beauty, where she’s awakened a thousand times : “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to 6 cents”.
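The 6-cent figure follows from the same expected-value bookkeeping (my own arithmetic, assuming a win pays $60 net and each of the n tails-awakenings loses the stake X):

```python
# My bookkeeping for the break-even stake, assuming a win pays $60 net
# and each of the n tails-awakenings loses the stake X:
#   EV per toss = 0.5*60 - 0.5*n*X, which is zero at X = 60/n.
for n in (2, 1000):
    print(n, 60 / n)
```

With n = 2 the break-even stake is $30; with a thousand tails-awakenings it drops to $0.06, the six cents mentioned above.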
Yes, you get a different result if you change the betting rules where only one bet per “branch” counts, but I don’t see why that’s closer to the problem as originally stated.
I guess I don’t see why it’s weird. The number of times she will bet is dependent on the outcome. So, even though at each awakening she thinks probability of heads is 1⁄2, she knows if it’s tails she’ll have to bet many more times than if heads. We’re essentially just making her bet more money on a loss than on a win.
In that case, what does it even mean to say “my subjective probability for heads is 1/2”? Subjective probability is often described in terms of betting—see here.
Seems to me this is mostly a quarrel of definitions, and that when you say “people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it”, you’re being unfair. They’re just using a different definition of “subjective probability”.
Don’t you think so?
Based on my interaction with people here, I think we all are talking about the same thing when it comes to subjective probability.
I agree that you can use betting to describe subjective probability, but there are a lot of possible ways to bet.
“Subjective probability” is a basic term in decision theory and economics, though. If you want to roll your own metric, surely you should call it something else—to avoid much confusion.
That is why I’d rather talk in terms of bets than subjective probability—they don’t require precise technical definitions.