Conditioning on Observers
Response to Beauty quips, “I’d shut up and multiply!”
Related to The Presumptuous Philosopher’s Presumptuous Friend, The Absent-Minded Driver, Sleeping Beauty gets counterfactually mugged
This is somewhat introductory. Observers play a vital role in the classic anthropic thought experiments, most notably the Sleeping Beauty and Presumptuous Philosopher gedankens. Specifically, it is remarkably common to condition simply on the existence of an observer, in spite of the continuity problems this raises. The source of confusion appears to be based on the distinction between the probability of an observer and the expectation number of observers, with the former not being a linear function of problem definitions.
There is a related difference between the expected gain of a problem and the expected gain per decision, which has been exploited in more complex counterfactual mugging scenarios. As in the case of the 1⁄2 or 1⁄3 confusion, the issue is the number of decisions that are expected to be made, and recasting problems so that there is at most one decision provides a clear intuition pump.
In the classic Sleeping Beauty problem, experimenters flip a fair coin on Sunday, sedate you and induce amnesia, and wake you either on just the following Monday (heads) or on both the following Monday and Tuesday (tails). Each time you are woken, you are asked for your credence that the coin came up heads.
The standard answers to this question are that the answer should be 1⁄2 or 1⁄3. For convenience let us say that the event W is being woken, H is that the coin flip came up heads and T is that the coin flip came up tails. The basic logic for the 1⁄2 argument is that:
P(H) = P(T) = 1⁄2, P(W|H) = P(W|T) = P(W) = 1, so by Bayes' rule P(H|W) = 1⁄2
The obvious issue with this approach is one of continuity. The assessment is independent of the number of times you are woken in each branch, which implies that every branch with a non-zero number of observers has a posterior probability equal to its prior probability. Clearly the subjective probability of a zero-observer branch is zero, so this implies a discontinuity in the decision theory. Whilst not in and of itself fatal, it is surprising. There is also apparent secondary confusion over the number of observations in the Sleeping Beauty problem, for example:
If we want to replicate the situation 1000 times, we shouldn’t end up with 1500 observations. The correct way to replicate the awakening decision is to use the probability tree I included above. You’d end up with expected cell counts of 500, 250, 250, instead of 500, 500, 500.
Under these numbers, the 1000 observations made have required 500 heads and 250 tails, as each tail produces an observation on both Monday and Tuesday. This is not the behaviour of a fair coin. Further consideration of the problem shows that the naive conditioning on W is the point at which the number of observations would be expected to enter. Hence in 900 observations, there would be 300 heads and 300 tails, with 600 observations following a tail and 300 following a head. To make this rigorous, let Monday and Tuesday be the events of being woken on Monday and Tuesday respectively. Then:
P(H|Monday) = 1⁄2, P(Monday|W) = 2⁄3 (P(Monday|W) = 2*P(Tuesday|W) as Monday occurs regardless of coin flip)
P(H|W) = P(H ∩ Monday|W) + P(H ∩ Tuesday|W) (Total Probability)
= P(H|Monday ∩ W).P(Monday|W) + 0 (As P(Tuesday|H) = 0)
= P(H|Monday).P(Monday|W) = 1⁄3 (As Monday ∩ W = Monday)
This would appear to support the view of updating on existence. Why this holds in the analysis is easy to answer: the only day on which the probability of heads is non-zero is Monday, and given an awakening it is not guaranteed that it is Monday. This should not be confused with the correct observation that there is always one awakening on Monday. This has caused problems because “Awakening” is not an event which occurs only once in each branch. Indeed, using the 1⁄3 answer and working back to try to find P(W) yields P(W) = 3⁄2, which is a strong indication that it is not the probability that matters, but E(# of instances of W). As intuition pumps, we can consider some related problems.
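The counting argument above can be checked by enumerating awakenings exactly. This is only a minimal sketch, not part of the original analysis: the name `awakening_stats` and the dictionary encoding of the wake-up schedule are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed encoding of the classic protocol:
# heads -> woken on Monday only, tails -> woken on Monday and Tuesday.
SCHEDULE = {"H": ["Mon"], "T": ["Mon", "Tue"]}
P_COIN = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def awakening_stats(schedule=SCHEDULE, p_coin=P_COIN):
    """Return (expected awakenings per run, share of awakenings after heads,
    share of awakenings that fall on Monday)."""
    expected = sum(p_coin[c] * len(days) for c, days in schedule.items())
    heads_share = p_coin["H"] * len(schedule["H"]) / expected
    monday_weight = sum(p_coin[c] * days.count("Mon") for c, days in schedule.items())
    return expected, heads_share, monday_weight / expected


expected, heads_share, monday_share = awakening_stats()
print(expected, heads_share, monday_share)  # 3/2 1/3 2/3
```

The 3⁄2 here is the expected number of awakenings per run, which is why it can exceed 1 where a probability could not.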
Sleeping Twins
This experiment features Omega. It announces that it will place you and an identical copy of you in identical rooms, sedated. It will then flip a fair coin. If the coin comes up heads, it will wake one of you randomly. If it comes up tails, it will wake both of you. It will then ask what your credence for the coin coming up heads is.
You wake up in a nondescript room. What is your credence?
It is clear from the structure of this problem that it is almost identical to the Sleeping Beauty problem. It is also clear that your subjective probability of being woken is 1⁄2 if the coin comes up heads and 1 if it comes up tails, so conditioning on the fact that you have been woken, the coin came up heads with probability 1⁄3. Why is this so different to the Sleeping Beauty problem? The fundamental difference is that in the Sleeping Twins problem, you are woken at most once, and possibly not at all, whereas in the Sleeping Beauty problem you are woken once or many times. On the other hand, the number of observer moments on each branch of the experiment is equal to that of the Sleeping Beauty problem, so it is odd that the manner in which these observations are achieved should matter. Clearly information flow between the twins is not possible, just as amnesia prevents it in the original problem. Let us drive this further.
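Because each twin is woken at most once, ordinary Bayesian conditioning applies directly. A small sketch of that calculation follows; the helper name `posterior_heads` is hypothetical and introduced only for this example.

```python
from fractions import Fraction


def posterior_heads(p_wake_given_heads, p_wake_given_tails, prior_heads=Fraction(1, 2)):
    """Bayes' rule for P(H | you were woken), valid when you are woken at most once."""
    num = prior_heads * p_wake_given_heads
    den = num + (1 - prior_heads) * p_wake_given_tails
    return num / den


# Sleeping Twins: a given twin is woken with probability 1/2 on heads and 1 on tails.
twins = posterior_heads(Fraction(1, 2), Fraction(1))
print(twins)  # 1/3
```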
Probabilistic Sleeping Beauty
We return to the experimenters and a new protocol. The experimenters fix a constant k in {1, 2, ..., 20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on day k. If the coin comes up heads and the D20 comes up k, they will wake you on day 1. In either case they will ask you for your credence that the coin came up heads.
You wake up. What is your credence?
In this problem, the multiple distinct copies of you have been removed, at the cost of an explicit randomiser. It is clear that the structure of the problem is independent of the specific value of the constant k. It is also clear that updating on being woken, the probability that the coin came up heads is 1⁄21 regardless of k. This is troubling for the 1⁄2 answer, however, as playing this game with a single die roll and all possible values of k recovers the Sleeping Beauty problem (modulo induced amnesia). Again, having reduced the expected number of observations to be in [0,1], intuition and calculation seem to imply a reduced chance for the heads branch conditioned on being woken.
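The 1⁄21 figure can be confirmed by brute-force enumeration of the coin and the die. The sketch below is illustrative only; `credence_heads` is a made-up name, and the `sides` parameter is an assumption added so the die size can be varied.

```python
from fractions import Fraction
from itertools import product


def credence_heads(k, sides=20):
    """Enumerate (coin, die) outcomes and condition on being woken at all."""
    woken_heads = woken_total = Fraction(0)
    for coin, roll in product("HT", range(1, sides + 1)):
        weight = Fraction(1, 2) * Fraction(1, sides)
        # Tails always wakes you; heads wakes you only if the die shows k.
        woken = (coin == "T") or (roll == k)
        if woken:
            woken_total += weight
            if coin == "H":
                woken_heads += weight
    return woken_heads / woken_total


results = {k: credence_heads(k) for k in range(1, 21)}
print(set(results.values()))  # {Fraction(1, 21)}
```

Every value of k gives the same posterior, matching the claim that the structure of the problem does not depend on k.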
This further suggests that the misunderstanding in Sleeping Beauty is one of naively looking at P(W|H) and P(W|T), when the expected numbers of wakings are E(#W|H) = 1, E(#W|T) = 2.
The Apparent Solution
If we allow conditioning on the number of observers, we correctly calculate probabilities in the Sleeping Twins and Probabilistic Sleeping Beauty problems. It is correctly noted that a “single paying” bet is accepted in Sleeping Beauty with odds of 2; this follows naturally under the following decision schema: “If it is your last day awake the decision is binding, otherwise it is not”. Let the event of being the last day awake be L. Then:
P(L|W ∩ T) = 1⁄2, P(L|W ∩ H) = 1, the bet pays k for a cost of 1
E(Gains|Taking the bet) = (k-1) P(L|W ∩ H)P(H|W) - P(L|W ∩ T) P(T|W) = (k-1) P(H|W) - P(T|W)/2
Clearly, if a payout of 2 is the point at which the bet becomes acceptable, then P(H|W) - P(T|W)/2 = 0 at that payout, so 2.P(H|W) = P(T|W), which contraindicates the 1⁄2 solution (under 1⁄2 the bet would already break even at a payout of 3⁄2). The 1⁄3 solution, on the other hand, works as expected. Trivially the same result holds if the choice of important decision is randomised. In general, if a decision is made by a collective of additional observers in identical states to you, then the existence of the additional observers does not change the overall payoffs. This can be modelled either by splitting payoffs between all decision makers in a group making identical decisions, or equivalently by calculating as if there is a 1/N chance that you dictate the decision for everyone given N identical instances of you (“Evenly distributed dictators”). To do otherwise leads to fallacious expected gains, as exploited in Sleeping Beauty gets counterfactually mugged. Of course, if the gains are linear in the number of observers, then this cancels with the division of responsibility and the observer count can be neglected, as in accepting 1⁄3 bets per observer in Sleeping Beauty.
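To see how the two candidate credences behave under the "last day awake is binding" schema, the expression for E(Gains|Taking the bet) can be evaluated directly. This is a sketch under the stated assumptions (stake of 1, P(L|W ∩ H) = 1, P(L|W ∩ T) = 1⁄2); the function name `expected_gain` is hypothetical.

```python
from fractions import Fraction


def expected_gain(payout, p_heads_given_w):
    """E(Gains | taking the bet) when only the last awakening's decision binds.

    Uses P(L | W and H) = 1 and P(L | W and T) = 1/2, with a stake of 1.
    """
    p_tails_given_w = 1 - p_heads_given_w
    return (payout - 1) * p_heads_given_w - Fraction(1, 2) * p_tails_given_w


thirder = expected_gain(2, Fraction(1, 3))
halfer = expected_gain(2, Fraction(1, 2))
print(thirder, halfer)  # 0 1/4
```

With a credence of 1⁄3 the bet at a payout of 2 is exactly fair, whereas with a credence of 1⁄2 the same bet looks strictly favourable.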
The Absent Minded Driver
If we consider the problem of The Absent-Minded Driver, then we are faced with another scenario in which depending on decisions made there are varying numbers of observer moments in the problem. This allows an apparent time inconsistency to appear, much as in Sleeping Beauty. The problem is as follows:
You are a mildly amnesiac driver on a motorway. You notice approaching junctions but recall nothing. There are 2 junctions. If you turn off at the first, you gain nothing. If you turn off at the second, you gain 4. If you continue past the second, you gain 1.
Clearly analysis of the problem shows that if p is the probability of going forward (constant because of the amnesia), the payout is p[p+4(1-p)], maximised at p = 2⁄3. However, once on the road and approaching a junction, let the probability that you are approaching the first be α. The expected gain is then claimed to be αp[p+4(1-p)]+(1-α)[p+4(1-p)], which is not maximised at 2⁄3 unless α = 1. It can be immediately noticed that given p, α = 1/(p+1). However, this is still not correct.
Instead, we can observe that all non zero payouts are the result of two decisions, at the first and second junctions. Let the state of being at the first junction be A, and the second be B. We observe that:
E(Gains due to one decision|A) = 1 . (1-p)*0 + 1⁄2 . p[p+4(1-p)]
E(Gains due to one decision|B) = 1⁄2 . [p+4(1-p)]
P(A|W) = 1/(p+1), P(B|W) = p/(p+1), E(#A) = 1, E(#B) = p, (#A, #B independent of everything else)
Hence the expected gain per decision:
E(Gains due to one decision|W) = [1 . (1-p)*0 + 1⁄2 . p[p+4(1-p)]]/(p+1) + 1⁄2 . [p+4(1-p)].p/(p+1) = [p+4(1-p)].p/(p+1)
But as has already been observed in this case the number of decisions made is dependent on p, and thus
E(Gains|W) = [p+4(1-p)].p , which is the correct metric. Observe also that E(Gains|A) = E(Gains|B) = p[p+4(1-p)]/2
As a result, there is no temporal inconsistency in this problem; the approach of counting up over all observer moments, and splitting outcomes due to a set of decisions across the relevant decisions is seemingly consistent.
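The consistency claim can be checked numerically. The following sketch (function names and the grid resolution are assumptions chosen for illustration) maximises the planning payoff p[p+4(1-p)] over a grid and verifies that the per-decision gain, multiplied by the expected number of decisions 1 + p, reproduces it.

```python
from fractions import Fraction


def planning_payoff(p):
    """Expected payout when you continue at each junction with probability p."""
    return p * (p + 4 * (1 - p))


def gain_per_decision(p):
    """Expected gain attributed to one decision, averaged over junctions A and B."""
    p_a, p_b = 1 / (p + 1), p / (p + 1)
    gain_a = Fraction(1, 2) * p * (p + 4 * (1 - p))
    gain_b = Fraction(1, 2) * (p + 4 * (1 - p))
    return p_a * gain_a + p_b * gain_b


# Exact rational grid on [0, 1] that contains 2/3.
grid = [Fraction(i, 300) for i in range(301)]
best_p = max(grid, key=planning_payoff)
print(best_p, planning_payoff(best_p))  # 2/3 4/3

# Scaling the per-decision gain by the expected number of decisions (1 + p)
# recovers the planning payoff at every grid point.
consistent = all(gain_per_decision(p) * (1 + p) == planning_payoff(p) for p in grid)
print(consistent)  # True
```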
Sleeping Beauty gets Counterfactually Mugged
In this problem, the Sleeping Beauty problem is combined with a counterfactual mugging. If Omega flips a head, it simulates you, and if you would give it $100 it will give you $260. If it flips a tail, it asks you for $100 and if you give it to Omega, it induces amnesia and asks again the next day. On the other hand if it flips a tail and you refuse to give it money, it gives you $50.
Hence precommitting to give the money nets $30 on the average, whilst precommitting not to nets $25 on the average. However, since you make exactly 1 decision on either branch if you refuse, whilst you make 3 decisions every two plays if you give Omega money, per decision you make $25 from refusing and $20 from accepting (obtained via spreading gains over identical instances of you). Hence correct play depends on whether Omega will ensure you get a consistent number of decisions or plays of the whole scenario. Given a fixed number of plays of the complete scenario, we thus have to remember to account for the increased numbers of decisions made in one branch of possible play. In this sense it is identical to the Absent Minded Driver, in that the number of decisions is a function of your early decisions, and so must be brought in as a factor in expected gains.
Alternatively, from a more timeless view we can note that your decisions in the system are perfectly correlated; it is thus the case that there is a single decision made by you, to give money or not to. A decision to give money nets $30 on average, whilst a decision not to nets only $25; the fact that they are split across multiple correlated decisions is irrelevant. Alternatively, conditional on choosing to give money you have a 1⁄2 chance of there being a second decision, so the expected gains are $30 rather than $20.
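The per-play and per-decision figures can be reproduced with a short tabulation. This is a sketch only: the table layout and names are assumptions, and it counts the simulated decision on the heads branch as one decision, which is what "3 decisions every two plays" appears to imply.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# (payoff in dollars, number of decisions) on each coin branch, per policy.
# Assumption: Omega's simulation on heads counts as a single decision.
POLICIES = {
    "give": {"H": (260, 1), "T": (-200, 2)},  # tails: pay $100 on each of two days
    "refuse": {"H": (0, 1), "T": (50, 1)},
}


def per_play(policy):
    """Expected dollars per complete play of the scenario."""
    return sum(HALF * payoff for payoff, _ in POLICIES[policy].values())


def per_decision(policy):
    """Expected dollars per play divided by expected decisions per play."""
    decisions = sum(HALF * n for _, n in POLICIES[policy].values())
    return per_play(policy) / decisions


for name in POLICIES:
    print(name, per_play(name), per_decision(name))
# give 30 20
# refuse 25 25
```

Which column is the right one to maximise depends on whether the number of plays or the number of decisions is held fixed.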
Conclusion
The approach of updating on the number of observer moments is comparable to UDT and other timeless approaches to decision theory; it does not care how the observers come to be, be it a single amnesiac patient over a long period or a series of parallel copies or simulations. All that matters is that they are forced to make decisions.
In cases where a number of decisions are discarded, the splitting of payouts over the decisions, or equivalently remembering the need for your decision not to be ignored, yields sane answers. This can also be considered as spreading a single pertinent decision out over some larger number of irrelevant choices.
Correlated decisions are not so easy; care must be taken when the number of decisions is dependent on behaviour.
In short, the 1⁄3 answer to Sleeping Beauty would appear to be fundamentally correct. Defences of the 1⁄2 answer appear to have problems with the number of observer moments being outside [0,1] and thus not being probabilities. This is the underlying danger. Use of anthropic or self-indication probabilities yields sane answers in the problems considered, and can cogently answer typical questions designed to yield a non-anthropic intuition.
Intuition Pump
Suppose 50% of people in a population have an asymptomatic form of cancer. None of them know if they have it. One of them is randomly selected and a diagnostic test is carried out (the result is not disclosed to them). If they don’t have cancer, they are woken up once. If they do have it, they are woken up 9 times (with amnesia-inducing drug administered each time, blah blah blah). Each time they are woken up, they are asked their credence (subjective probability) for cancer.
Imagine we do this repeatedly, randomly selecting people from a population that has 50% cancer prevalence.
World A: Everyone uses thirder logic
Someone without cancer will say: “I’m 90% sure I have cancer”
Someone with cancer will say: “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.” “I’m 90% sure I have cancer.”
Notice, everyone says they are 90% sure they have cancer, even though only 50% of them actually do.
Sure, the people who have cancer say it more often, but does that matter? At an awakening (you can pick one), people with cancer and people without are saying the same thing.
World B: Everyone uses halfer logic
Someone without cancer will say: “I’m 50% sure I have cancer”
Someone with cancer will say: “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.” “I’m 50% sure I have cancer.”
Here, half of the people have cancer, and all of them say they are 50% sure they have cancer.
My question: which world contains the more rational people?
I do like this intuition pump—it gets across how weird the situation is—but here’s one thing you may not have realised:
Once the experiment is over, then even based on thirder logic, people ought to cease saying “I’m 90% sure...” and start saying “I’m 50% sure...”
(Because they have forgotten the ‘extra information’ that led them to give 90% rather than 50% as their probability of cancer.)
I like your example because it mirrors my thinking about the Sleeping Beauty puzzle, and brings it out even more strongly: whether the 1⁄2 or 1⁄3 answer (or 50% or 90% answer) is appropriate depends on which probability one is interested in.
Depends on how you define someone being rational/well-calibrated.
In Halfer Country, when someone says they’re 50% sure of having cancer, they do indeed have a 50% chance of having cancer.
In Thirder Land, any time someone makes the statement ‘I’m 90% sure I have cancer’, the statement has a 90% chance of coming from someone who has cancer.
Some of us were evidently born in Thirder Land, others in Halfer Country; my intuition works halferly, but the problem’s a bit like a Necker cube for me now—if I think hard enough I can press myself into seeing the other view.
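The two notions of calibration in this comment correspond to two different frequencies, which can be computed side by side. A minimal sketch, using the 50% prevalence and the 9-versus-1 awakening counts from the cancer example; the variable names are illustrative assumptions.

```python
from fractions import Fraction

PREVALENCE = Fraction(1, 2)
AWAKENINGS = {"cancer": 9, "healthy": 1}

# Fraction of people who have cancer (the frequency Halfer Country's 50% tracks).
per_person = PREVALENCE

# Fraction of statements that come from someone with cancer
# (the frequency Thirder Land's 90% tracks).
cancer_statements = PREVALENCE * AWAKENINGS["cancer"]
healthy_statements = (1 - PREVALENCE) * AWAKENINGS["healthy"]
per_statement = cancer_statements / (cancer_statements + healthy_statements)

print(per_person, per_statement)  # 1/2 9/10
```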
Your comment and neq1’s intuition pump prompted me to create the following reformulation of the problem without amnesia:
I flip a coin hidden from you, then ask you to name a number. If the coin came up heads, I write your answer into my little notebook (which, incidentally, is all you care about). If it came up tails, I write it in the notebook twice.
When the problem is put this way, it’s clear that the answer hinges on how exactly you care about my notebook. Should it matter to us how many times we express our credence in something?
You make a good point. However, I’d argue that those in Thirder Land had nothing to update on. In fact, it’s clear they didn’t since they all give the same answer. If 50% of the population has cancer, but they all think they do with 0.9 probability, that’s not necessarily less accurate than if everyone thinks they have cancer with 0.5 probability (depends on your loss function or whatever). But the question here is really about whether you had evidence to shift from .5 to .9.
Another Rival Intuition Pump
Suppose that for exactly one day every week you sleep during the day and wake up in the evening, and for every other day you sleep at night and wake up in the morning.
Suppose that for a minute after waking, you can reason logically but cannot remember what day it is (and have no way of telling the time).
Then during that minute, surely your subjective probability of it being morning is 6⁄7.
OK now let’s change things up a little:
At the beginning of every week, a coin is flipped. If heads then rather than having 6 days diurnal and 1 day nocturnal, you just have 1 day nocturnal and six days in hibernation. If tails then you have 6 days diurnal and 1 day in hibernation.
Then surely the addition of the coin flip and the hibernation doesn’t change the fact that (for any given awakening) you have a 6⁄7 probability of waking in the morning.
Well, consider:
Scenario 1: You have a bag containing one red ball and an arbitrarily large number of green balls. You reach in and pull out one ball at random. What is the probability that the ball is red?
Scenario 2: You have a bag containing one red ball and another bag containing an arbitrarily large number of green balls. A fair coin is flipped; if heads, you are handed the bag with the red ball, and if tails you are handed the bag with the green balls (you can’t tell the difference between the bags). You reach in and pull out one ball at random. What is the probability that the ball is red?
In scenario 1, P(red) is vanishingly small. In scenario 2, P(red) is 1⁄2.
The disanalogy is that you actually pull out all of the green balls, not just one.
Indeed—introducing amnesia and pulling out each of the green balls in turn might muddy this one up as well.
I think the coin flip does change things. In fact, I don’t see why it wouldn’t.
In case 1, you know you are somewhere along a path where you will wake up on one night and wake up on 6 mornings. You can’t determine where along that path you are, so you guess morning has probability 6⁄7
In case 2, there is a 50% chance you are somewhere on a path where you wake up once at night (and never in the morning), and a 50% chance you are somewhere on a path where you wake up on 6 mornings and 0 nights. So, probability it is morning is 1⁄2.
So even though, in the long run, 6⁄7 of your awakenings are in the morning, and you have (for that first minute) no information to help you work out which awakening this is, you still think that on any given awakening you ought to feel that it’s just as likely to be morning as evening?
Sure you can bite the bullet if you like, but quite frankly your intuitions are failing you if you can’t see why that sounds strange.
Not all awakenings are equally likely. 50% chance it’s one of the 6 morning awakenings. 50% chance it’s the one night awakening.
Suppose two weeks’ worth of coins are tossed ahead of time.
Then with probability 1⁄4, you will wake up twice in the evening. With probability 1⁄2, you will wake up 6 times in the morning and once in the evening. And with probability 1⁄4 you will wake up 12 times in the morning.
Then by your logic, you ought to say that your probability of waking in the morning is (1/4)x(0/2) + (1/2)x(6/7) + (1/4)x(12/12) = 3⁄7 + 1⁄4 = 19⁄28, rather than 1⁄2 if the coins are tossed ‘just in time’.
How can whether the coins are tossed in advance or not change the subjective probability?
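The arithmetic in this comment, and the contrast with weighting by awakenings, can be reproduced exactly. The sketch below is illustrative; the `OUTCOMES` table and variable names are assumptions, with a heads week contributing 1 evening awakening and a tails week contributing 6 morning awakenings.

```python
from fractions import Fraction

# (probability, morning awakenings, evening awakenings) for two weeks of coins.
OUTCOMES = [
    (Fraction(1, 4), 0, 2),   # heads, heads
    (Fraction(1, 2), 6, 1),   # one heads week, one tails week
    (Fraction(1, 4), 12, 0),  # tails, tails
]

# Average the within-path morning fraction over paths (the logic being challenged).
path_weighted = sum(p * Fraction(m, m + e) for p, m, e in OUTCOMES)

# Expected morning awakenings divided by expected total awakenings.
awakening_weighted = (
    sum(p * m for p, m, _ in OUTCOMES) / sum(p * (m + e) for p, m, e in OUTCOMES)
)

print(path_weighted, awakening_weighted)  # 19/28 6/7
```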
By neq1’s previous reasoning, there’s 50% chance of waking in the mornings and 50% chance of waking in the evening for any particular week. That is the case whether the coins are tossed in advance or not. The probability of a particular morning awakening would be 1⁄12.
I’m not sure where you got your (6/7) figure for neq1’s calculations.
neq1 admits that in my original scenario, before I introduced the coin and hibernations, you have a 6⁄7 probability of waking in the morning. The case where one of the two coins is heads and the other is tails is equivalent to this.
Sorry, I’m not following. What are you doing with these two weeks’ worth of coins?
In the situation I described previously, at the beginning of each week a coin is tossed.
What I’m doing is saying: Suppose week 1 AND week 2’s coin tosses both take place prior to the beginning of week 1.
Your original question was about one week, not two (I thought).
Are we just doing this twice? What happens between the weeks? Do they know the experiment has started over?
Could be, or could be a great number of weeks. Shouldn’t make any difference.
Nothing (except that, if necessary, the next week’s coin is tossed.)
They know it will start over, and once the ‘minute of confusion’ has passed, they become aware of all that has happened up to now. But during the ‘minute of confusion’ they only know that ‘an experiment is in progress and it is week n for some n’ but don’t know which n.
Once you go more than 1 week it’s not the sleeping beauty problem anymore. Half the time she’s woken up once at night, 1⁄4 of the time she’s woken up 6 times in the morning and once at night, 1⁄4 of the time she’s woken up 12 times in the morning. This doesn’t have to do with when the coins are tossed. It’s just that, if you do it for 1 week you have the sleeping beauty problem; if you do it multiple weeks you don’t.
(1) You got the numbers wrong. “Half the time” should say “1/4 of the time”, the first “1/4 of the time” should say “half the time”, and “once at night” should say “twice at night”.
(2) It’s all very well to state that the situation is different, but you haven’t provided any reason why (i) a long sequence of (back-to-back) single week experiments should be treated differently from a long sequence of two week experiments (indeed, the two are the same in every respect except whether some of the coins are tossed in advance), or why (ii) a long sequence of back-to-back single week experiments should be treated differently from just one single week experiment.
(1) You’re right, I got the numbers wrong. Thanks.
(2) If she knows she is somewhere along a two week path, the probabilities are different than if she knows she is somewhere along a one week path. She’s conditioning on different information in the two cases.
Well, you do have to specify whether the subject knows in advance that the experiment is going on for two weeks, or if they’re separate experiments—it changes the subject’s knowledge about what’s going on. Though I’m not sure whether anyone thinks that makes much of a difference.
I’d be interested in your feedback on this and this, which is as close as I can get to a formalization of the original Sleeping Beauty problem, and (after all) does point to 1⁄3 as the answer.
You’re giving a very weak argument. AlephNeil is challenging how your math should work out here. Whether we’re talking about “the sleeping beauty problem” is not entirely relevant.
To an outside observer in the first scenario, the probability of a particular awakening being morning, picked at random, is 6⁄7.
To an outside observer in the second scenario, the probability of a particular awakening being morning is 1 in the tails case and 0 in the heads case, and each case is equally likely, so the probability of a particular awakening being morning is 1⁄2.
So adding the coin flip does change things about the scenario if you’re an outside observer, so I would not be surprised to find it changes things for the subject as well.
Maybe my previous reply didn’t really ‘defuse’ yours. I have to admit your objection was compelling—a good intuition pump if nothing else.
But anyway, moving from ‘intuition pumps’ to hopefully more rigorous arguments, I do have this up my sleeve.
(Edit) Looking back at the argument I linked to, I think I can reformulate it much more straightforwardly:
Consider the original Sleeping Beauty problem.
Suppose we fix a pair of symbols {alpha, beta} and say that with probability 1⁄2, alpha = “Monday” and beta = “Tuesday”, and with probability 1⁄2 alpha = “Tuesday” and beta = “Monday”. (These events are independent of the ‘coin toss’ described in the original problem.)
Sleeping beauty doesn’t know which symbol corresponds to which day. Whenever she is woken, she is shown the symbol corresponding to which day it is. Suppose she sees alpha—then she can reason as follows:
If the coin was heads then my probability of being woken on day alpha was 1⁄2. If the coin was tails then my probability of being woken on day alpha was 1. I know that I have been woken on day alpha (and this is my only new information). Therefore, by Bayes’ theorem, the probability that the coin was heads is 1⁄3.
(And then the final step in the argument is to say “of course it couldn’t possibly make any difference whether an ‘alpha or beta’ symbol was visible in the room.”)
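The Bayes step in this argument can be checked by enumerating the coin together with the random labelling of days. A minimal sketch, assuming heads means a Monday-only awakening as in the original problem; the names below are invented for the example.

```python
from fractions import Fraction
from itertools import product

# Days on which Beauty is woken, by coin outcome.
WAKE_DAYS = {"H": {"Monday"}, "T": {"Monday", "Tuesday"}}

heads_and_alpha = any_alpha = Fraction(0)
for coin, alpha_day in product("HT", ["Monday", "Tuesday"]):
    weight = Fraction(1, 4)  # coin and labelling are independent and fair
    if alpha_day in WAKE_DAYS[coin]:  # she is woken on the day labelled alpha
        any_alpha += weight
        if coin == "H":
            heads_and_alpha += weight

posterior = heads_and_alpha / any_alpha
print(posterior)  # 1/3
```

Here "woken on day alpha" happens at most once per run, so it is an ordinary event with a probability of 3⁄4, unlike the bare "awakening".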
Now, over the course of these debates I’ve gradually become more convinced that those arguing that the standard, intuitive notion of probability becomes ambiguous in cases like this are correct, so that the problem has no definitive solution. This makes me a little suspicious of the argument above—surely the 1/2-er should be able to write something equally “rigorous”.
In my use of the words ‘every week’ I am implicitly—I take that back, I am explicitly supposing that every week the procedure is repeated.
So we would obtain an indefinitely long sequence of awakenings, of which 1⁄7 are in the evening and 6⁄7 in the morning.
For any such finite sequence of awakenings, there would (when viewed from the outside) be a 50% chance for a particular week of waking up in an evening, and a 50% chance for a particular week of waking up in the mornings—you can then assign a uniform distribution for particular weeks, getting a 1⁄6 probability of a particular morning in a tails week. If you pick an awakening randomly on that distribution, you have a 1⁄2 probability it’s an evening and a 1⁄12 probability it’s any particular morning (ETA: out of the week).
POLL
There will be six replies to this post. Two will be answers to the question “What is the best answer to the Sleeping Beauty problem?”. Two more will be answers to the question “Can the Sleeping Beauty problem, as it is written in this post, be reasonably interpreted as returning either 1⁄3 or 1⁄2?”. Two more comments will be karma dumps.
Question #1: One-third
Someone is vote-rigging. I voted this up and it is currently at 0. Presumably the intention is that you vote up the answer you believe is correct and do not vote down the answer you believe is wrong; otherwise this poll will stay at zero for both answers.
I saw it go up and then back down and hoped it was just someone misclicking. Shameful. Luckily, the person is also stupid: if you wait for there to be a lot of votes no one will notice. Everyone should consider this answer to have one more vote than it does.
The karma going up, and then back down could—in theory—represent a withdrawn vote, or someone changing their mind.
Could be. I figured not meaning to vote was more likely than a mind-change in the span of like ten seconds on a math problem we’ve been talking about for days.
Sure. What I meant was that there are other explanations—and so not necessarily any need for blame.
The same thing happened to me the first time I voted this up and this down—neither vote seemed to register. I attributed it to lag or a software error eliminating my votes. Reloading the page and revoting has made my votes stick, so it may work for you as well.
That doesn’t appear to be the problem for me—the Vote Up link is bolded and if I click it to remove the upvote the score changes to −1.
You’re right; that’s distinct from what happened to me. Neither of my voting links were bold after I voted the first time and refreshed.
Question #2: The question is not ambiguous. The answer is cut and dry.
Question #2: The question can reasonably be interpreted to yield either answer.
Question #1: One-half
I can’t vote of course, but I say 1⁄2.
Karma dump for question #2
Karma dump for question #1
Somewhat off-topic, but it should broaden our view of the past to know that people were thinking like this long ago: An essay from 1872 by a French political radical arguing that in an infinite universe, each of our lives is infinitely duplicated on other worlds, and so we are all effectively immortal.
This is expected value: the expected number of wakenings per coin flip is 1.5.
Expected value, the probability of heads, the probability of heads given an awakening are all well-defined things with well-defined numbers for this problem. While I understand needing to develop novel methods for ‘subjective observer mathematics’ for these types of problems, I think it would be useful to depart from these known elements.
Even more importantly, if you’re going to discuss at length whether the answer is 1⁄2 or 1⁄3, you need to define more carefully what the question is. My hunch is that the solution to a theory for these types of problems would be to rigorously formalize what is meant by (the subjective) “credence in heads”.
There’s a lot of terminology in this article that I simply don’t understand: what are the “continuity problems” mentioned in the first paragraph? And this sentence means almost nothing to me: “The source of confusion appears to be based on the distinction between the probability of an observer and the expectation number of observers, with the former not being a linear function of problem definitions.”
It’s possible that some of the other commenters who are having trouble with this article are in the same position as me, but are too polite to say so. Without a common understanding, how can we debate this effectively? So it would be useful if you could point to an introductory text that explains your terminology.
With the above caveat in mind, feel free to reject what follows as the product of an ignoramus.
My own guess as to what’s going on is that the Sleeping Beauty problem drives a wedge between two ways of thinking about subjective probability that are normally the same. One way to think about “credence” is that it’s the subjective Bayesian probability defined by P(H|D) = P(D|H) P(H) / P(D), where H is our hypothesis, and D is some data. A second way to think about “credence” is in terms of expectation or betting: what odds would constitute a fair bet (zero expectation) in a given situation?
Many examples of probabilistic reasoning are set up so that these two notions give identical results. But there’s no fundamental reason why they should be the same. If I ask you to bet on the outcome of a coin toss, but tell you that your payout (if you win) will be doubled if the coin lands heads, then you can easily calculate that your expectation will be zero if the coin has a 1⁄3 chance of landing heads. But that zero-expectation result doesn’t affect your subjective Bayesian probability of the coin landing heads.
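To make the arithmetic of that bet concrete, here is a minimal sketch. It assumes the bet is on heads at unit stake, so a win pays 2 and a loss costs 1; the function name and the stake sizes are illustrative and not part of the original example.

```python
from fractions import Fraction

def expected_gain(p_heads, win_if_heads=2, loss_if_tails=1):
    """Expected gain of a bet on heads that pays `win_if_heads` on heads
    and loses `loss_if_tails` on tails (stake sizes are assumptions)."""
    return p_heads * win_if_heads - (1 - p_heads) * loss_if_tails

gain_fair_coin = expected_gain(Fraction(1, 2))   # what a fair coin actually gives
gain_break_even = expected_gain(Fraction(1, 3))  # the zero-expectation point
print(gain_fair_coin, gain_break_even)
```

The zero-expectation point sits at 1⁄3 even though nothing about the coin itself has changed, which is the wedge described above.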
I’ve chosen this example because it is analogous to the Sleeping Beauty problem, except that the wedge that separates zero-expectation from the subjective Bayesian probability is completely obvious and trivial. The genius of the Sleeping Beauty problem is the way in which it hides the wedge.
My view is what the Sleeping Beauty problem shows is that you can’t naively use the subjective Bayesian probability to compute expectations and make bets, if your number of opportunities to make the bet is itself conditioned on the result of the bet. (I dunno if anything interesting follows from that, though, other than a bit of caution when doing probability in an anthropic setting.)
The continuity problem is that the 1⁄2 answer is independent of the ratio of the expected number of wakings in the two branches of the experiment, unless the ratio is 0 (or infinite), at which point special-case logic is invoked to prevent the trivially absurd claim that the credence of Heads is 1⁄2 when you are never woken under Heads.
If you are put through multiple sub-experiments in series, or probabilistically through some element of a set of sub-experiments, then the expected number of times you are woken is linearly dependent on the distribution of sub-experiments. The probability that you are ever woken is not.
So the problem is that it’s not immediately clear what D should be. If D is split by total probability to be Heads or Tails, and the numbers worked separately in both cases, then to get 1⁄2 requires odd conditional probabilities, but 1⁄3 does not. If you don’t split D, and calculate back from 1⁄3, you get 3⁄2 as the “probability” of D. It isn’t. What’s happened is closer to E(H|D) = E(D|H) E(H) / E(D), over one run of the experiment, and this yields 1⁄3 immediately.
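A small sketch of the two calculations, in case it helps. It assumes one waking under Heads and two under Tails, as in the standard problem; the variable names are mine.

```python
from fractions import Fraction

P_HEADS = Fraction(1, 2)
WAKINGS = {"heads": 1, "tails": 2}  # wakings per run in each branch

# Expected wakings per run: the 3/2 that is an expectation, not a probability.
expected_wakings = P_HEADS * WAKINGS["heads"] + (1 - P_HEADS) * WAKINGS["tails"]

# Ratio of expectations: expected heads-wakings over expected wakings.
heads_share_of_wakings = P_HEADS * WAKINGS["heads"] / expected_wakings

# Treating "woken at least once" as an ordinary event (set, not multiset).
p_woken_at_all = (P_HEADS * min(1, WAKINGS["heads"])
                  + (1 - P_HEADS) * min(1, WAKINGS["tails"]))
heads_given_woken_at_all = P_HEADS * min(1, WAKINGS["heads"]) / p_woken_at_all

print(expected_wakings, heads_share_of_wakings,
      p_woken_at_all, heads_given_woken_at_all)
```

Changing the Tails entry of `WAKINGS` moves the first ratio but leaves the last one at 1⁄2 until the Heads entry drops to zero, which is the continuity issue.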
The issue is that certain values of “D” occur multiple times in some branches, and allowing those scenarios to be double counted leads to oddness. I second the observation that caution is generally required.
Why is this a problem? I’m perfectly comfortable with that property, since you really just have one random variable in each arm. You can call them different days of the week, but with no new information they are all just the same thing.
By D do you mean W?
Is this how you came up with the 1⁄3 solution? If so, I think it requires more explanation. Such as what D is precisely.
The next clause of the sentence is the problem.
The problem is special casing out the absurdity, and thus getting credences that are discontinuous in the ratio. On the other hand, you seem to take 1⁄21 in PSB (i.e. you do let it depend on the ratio) but deviate from 1⁄21 when multiple runs of PSB aggregate, which is not what I had expected...
D was used in the comment I was replying to as an “event” that was studiously avoiding being W.
http://lesswrong.com/lw/28u/conditioning_on_observers/201l shows multiple ways I get the 1/3 solution; alternatively betting odds taken on awakening or the long run frequentist probability, they all cohere, and yield 1/3.
The problem as I see it with W is that it’s not a set of outcomes, it’s really a multiset. That’s fine in its way, but it gets confusing because it no longer bounds probabilities to [0,1]. Your approach is to quash multiple membership to get a set back.
Excellent post. Thanks!
I have no idea what this sentence is supposed to mean.
I’ve done the same work of formalization for PSB that I did on AlephNeil’s revival question and the joint distribution table does have the same structure, in particular I agree that 1⁄21 is the right answer in PSB. So I agree that this could be a promising start—but it’s unclear (and was also unclear with AlephNeil’s question) how we get from there to SB.
Given that structure for the joint probability distribution I get results which agree with neq1′s answer of 2⁄22 in the variant described here and don’t know where you’re getting 1⁄21 from.
I would like to strongly recommend that we settle this by writing out the full joint probability distribution table (for N more manageable than 20), this isn’t so hard to do as long as we’re playing with discrete variables with not too many values.
Maybe this will help:
Where does the 1⁄3 solution come from?
It comes from taking a ratio of expected counts. First, a motivating example.
Suppose people can fall into one of three categories. For example, we might create a categorical age variable, catage, that takes the value 1, 2 or 3 according to age band, with catage=3 if age>50.
Suppose we randomly select N people from the population. Let n1 be the number of people with catage=1, with n2 and n3 defined similarly. Given the sample size N, the random variables n1, n2 and n3 follow a multinomial distribution, with parameters (probabilities) p1, p2 and p3, respectively, where p1+p2+p3=1 and n1+n2+n3=N (i.e., 2 degrees of freedom).
The probability that catage=1, p1, is lim N-> infinity n1/(n1+n2+n3).
That concept applied to Sleeping Beauty
With the sleeping beauty problem, what we see is something similar. Imagine we ran the experiment N times. Let n1 be the number of times it was Monday&Heads, n2 the number of times it was Monday&tails, n3 the number of times it was Tuesday&tails.
The 1⁄3 solution makes the assumption that the probability of heads given an awakening is:
lim N-> infinity n1/(n1+n2+n3)
But, we have a problem here. N does not equal n1+n2+n3, it is equal to n1+n2. Also, the random variables n2 and n3 are identical. Thus, we could substitute:
lim N-> infinity n1/(n1+2*n2)
There are really just two random variables (n1 and n2) and 1 degree of freedom. In that case, we can think of n1 as coming from a Binomial distribution with sample size N=n1+n2 and probability p1. The probability of heads&Monday is then
lim N-> infinity n1/(n1+n2)=1/2.
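For anyone who wants to see the two ratios side by side, here is a rough simulation sketch. The function name, the seed and the run count are arbitrary choices of mine; n3 is set equal to n2 because every tails run produces both a Monday and a Tuesday awakening.

```python
import random

def count_ratios(num_runs, seed=0):
    """Simulate `num_runs` coin tosses and return the two candidate ratios."""
    rng = random.Random(seed)
    n1 = sum(rng.random() < 0.5 for _ in range(num_runs))  # Monday & heads
    n2 = num_runs - n1   # Monday & tails
    n3 = n2              # Tuesday & tails: the same runs counted again
    per_awakening = n1 / (n1 + n2 + n3)  # denominator counts awakenings
    per_toss = n1 / (n1 + n2)            # denominator counts coin tosses
    return per_awakening, per_toss

per_awakening, per_toss = count_ratios(100_000)
print(round(per_awakening, 3), round(per_toss, 3))
```

Both numbers are stable long-run ratios; the disagreement is over which of them deserves to be called Beauty's credence.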
Another example
If you don’t believe the above reasoning, consider another example.
Suppose half of the population are male and the other half are female. Also, suppose that only females have ovaries.
Suppose I record 3 variables: indicator that the person is male, indicator that the person is female, and indicator that the person has ovaries.
I sample N people, and get counts for those 3 variables of n1, n2 and n3.
Given that we recorded a variable for a randomly selected person, is the probability that they are male equal to
lim N->infinity n1/(n1+n2+n3) ?
Of course not. It’s lim N->infinity n1/(n1+n2).
Even though n2 and n3 are counts of something different, in a sense, they are really the same variable. Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.
When you do treat them as separate variables, you end up with nonsense (such as probabilities greater than 1).
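The same point as a tiny sketch, assuming a hypothetical sample of 1000 people that is exactly half male:

```python
from fractions import Fraction

# Hypothetical sample: even indices are male, odd indices are female.
people = [{"male": i % 2 == 0} for i in range(1000)]

n1 = sum(p["male"] for p in people)      # count of the male indicator
n2 = sum(not p["male"] for p in people)  # count of the female indicator
n3 = n2                                  # ovaries indicator: same people as n2

ratio_three_counts = Fraction(n1, n1 + n2 + n3)  # treats n3 as a separate variable
ratio_two_counts = Fraction(n1, n1 + n2)         # counts each person once
print(ratio_three_counts, ratio_two_counts)
```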
I’d quibble about calling it an assumption. The 1⁄3 solution notes that this is the ratio of observations upon awakening of heads to the total number of observations, which is one of the problematic facts about the experimental setup. The 1⁄3 solution assumes that this is relevant to what we should mean by “credence”, and makes an argument that this is a justification for the claim that Sleeping Beauty’s credence should be 1⁄3.
Your argument is, I take it, that these counts of observations are irrelevant, or at best biased. Something else should be counted, or should be counted differently. The disagreement seems to center on the denominator; it should count not awakenings, but coin-tosses. Then there is a difference in the definition of the relevant events and the probabilities that get calculated from them.
Thirders: An event is an awakening.
The question asks about # awakenings with heads / total awakenings.
This ratio is an estimate of a fraction that can be used to predict frequencies of something of interest.
Halfers: An event is a coin-toss.
The question asks about # tosses with heads / total tosses.
This ratio is an estimate of a fraction which is universally agreed to be a probability, and can be used to predict frequencies of something of interest.
Did I get that right? Is this a fair description?
I think a key difference between halfers and thirders is that for thirders, the occurrence of an awakening constitutes evidence of the current state of the system that’s being asked about—whether the coin shows heads or tails, because the frequency with which the state of the system is asked about (or, equivalently, an observation is made) is influenced by the current state of the system. To ward off certain objections, it is of no consequence whether this influence is deterministic, probabilistic or mixed in nature, the mere fact that it exists can and should be exploited. I don’t think there’s disagreement that it exists, but there is over how it’s relevant.
Halfers deny that any new evidence becomes available on awakening, because the operation of the process is completely known ahead of time. (Alternatively, if any new evidence could be said to become available, it cannot be exploited.) From what I can tell, and my understanding is surely imperfect, there is some kind of cognitive dissonance about what kinds of things can constitute evidence in some epistemological theory, such that drawing a distinction between the actual occurrence of an event and the knowledge that at least one such event will surely occur is illegitimate for halfers. Is this a fair description?
That’s as may be, but it doesn’t help Sleeping Beauty in her quandary. If you think this example helps to prove your point, I think it helps to prove the opposite. Although she knows, in this variation, that a randomly selected person will be tested, the random person selection process is not accessible to her, only the opportunity to know that one of three possible test results has been collected. She knows very well, given a randomly selected person (resp. a coin toss), what the probability they are male is (resp. the given coin toss came Heads). She isn’t being asked about that conditional probability. (Or maybe you think she is? Please clarify.) To follow your analogy, upon being awakened, she’s informed that a test result has been collected from an unknown person, and now, given that a test result has been collected, what are the chances it came from a male?
Clearly the selection process for asking Sleeping Beauty questions is biased. If bias had not been introduced by an extra awakening on Tuesday, the problem would collapse into triviality. The puzzle asks how this sampling bias should affect Sleeping Beauty’s calculations of what to answer on awakening, if at all. One of the reasons for doing statistical analysis of sampling schemes is to quantify how the mechanism that’s introducing bias changes the expected values of observations. In the SB case, the biased selection process is a mixture of random and deterministic mechanisms. Untangling the random from the deterministic parts is difficult enough for the participants in this discussion—they can’t even agree on a forking path diagram! Untangling it for Sleeping Beauty while she’s in the experiment is epistemically impossible. She has no basis whatsoever inside the game for saying, “this one is randomly different from the last one” versus “this one is deterministically identical to the last one, therefore this one doesn’t count.”
The same considerations apply to the case of the cancer test. Let me elaborate on your scenario to see if I understand it, and let me know if I’m mischaracterizing the test protocol in any material way. There is a test for a disease condition. Every person knows they have a 50% chance going in of testing positive for the disease. We’ll stipulate that the repeatability of the test is perfect, though in real life this is achieved only within epsilon of certainty. (Btw, here’s where the continuity argument enters in: how crucial is the assumption of absolute certainty versus near certainty? What hinges on that?) In this protocol, if the initial test result is positive, then the test is repeated k times (k=2 or 10, or whatever you deem necessary), either with a new sample or from an aliquot of the original sample, I don’t think it matters which. Here the repetition is because of the obstinacy of the head of the test lab and their predilection for amnesia drugs; in real life the reasons would be something like the very high cost in anguish and/or money of a false positive, however unlikely. You, as a recorder of test results, see a certain number of test samples come through the lab. The identities of the samples are encrypted, so your epistemic state with regard to any particular test result is identical to that for any other test sample and its result.
So now the question comes down to this: upon any particular awakening, how is the test subject’s epistemic state at any particular awakening significantly different from the lab tech’s epistemic state regarding any particular test sample? There is a one-to-one correspondence between test samples being evaluated and questions to the patient about their prognosis. Should they give the same answer, or is there a reason why they should give different answers? Just as with the patient, the lab tech knows that any randomly chosen individual has a 50% chance of giving a positive test result, but does she give the same answer to that question as to a different question: given that she has a particular sample in her hands, what is the probability that the person it belongs to will test positive? She knows that she has k times as many samples in her lab that will test positive than otherwise, but she has no way of knowing whether the sample in her hands is an initial sample or a replicate. It seems to me that halfers might be claiming these two questions are the same question, while thirders claim that they are different questions with different answers. Is this a fair description? If not, please clarify.
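As a sketch of the lab tech's side of the question: assuming a positive patient contributes k samples in total and a negative patient contributes one (my reading of "k times as many samples"), the share of samples in the lab that belong to positive patients can be computed as below. The function name is illustrative.

```python
from fractions import Fraction

def positive_share_of_samples(k, p_positive=Fraction(1, 2)):
    """Fraction of lab samples that belong to positive patients, assuming a
    positive patient yields k samples and a negative patient yields one."""
    positive_samples = p_positive * k
    negative_samples = (1 - p_positive) * 1
    return positive_samples / (positive_samples + negative_samples)

for k in (1, 2, 10):
    print(k, positive_share_of_samples(k))
```

With k=1 there is no repetition and the share equals the per-patient 50%; with k=2 it is the Sleeping Beauty ratio seen from the Tails side.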
What you say is true for any outside observers, and for Sleeping Beauty after the experiment is over and the logbooks analyzed. But while Sleeping Beauty is in the experiment, this option is simply not available to her. The scenario has been carefully constructed to make this so, that’s what makes it an interesting problem. The whole point of the amnesia drug in the SB setup (or downloadable avatars, or forking universes, random passersby, whatever) is that she has NO justification nor even a method for NOT treating any of her awakenings as separate variables, because the information that could allow her to do this is unavailable to her. By construction—and this is the defining feature of Sleeping Beauty—all Sleeping Beauty’s awakenings are epistemically indistinguishable. She has no choice but to treat them all identically.
This phenomenon is a common occurrence in queueing systems where there’s a very definite and well-understood difference between omniscient “outside observers” and epistemically indistinguishable “arriving customers”, who can have different values for the probability of observing the system in state X, where the system is executing a well-defined random process, or even a combination random-deterministic process.
Thanks for your detailed response. I’ll make a few comments now, and address more of it later (short on time).
No, I was just saying that this, lim N-> infinity n1/(n1+n2+n3), is not actually a probability in the sleeping beauty case.
No, I wouldn’t say that. My argument is that you should use probability laws to get the answer. If you take ratios of expected counts, well, you have to show that what you get is actually a probability.
I definitely disagree with your bullet points about what halfers think.
I said: “Just like Beauty waking up on Monday is the same as Beauty waking up on Tuesday. There is no justification for treating them as separate variables.”
You disagreed, and said:
Hm, I think that is what I’m saying. She does have to treat them all identically. They are the same variable. That’s why she has to say the same thing on Monday and Tuesday. That’s why an awakening contains no new info. If she had new evidence at an awakening, she’d give different answers under heads and tails.
I maintain that it is. I can guarantee you that it is. What obstacle do you see to accepting that? You’ve made noises that this is because the counts are correlated, but I haven’t seen any argument for this beyond bare assertion. Do you want to claim it is impossible for some reason, or are you just saying you haven’t seen a persuasive argument yet?
What would you require for proof? If I could show you a Markov chain whose behavior is isomorphic to iterated Sleeping Beauty, would that convince you?
I also am not sure what you mean when you say “use probability laws”. Is there a failure to comport with the Kolmogorov axioms? Is there a problem with the definition of the events? Do you mean Bayes’ Theorem, or some other law(s)? I also am deeply suspicious of the phrase “get the answer”. I will have no idea what this could mean until we can eliminate ambiguity about what the question is (there seems to be a lot of that going around), or what class of questions you’ll admit as legitimate.
Up to this point, I see we are actually in strenuous agreement on this aspect, so I can stop belaboring it.
I don’t mean to claim that as soon as Beauty awakes, new evidence comes to light that she can add to her store of bits in additive fashion, and thereby update her credence from 1⁄2 to 1⁄3 along the way. If this is the only kind of evidence that your theory of Bayesian updating will acknowledge, then it is too restrictive. Since Beauty is apprised of all the relevant details of the experimental process on Sunday evening, she can (and should) use the fact that the predicted frequency of awakenings into a reset epistemic state is dependent on the state of the coin toss to change the credence she reports on such awakenings from 1⁄2 to 1⁄3. She can tell you this on Sunday night, just as I can tell you now, before any of us enter into any such experimental procedure. So her prediction about what she should answer on an awakening does not change from Sunday evening to Monday morning.
The key pieces of information she uses to arrive at this revised estimate are:
That the questions will be asked in a reset epistemic state. This requires her to give the same answer on all awakenings.
That the frequency of awakenings is dependent in a specific way on the result of the coin toss. This requires her to update the credence she’ll report on awakenings from 1⁄2 to 1⁄3.
At this point, it is just assertion that it’s not a probability. I have reasons for believing it’s not one, at least, not the probability that people think it is. I’ve explained some of that reasoning.
I think it’s reasonable to look at a large sample ratio of counts (or ratio of expected counts). The best way to do that, in my opinion, is with independent replications of awakenings (that reflect all possibilities at an awakening). I probably haven’t worded this well, but consider the following two approaches. For simplicity, let’s say we wanted to do this (I’m being vague here) 1000 times.
1. Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings. But… whatever the total number of awakenings is, they are not independent. For example, on the first awakening it could be either heads or tails. On the second awakening, it could only be heads if it was heads on the first awakening. So, Beauty’s options on awakening #2 are (possibly) different from her options on awakening #1. We do not have 2 replicates of the same situation. This approach will give you the correct ratio of counts in the long run (for example, we do expect the # of heads & Monday to equal the # of tails and Monday and the # of tails and Tuesday).
2. Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective, it could be Monday or Tuesday, it could be heads or tails. She knows that it was a fair coin. She knows that if she’s awake it’s definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads. She knows that 50% of coin tosses would end up tails, so we assign 0.5 to tails, which implies 0.25 to tails&Monday and 0.25 to tails&Tuesday. If we generate observations from this 1000 times, we’ll get 1000 awakenings. We’ll end up with heads 50% of the time.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she should have on an awakening. In 1, we are replicating the entire experiment, with the double counting on tails.
In 1, people are using these ratios of expected counts to get the 1⁄3 answer. 1⁄3 is the correct answer to the question about the long-run frequencies of awakenings preceded by heads to awakenings preceded by tails. But I do not think it is the answer to the question about her credence of heads on an awakening.
In 2, the joint probabilities are determined ahead of time based on what we know about the experiment.
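The two approaches can be sketched as two small simulations. This is only an illustration; the function names, seed and sample size are my own choices, and approach 2 simply takes the 0.5 / 0.25 / 0.25 joint distribution as given.

```python
import random

def replicate_experiments(num_runs, rng):
    """Approach 1: rerun the whole experiment; tails contributes two awakenings."""
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(num_runs):
        if rng.random() < 0.5:   # heads: one awakening
            heads_awakenings += 1
            total_awakenings += 1
        else:                    # tails: Monday and Tuesday awakenings
            total_awakenings += 2
    return heads_awakenings / total_awakenings

def replicate_awakening_states(num_draws, rng):
    """Approach 2: draw single awakenings from the fixed joint distribution."""
    states = ["Mon&H", "Mon&T", "Tue&T"]
    draws = rng.choices(states, weights=[0.5, 0.25, 0.25], k=num_draws)
    return draws.count("Mon&H") / num_draws

rng = random.Random(0)
print(round(replicate_experiments(100_000, rng), 3))
print(round(replicate_awakening_states(100_000, rng), 3))
```

The simulation cannot settle which approach is right, since approach 2 returns whatever joint distribution is fed into it; it only shows that the two procedures answer different questions.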
Let n2 and n3 be counts, in repeated trials, of tails&Monday and tails&Tuesday, respectively. You will of course see that n2=n3. They are the same random variable. tails&Monday and tails&Tuesday are the same. It’s like what Jack said about types and tokens. It’s like Vladimir_Nesov said:
You said:
I don’t think it matters if she has the knowledge before the experiment or not. What matters is if she has new information about the likelihood of heads to update on. If she did, we would expect her accuracy to improve. So, for example, if she starts out believing that heads has probability 1⁄2, but learns something about the coin toss, her probability might go up a little if heads and down a little if tails. Suppose, for example, she is informed of a variable X. If P(heads|X)=P(tails|X), then why is she updating at all? Meaning, why is P(heads)=/=P(heads|X)? This would be unusual. It seems to me that the only reason she changes is because she knows she’d be essentially ‘betting’ twice on tails, but that really is distinct from credence for tails.
Yet one more variant. On my view it’s structurally and hence statistically equivalent to Iterated Sleeping Beauty, and I present an argument that it is. This one has the advantage that it does not rely on any science fictional technology. I’m interested to see if anyone can find good reasons why it’s not equivalent.
The Iterated Sleeping Beauty problem (ISB) is the original Standard Sleeping Beauty (SSB) problem repeated a large number N of times. People always seem to want to do this anyway with all the variations, to use the Law of Large Numbers to gain insight into what they should do in the single shot case.
The Setup
As before, Sleeping Beauty is fully apprised of all the details ahead of time.
The experiment is run for N consecutive days (N is a large number).
At midnight 24 hours prior to the start of the experiment, a fair coin is tossed.
On every subsequent night, if the coin shows Heads, it is tossed again; if it shows Tails, it is turned over to show Heads.
(This process is illustrated by a discrete-time Markov chain with a transition matrix P and a row state vector x, with consecutive state transitions computed as x * P^k.)
Each morning when Sleeping Beauty awakes, she is asked each of the following questions:
1) “What is your credence that the most recent coin toss landed Heads?”
2) “What is your credence that the coin was tossed last night?”
3) “What is your credence that the coin is showing Heads now?”
The first question is the equivalent of the question that is asked in the Standard Sleeping Beauty problem. The second question corresponds to the question “what is your credence that today is Monday?” (which should also be asked and analyzed in any treatment of the Standard Sleeping Beauty problem.)
Note: in this setup, 3) is different than 1) only because of the operation of turning the coin over instead of tossing it. This is just a perhaps too clever mechanism to count down the days (awakenings, actually) to the point when the coin should be tossed again. It may very well make a better example if we never touch the coin except to toss it, and use some other deterministic countdown mechanism to count repeated awakenings per coin toss. That allows easier generalization to the case where the number of days to awaken when Tails is greater than 2. It also makes 3) directly equivalent to the standard SB question, and also 1) and 3) have the same answers. You decide which mechanism is easier to grasp from a didactic point of view, and analyze that one.
After that, Beauty goes on about her daily routine, takes no amnesia drugs, sedulously avoids all matter duplicators and transhuman uploaders, and otherwise lives a normal life, on one condition: she is not allowed to examine the coin or discover its state (or the countdown timer) until the experiment is over.
Analysis
Q1: How should Beauty answer?
Q2: How is this scenario similar in key respects to the SSB/ISB scenario?
Q3: How does this scenario differ in key respects from the SSB/ISB scenario?
Q4: How would those differences if any make a difference to how Beauty should answer?
My answers:
Q1: Her credence that the most recent coin toss landed Heads should be 1⁄3. Her credence that the coin was tossed last night should be 1⁄3. Her credence that the coin shows Heads should be 2⁄3. (Her credence that the coin shows Heads should be 1⁄3 if we never turn it over, only toss, and 1/(K+1) if the countdown timer counts K awakenings per Tail toss.)
Q2: Note that Beauty’s epistemic state regarding the state of the coin, or whether it was tossed the previous midnight, is exactly the same on every morning, but without the use of drugs or other alien technology. She awakens and is asked the questions once every time the coin toss lands Heads, and twice every time it lands tails. In Standard Sleeping Beauty, her epistemic state is reset by the amnesia drugs. In this setup, her epistemic state never needs to be reset because it never changes, simply because she never receives any new information that could change it, including the knowledge of when the coin has been tossed to start a new cycle.
Q3: In ISB, a new experimental cycle is initiated at fixed times—Monday (or Sunday midnight). Here the start of a new “cycle” occurs with random timing. The question arises, does the difference in the speed of time passing make any difference to the moments of awakening when the question is asked? Changing labels from “Monday” and “Tuesday” to “First Day After Coin Toss” and “Second Day After Coin Toss” respectively makes no structural change to the operation of the process. Discrete-time Markov chains have no timing, they have only sequence.
In the standard ISB, there seems to be a natural unit of replication: the coin toss on Sunday night followed by whatever happens through the rest of the week. Here, that unit doesn’t seem so prominent, though it still exists as a renewal point of the chain. In a recurrent Markov chain, the natural unit of replication seems to be the state transition. Picking a renewal point is also an option, but only as a matter of convenience of calculation; it doesn’t change the analysis.
Q4: I don’t see how. The events, and the processes which drive their occurrence haven’t changed that I can see, just our perspective in looking at them. What am I overlooking?
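For readers who want to check the equilibrium behaviour of the coin process in the setup, here is a minimal sketch. The three-state labelling (A, B, C) is my own assumption for illustration and is not taken from the setup text: it splits "coin shows Heads" by whether the Heads came from a toss or from turning the coin over.

```python
from fractions import Fraction

half = Fraction(1, 2)
# Morning states (illustrative labels):
#   A: coin shows Heads because last night's toss landed Heads
#   B: coin shows Tails because last night's toss landed Tails
#   C: coin shows Heads because it was turned over from Tails
P = [
    [half, half, 0],  # A: Heads showing, so the coin is tossed again tonight
    [0, 0, 1],        # B: Tails showing, so the coin is turned over tonight
    [half, half, 0],  # C: Heads showing, so the coin is tossed again tonight
]

def step(x, P):
    """One transition of the row vector x, i.e. x * P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]

x = [half, half, 0]   # first morning: a fair toss has just happened
for _ in range(50):   # x * P^50
    x = step(x, P)

pi = [Fraction(1, 3)] * 3          # candidate equilibrium distribution
is_stationary = step(pi, P) == pi  # exact check that pi * P == pi
p_last_toss_heads = pi[0]          # only state A follows a Heads toss
p_shows_heads = pi[0] + pi[2]      # states A and C both show Heads
print([round(float(v), 6) for v in x], is_stationary,
      p_last_toss_heads, p_shows_heads)
```

Under this labelling the long-run fraction of mornings on which the most recent toss landed Heads is 1⁄3, and the fraction on which the coin shows Heads is 2⁄3.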
Iteration
I didn’t tell you yet how N is determined and how the experiment is terminated. Frankly, I don’t think it matters all that much as N gets large, but let’s remove all ambiguity.
Case A: N is a fixed large number. The experiment is terminated on the first night on which the coin shows Heads, after the Nth night.
Case B: N is not fixed in advance, but is guaranteed to be larger than some other large fixed number N’, such that the coin has been tossed at least N’ times. Once N’ tosses have been counted, the experiment is terminated on any following night on which the coin shows Heads, at the whim of the Lab Director.
Q5: If N (or N’) is large enough, does the difference between Case A and B make a difference to Beauty’s credence? (To help sharpen your answer, consider Case C: Beauty dies of natural causes before the experiment terminates.)
Note that in view of the discussion under Q3 above, we are picking some particular state in the transition diagram and thinking about recurrence to and from that state. We could pick any other state too, and the analysis wouldn’t change in any significant way. It seems more informative (to me at any rate) to think of this as an ongoing process that converges to stable behavior at equilibrium.
Extra Credit:
This gets right to the heart of what a probability could mean, what things can count as probabilities, and why we care about Sleeping Beauty’s credence.
Suppose Beauty is sent daily reports showing cumulative counts of the nightly heads/tails observations. The reports are sufficiently old as not to give any information about the current state of the coin or when it was last tossed. (E.g., the data in the report are from at least two coin tosses ago.) Therefore Beauty’s epistemic state about the current state of the coin always remains in its initial/reset state, with the following exception. Discuss how Beauty could use this data to--
corroborate that the coin is in fact fair as she has been told.
update her credences, in case she accrues evidence that shows the coin is not fair.
For me this is the main attraction of this particular model of the Sleeping Beauty setup, so I’m very interested in any possible reasons why it’s not equivalent.
Sorry I was slow to respond .. busy with other things
My answers:
Q1: I agree with you: 1⁄3, 1⁄3, 2⁄3
Q2. ISB is similar to SSB as follows: fair coin; woken up twice if tails, once if heads; epistemic state reset each day
Q3. ISB is different from SSB as follows: more than one coin toss; same number of interviews regardless of result of coin toss
Q4. It makes a big difference. She has different information to condition on. On a given coin flip, the probability of heads is 1⁄2. But, if it is tails we skip a day before flipping again. Once she has been woken up a large number of times, Beauty can easily calculate how likely it is that heads was the most recent result of a coin flip. In SSB, she cannot use the same reasoning. In SSB, Tuesday&heads doesn’t exist, for example.
Consider 3 variations of SSB:
Same as SSB except if heads, she is interviewed on Monday, and then the coin is turned over to tails and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday) or (tails on Monday, tails on Tuesday). Each sequence has a 50% probability, and she should think of the days within a sequence as being equally likely. She’s asked about the current state of the coin. She should answer P(H)=1/4.
Same as SSB except if heads, she is interviewed on Monday, and then the coin is flipped again and she is interviewed on Tuesday. There is amnesia and all of that. So, it’s either the sequence (heads on Monday, tails on Tuesday), (heads on Monday, heads on Tuesday) or (tails on Monday, tails on Tuesday). The first 2 sequences have a 25% chance each and the last one has a 50% chance. When asked about the current state of the coin, she should say P(H)=3/8.
The 1⁄2 solution to SSB results from similar reasoning. 50% chance for the sequence (Monday and heads). 50% chance for the sequence (Monday and tails, Tuesday and tails). P(H)=1/2
If you apply this kind of reasoning to ISB, where we are thinking of randomly selected day after a lot of time has passed, you’ll get P(H)=1/3.
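As a check on the arithmetic in the three variations, here is a minimal sketch of that style of calculation: weight each Monday-Tuesday sequence by its probability, then treat the interview days within a sequence as equally likely. The function name and the way the sequences are encoded are illustrative assumptions, not anything from the original comment.

```python
from fractions import Fraction as F

def p_heads_showing(sequences):
    """sequences: list of (probability, [coin face shown at each interview])."""
    total = F(0)
    for prob, faces in sequences:
        # Within a sequence, every interview day is treated as equally likely.
        total += prob * F(faces.count("H"), len(faces))
    return total

# Variation 1: after Heads the coin is turned over to Tails for Tuesday.
variant_1 = [(F(1, 2), ["H", "T"]), (F(1, 2), ["T", "T"])]
# Variation 2: after Heads the coin is flipped again for Tuesday.
variant_2 = [(F(1, 4), ["H", "T"]), (F(1, 4), ["H", "H"]), (F(1, 2), ["T", "T"])]
# Standard Sleeping Beauty, read the same way.
standard = [(F(1, 2), ["H"]), (F(1, 2), ["T", "T"])]

print(p_heads_showing(variant_1))  # 1/4
print(p_heads_showing(variant_2))  # 3/8
print(p_heads_showing(standard))   # 1/2
```

The sketch only reproduces the "sequence first, then day" weighting; whether that weighting is the right one for SSB is exactly what the thread disputes.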
I’m struggling to see how ISB isn’t different from SSB in meaningful ways.
Perhaps this is beating a dead horse, but here goes. Regarding your two variants:
I agree. When iterated indefinitely, the Markov chain transition matrix is:
acting on state vector [ H1 H2 T1 T2 ], where H,T are coin toss outcomes and 1,2 label Monday,Tuesday. This has probability eigenvector [ 1⁄4 1⁄4 1⁄4 1⁄4 ]; 3 out of 4 states show Tails (as opposed to the coin having been tossed Tails). By the way, we have unbiased sampling of the coin toss outcomes here.
If the Markov chain model isn’t persuasive, the alternative calculation is to look at the branching probability diagram
[http://entity.users.sonic.net/img/lesswrong/sbv1tree.png (SB variant 1)]
and compute the expected frequencies of letters in the result strings at each leaf on Wednesdays. This is
I agree. Monday-Tuesday sequences occur with the following probabilities:
Also, the Markov chain model for the iterated process agrees:
acting on state vector [ H1 H2 T1 T2 ] gives probability eigenvector [ 1⁄4 1⁄8 1⁄4 3⁄8 ]
Alternatively, use the branching probability diagram
[http://entity.users.sonic.net/img/lesswrong/sbv2tree.png (SB variant 2)]
to compute expected frequencies of letters in the result strings,
Because of the extra coin toss on Tuesday after Monday Heads, these are biased observations of coin tosses. (Are these credences?) But neither of these two variants is equivalent to Standard Sleeping Beauty or its iterated variants ISB and ICSB.
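The transition matrices are not reproduced in this thread, so the following is only a sketch with matrices reconstructed from the verbal descriptions of the two variants. Both matrices, the state labelling for the second variant (by the face showing on each day), and the helper name `stationary` are assumptions. Because both chains alternate strictly between Monday and Tuesday they are periodic, so the sketch solves the balance equations exactly rather than iterating.

```python
from fractions import Fraction as F

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 exactly by Gauss-Jordan elimination."""
    n = len(P)
    # Balance equations for the first n-1 states, then the normalisation row.
    A = [[F(P[j][i]) - (1 if i == j else 0) for j in range(n)] + [F(0)]
         for i in range(n - 1)]
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

h = F(1, 2)
# States in order [H1, H2, T1, T2]; each Tuesday is followed by a fresh toss.
# Variant 1 (assumed): H/T is the toss outcome; Monday always leads to Tuesday.
variant_1 = [[0, 1, 0, 0],
             [h, 0, h, 0],
             [0, 0, 0, 1],
             [h, 0, h, 0]]
# Variant 2 (assumed): H/T is the face showing; Heads-Monday is re-flipped.
variant_2 = [[0, h, 0, h],
             [h, 0, h, 0],
             [0, 0, 0, 1],
             [h, 0, h, 0]]

print(stationary(variant_1))  # all four entries equal 1/4
print(stationary(variant_2))  # 1/4, 1/8, 1/4, 3/8
```

Under these reconstructed matrices the stationary vectors agree with the eigenvectors quoted for the two variants.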
(Sigh). I don’t think your branching probability diagram is correct. I don’t know what other reasoning you are using. This is the diagram I have for Standard Sleeping Beauty
[http://entity.users.sonic.net/img/lesswrong/ssbtree.png (Standard SB)]
And this is how I use it, using exactly the same method as in the two examples above. With probability 1⁄2 the process accumulates 2 Tails observations per week, and with probability 1⁄2 accumulates 1 Heads observation. The expected number of observations per week is 1.5, the expected number of Heads observations per week is 0.5, the expected number of Tails observations is 1 per week.
Likewise when we record Monday/Tuesday observations per week instead of Heads/Tails, the expected number of Monday observations is 1, expected Tuesday observations 0.5, for a total of 1.5. But in both of your variants above, the expected number of Monday observations = expected number of Tuesday observations = 1.
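The expected counts per week quoted above can be tallied directly from the two branches of the standard diagram. This is a small sketch; the data layout and the helper `expected_count` are illustrative choices, not part of the original diagram.

```python
from fractions import Fraction as F

# Each branch: (probability, observations Beauty makes that week as (day, face)).
branches = [
    (F(1, 2), [("Mon", "H")]),
    (F(1, 2), [("Mon", "T"), ("Tue", "T")]),
]

def expected_count(predicate):
    """Expected number of observations per week that satisfy the predicate."""
    return sum(p * sum(1 for obs in observations if predicate(obs))
               for p, observations in branches)

total = expected_count(lambda obs: True)
heads = expected_count(lambda obs: obs[1] == "H")
tails = expected_count(lambda obs: obs[1] == "T")
monday = expected_count(lambda obs: obs[0] == "Mon")
tuesday = expected_count(lambda obs: obs[0] == "Tue")

print(total, heads, tails)   # 3/2 1/2 1
print(monday, tuesday)       # 1 1/2
print(heads / total)         # 1/3
```

The last line is the ratio of expected counts, which is the thirder reading; the sketch does not settle whether that ratio is the credence being asked for.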
Thanks for your response. I should have been clearer in my terminology. By “Iterated Sleeping Beauty” (ISB) I meant to name the variant that we here have been discussing for some time, that repeats the Standard Sleeping Beauty problem some number, say 1000, of times. In 1000 coin tosses over 1000 weeks, the expected number of Heads awakenings is 500 and the expected number of Tails awakenings is 1000. I have no catchy name for the variant I proposed, but I can make up an ugly one if nothing better comes to mind; it could be called Iterated Condensed Sleeping Beauty (ICSB). But I’ll assume you meant this particular variant of mine when you mention ISB.
You say
“More than one coin toss” is the iterated part. As far as I can see, and I’ve argued it a couple times now, there’s no essential difference between SSB and ISB, so I meant to draw a comparison between my variant and ISB.
“Same number of interviews regardless of result of coin toss” isn’t correct. Sorry if I was unclear in my description. Beauty is interviewed once per toss when Heads, twice when Tails. This is the same in ICSB as in Standard and Iterated Sleeping Beauty. Is there an important difference between Standard Sleeping Beauty and Iterated Sleeping Beauty, or is there an important difference between Iterated Sleeping Beauty and Iterated Condensed Sleeping Beauty?
We not only skip a day before tossing again, we interview on that day too! I see how over time Beauty gains evidence corroborating the fairness of the coin (that’s exactly my later rhetorical question), but assuming it’s a fair coin, and barring Type I errors, she’ll never see evidence to change her initial credence in that proposition. In view of this, can you explain how she can use this information to predict with better than initial accuracy the likelihood that Heads was the most recent outcome of the toss? I don’t see how.
After relabeling Monday and Tuesday to Day 1 and Day 2 following the coin toss, Tuesday&Heads (H2) exists in none of these variants. So what difference is there?
Good and well, but—are these legitimate credences? If not, why not? And if so, why aren’t they also in the following:
Standard Iterated Sleeping Beauty is isomorphic to the following Markov chain, which just subdivides the Tails state in my condensed variant into Day 1 and Day 2:
operating on row vector of states [ Heads&Day1 Tails&Day1 Tails&Day2 ], abbreviated to [ H1 T1 T2 ]
When I say isomorphic, I mean the distinct observable states of affairs are the same, and the possible histories of transitions from awakening to next awakening are governed by the same transition probabilities.
So either there’s a reason why my 2-state Markov chain correctly models my condensed variant that allows you to accept the 1⁄3 answers it computes, that doesn’t apply to the three-state Markov chain and its 1⁄3 answers (perhaps you came to those answers independently of my model), or else there’s some reason why the three-state Markov chain doesn’t correctly model the Iterated Sleeping Beauty process. Can you help me see where the difficulty may lie?
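The three-state matrix itself is missing from this thread, so here is a sketch with a matrix reconstructed from the description of Iterated Sleeping Beauty: a Heads awakening or a Tails Day 2 awakening is followed by a fresh fair toss, and Tails Day 1 is always followed by Tails Day 2. The matrix and the helper `step` are assumptions. This chain has a self-loop on H1, so it is aperiodic and plain repeated multiplication converges.

```python
# States in order [H1, T1, T2], i.e. Heads&Day1, Tails&Day1, Tails&Day2.
P = [
    [0.5, 0.5, 0.0],  # after Heads&Day1 a new toss decides the next awakening
    [0.0, 0.0, 1.0],  # Tails&Day1 is always followed by Tails&Day2
    [0.5, 0.5, 0.0],  # after Tails&Day2 a new toss decides the next awakening
]

def step(dist, P):
    """One awakening-to-awakening transition of a row vector of probabilities."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

dist = [1.0, 0.0, 0.0]  # start at a Heads awakening; the limit does not depend on this
for _ in range(200):
    dist = step(dist, P)

print([round(x, 6) for x in dist])  # approximately [0.333333, 0.333333, 0.333333]
```

Under this reconstruction each awakening state has long-run frequency 1⁄3, which is the figure the three-state model is said to compute.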
I assume you are referring to my variant, not what I’m calling Iterated Sleeping Beauty. If so, I’m kind of baffled by this statement, because under similarities, you just listed
fair coin
woken twice if Tails, once if Heads
epistemic state reset each day
With the emendation that 2) is per coin toss, and in 3) “each day” = “each awakening”, you have just listed three essential features that SSB, ISB and ICSB all have in common. It’s exactly those three things that define the SSB problem. I’m claiming that there aren’t any others. If you disagree, then please tell me what they are. Or if parts of my argument remain unclear, I can try to go into more detail.
Two ways to iterate the experiment:
and
This seems a distinction without a difference. The longer the iterated SB process continues, the less important is the distinction between counting tosses versus counting awakenings. This distinction is only about a stopping criterion, not about the convergent behavior of observations or coin tosses to expected values as it’s ongoing. Considered as an ongoing process of indefinite duration, the expected number of tosses and of observations of each type are well-defined, easily computed, and well-behaved with respect to each other. Over the long run, #awakenings accumulates 1.5 times more frequently than #tosses. Beauty is never more than two awakenings away from starting a new coin toss, so whether you choose to stop as soon as an awakening has completed or until you finish a coin-toss cycle, the relative perturbation in the statistics collected so far goes to zero. Briefly, there is no “natural” unit of replication independent of observer interest.
This would be an error. You are assigning a 50% probability to an observation (that it is Heads&Monday) without taking into account the bias that’s built in to the process for Beauty to make observations. Alternatively, if you are uncertain whether Monday is true or not—you know it might be Tuesday—then you should be uncertain that P(Heads)=P(Heads&Monday).
You the outside observer know the chance of observing that the coin lands Heads is 50%. You presumably know this because you have corroborated it through an unbiased observation process: look at the coin exactly once per toss. Once Beauty is put to sleep and awoken, she is no longer an outside observer, she is a participant in a biased observation process, so she should update her expectation about what her observation process will show. Different observation process, different observations, different likelihoods of what she can expect to see.
Of course, as a card-carrying thirder, I’m assuming that the question about credence is about what Beauty is likely to see upon awakening. That’s what the carefully constructed wording of the question suggests to me.
except that as we agreed, she’s not observing coin tosses, she’s observing biased samples of coin tosses. The connection between what she observes and the objective behavior of the coin is just what’s at issue here, so you can’t beg the question.
Agreed, but for this: it all depends on what you want credence to mean, and what it’s good for; see discussion below.
Let me uphold a distinction that’s continually skated over, but which is a crucial point of disagreement here. I think you’re confusing your evidence for the thing evidenced. And you are selectively filtering your evidence, which amounts to throwing away information. Tails&Monday and Tails&Tuesday are not the same; they are distinct observations of the same state of the coin, thus they are perfectly correlated in that regard. Aside from the coin, they observe distinct days of the week, and thus different states of affairs. By a state of affairs I mean the conjunction of all the observable properties of interest at the moment of observation.
The distinction between types and tokens is only relevant when you want to interpret your tokens as being about something else, their types, rather than about themselves. But types are carved out of observers’ interests in their significance, which are non-objective, observer-dependent if anything is. Their variety and fineness of distinction is potentially infinite. As I mentioned above, a state of affairs is a conjunction of observable properties of interest. This Boolean lattice has exactly one top: Everything, and unknown atoms if any at bottom. Where you choose to carve out a distinction between type and token is a matter of observer interest.
I’ll certainly agree it isn’t desirable, but oughtn’t isn’t the same as isn’t, and in the Sleeping Beauty problem we have no choice. Monday and Tuesday just are different elements in a sample space, by construction.
What you seem to be talking about is using evidence that observations provide to corroborate or update Beauty’s belief that the coin is in fact fair. Is that a reasonable take? But due to the epistemic reset between awakenings, there is never any usable input to this updating procedure. I’ve already stipulated this is impossible. This is precisely what the epistemic reset assumption is for. I thought we were getting off this merry-go-round.
Ok, I guess it depends on what you want the word “credence” to mean, and what you’re going to use it for. If you’re only interested in some updating process that digests incoming information-theoretic quanta, like you would get if you were trying to corroborate that the coin was indeed a fair one to within a certain standard error, you don’t have it here. That’s not Sleeping Beauty, that’s her faithful but silent, non-memory-impaired lab partner with the log book. If Beauty herself is to have any meaningful notion of credence in Heads, it’s pointless for it to be about whether the coin is indeed fair. That’s a separate question, which in this context is a boring thing to ask her about, because it’s trivially obvious: she’s already accepted the information going in that it is fair and she will never get new information from anywhere regarding that belief. And, while she’s undergoing the process of being awoken inside the experimental setup, a value of credence that’s not connected to her observations is not useful for any purpose that I can see, other than perhaps to maintain her membership in good standing in the Guild of Rational Bayesian Epistemologists. It doesn’t connect to her experience, it doesn’t predict frequencies of anything she has any access to, it’s gone completely metaphysical. Ok, what else is there to talk about? On my view, the only thing left is Sleeping Beauty’s phenomenology when awakened. On Bishop Berkeley’s view, that’s all you ever have.
Beauty gets usable, useful information (I guess it depends on what you want “information” to mean, too) once, on Sunday evening, and she never forgets it thereafter. This information is separate from, in addition to the information that the coin itself is fair. This other information allows her to make a more accurate prediction about the likelihood that, each time she is awoken, the coin is showing heads. Or whether it’s Monday or Tuesday. The information she receives is the details of the sampling process, which has been specifically constructed to give results that are biased with respect to the coin toss itself, and the day of the week. Directly after being informed of the structure of the sampling process, she knows it is biased and therefore ought to update her prediction about what relative frequencies per observation will be of each observable aspect of the possible state of affairs she’s awoken into—Heads vs. Tails, Monday vs. Tuesday.
I think I might understand the interpretation that a halfer puts on the question. I’m just doubtful of its interest or relevance. Do you see any validity (I mean logical coherence, as opposed to wrong-headedness) to this interpretation? Is this just a turf war over who gets to define a coveted word for their purposes?
Consider the case of Sleeping Beauty with an absent-minded experimenter.
If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.
If the coin comes up Tails, there is a tiny but non-zero chance that the experimenter mixes up Tails and Heads.
The resulting scenario is represented in a new sheet, Fuzzy two-day, of my spreadsheet document.
Under these assumptions, Beauty may no longer rule out Tuesday & Heads. She has no justification to assign all of the Heads probability mass to Monday & Heads. She is therefore constrained to conditioning on being woken in the way that the usual two-day variant suggests she should, and ends up with a credence arbitrarily close to 1⁄3 if we make the “absent-minded” probability tiny enough.
Why should we get a discontinuous jump to 1⁄2 as this becomes zero?
This sounds like the continuity argument, but I’m not quite clear on how the embedding is supposed to work, can you clarify? Instead of telling me what the experimenter rightly or wrongly believes to be the case, spell out for me how he behaves.
What does this mean operationally? Is there a nonzero chance, let’s call it epsilon or e, that the experimenter will incorrectly behave as if it’s Tuesday when it’s Monday? I.e., with probability e, Beauty is not awoken on Monday, the experiment ends, or is awoken and sent home, and we go on to next Sunday evening without any awakenings that week? Then Heads&Tuesday still with certainty does not occur. So maybe you meant that on Monday, he doesn’t awaken Beauty at all, but awakens her on Tuesday instead? Is this confusion persistent across days, or is it a random confusion that happens each time he needs to examine the state of the coin to know what he should do?
And on Tuesday
So when the coin comes up Tails, there is a nonzero probability, let’s call it delta or d, that the experimenter will incorrectly behave as if it’s Heads? I.e., on Tuesday morning, he will not awaken Beauty or will wake her and send her home until next Sunday? Then Tails&Tuesday is a possible nonoccurrence.
On reflection, my verbal description doesn’t match the reply I wanted to give, which was: the experimenter behaves such that the probability mass is allocated as in the spreadsheet.
Make it “on any day when Beauty is scheduled to remain asleep, the experimenter has some probability of mistakenly waking her, and vice-versa”.
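The spreadsheet is not reproduced here, so the following is only a guess at the kind of table it contains. It assumes a prior of 1⁄4 on each (day, coin) cell, as in the usual two-day table, and a single error rate `eps` with which the experimenter does the opposite of what the schedule says. The names `p_heads_given_woken` and `eps`, and the use of one shared error rate, are assumptions for illustration.

```python
from fractions import Fraction as F

def p_heads_given_woken(eps):
    """P(Heads | woken) when each scheduled action is inverted with probability eps."""
    # True means Beauty is scheduled to be woken in that (day, coin) cell.
    scheduled_awake = {
        ("Mon", "H"): True,
        ("Tue", "H"): False,
        ("Mon", "T"): True,
        ("Tue", "T"): True,
    }
    prior = F(1, 4)  # assumed equal prior weight on each (day, coin) cell
    woken = {cell: prior * ((1 - eps) if awake else eps)
             for cell, awake in scheduled_awake.items()}
    heads = sum(w for (day, coin), w in woken.items() if coin == "H")
    return heads / sum(woken.values())

for eps in (F(1, 10), F(1, 100), F(1, 1000), F(0)):
    print(eps, p_heads_given_woken(eps))
```

Under these assumptions the result works out to 1/(3 − 2·eps), which approaches 1⁄3 smoothly as `eps` goes to zero, so any jump to 1⁄2 at exactly zero would have to come from a different way of assigning the table.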
This is interesting. We shouldn’t get a discontinuous jump.
Consider 2 related situations:
if Heads she is woken up on Monday, and the experiment ends on Tuesday. If tails, she is woken up on Monday and Tuesday, and the experiment ends on Wed. In this case, there is no ‘not awake’ option.
If heads she is woken up on Monday and Tuesday. On Monday she is asked her credence for heads. On Tuesday she is told “it’s Tuesday and heads” (but she is not asked about her credence; that is, she is not interviewed). If tails, it’s the usual woken up both days and asked about her credence. The experiment ends on Wed.
In both of these scenarios, 50% of coin flips will end up heads. In both cases, if she’s interviewed she knows it’s either Monday&heads, Monday&tails or Tuesday&tails. She has no way of telling these three options apart, due to the amnesia.
I don’t think we should be getting different answers in these 2 situations. Yet, I think if we use your probability distributions we do.
I think there are two basic problems. One is that Monday&tails is really not different from Tuesday&tails. They are the same variable. It’s the same experience. If she could time travel and repeat the Monday waking it would feel the same to her as the Tuesday waking. The other issue is that, even though in my scenario 2 above, when she is woken but before she knows if she will be interviewed, it would look like there is a 25% chance it’s heads&Monday and a 25% chance it’s heads&Tuesday. And that’s probably a reasonable way to look at it. But, that doesn’t imply that, once she finds out it’s an interview day, that the probability of heads&Monday shifts to 1⁄3. That’s because on 50% of coin flips she will experience heads&Monday. That’s what makes this different than a usual joint probability table representing independent events.
My reasoning has been to consider scenario 1 from the perspective of an outside observer, who is uncertain about each variable: a) whether it is Monday or Tuesday, b) how the coin came up, c) what happened to Beauty on that day.
To that observer, “Tuesday and heads” is definitely a possibility, and it doesn’t really matter how we label the third variable: “woken”, “interviewed”, whatever. If the experiment has ended, then that’s a day where she hasn’t been interviewed.
If the outside observer learns that Beauty hasn’t been interviewed today, then they may conclude that it’s Tuesday and that the coin came up heads, thus a) they have something to update on and b) that observer must assign probability mass to “Tuesday & Heads & not interviewed”.
If the outside observer learns that Beauty has been interviewed, it seems to me that they would infer that it’s more likely, given their prior state of knowledge, that the coin came up heads.
To the outside observer, scenario 2 isn’t really distinct from scenario 1. The difference only makes a difference to Beauty herself.
However, I see no reason to treat Beauty herself differently than an outside observer, including the possibility of updating on being interviewed or on not being interviewed.
So, if my probability tables are correct for an outside observer, I’m pretty sure they’re correct for Beauty.
(My confidence in the tables themselves, however, has been eroded a little by my not being able to calculate Beauty, or an observer, updating on a new piece of information in the “fuzzy” variant, e.g. using P(heads|woken) as a prior probability and updating on learning that it is in fact Tuesday. It seems to me that for the math to check out requires that this operation should recover the “absent-minded experimenter” probability for “tuesday & heads & woken”. But I’m having a busy week so far and haven’t had much time to think about it.)
Why is that a problem? Why would N have to be equal to n1+n2+n3? Only because it does in your other example?
(ETA) I’m not sure where your formula “lim N-> infinity n1/(n1+n2+n3)” comes from; as the third example shows, it just doesn’t work in all cases. That doesn’t mean that your alternative formula is better in the sleeping beauty case.
Because this, lim N-> infinity n1/(n1+n2+n3), is p1 if the counts are from independent draws of a multinomial distribution.
We have outcome-dependent sampling here. Is lim N-> infinity n1/(n1+n2+n3) equal to p1 in that case? I’d like to see the statistical theory to back up the claim. It’s pretty clear to me that people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.
(ETA) The formula “lim N-> infinity n1/(n1+n2+n3)” is what people who claim the answer is 1⁄3 are using to justify it. The 1⁄2 solution just uses probability laws. That is, P(H)=1/2. P(W)=1, where W is the event that Beauty has been awakened. Therefore, P(H|W)=1/2.
I’ll have to disagree with that—there is a pretty clear interpretation in which 1⁄3 is a “correct” answer: if the sleeping beauty is asked to bet X dollars that heads came up, and wins $60 if she’s right, for up to which values X should she accept the bet? (if the coin comes up tails, she gets the possibility to bet twice)
In that scenario, X=$20 is the right answer, which corresponds to a probability of 1⁄3. Do you agree with that? (I haven’t read all the threads, you probably addressed this somewhere)
See, here I’m not using any “lim N-> infinity n1/(n1+n2+n3)” , so I feel you’re being unfair to 1/3rders.
I don’t agree, because the question is about her subjective probability at an awakening. The betting question you described is a different one.
For example, suppose I flip a coin and tell you you will win $60 if heads came up, but I require that you make the bet twice if tails came up? You’d be willing to bet up to $20, but that doesn’t mean you think heads has probability 1⁄3. If Beauty really thinks heads has probability 1⁄3, she’d be willing to accept the bet only up to $20 even if we told her that we’d only accept one bet (of course, we wouldn’t tell her that she’s already made a bet on Tuesday. Payout would be on Wed).
The wikipedia page for the sleeping beauty problem says:
That’s why I think people are picturing counts in a contingency table when they come up with the 1⁄3 answer.
She would also bet at an awakening. If you ask her to bet when she just woke up, it would seem weird that she would say “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to $20, that is, 1⁄3 of the winnings if it’s heads.”
It seems even weirder in the Xtreme Sleeping Beauty, where she’s awakened a thousand times : “my subjective probability for heads is 1⁄2, but I’m only willing to bet up to 6 cents”.
Yes, you get a different result if you change the betting rules where only one bet per “branch” counts, but I don’t see why that’s closer to the problem as originally stated.
I guess I don’t see why it’s weird. The number of times she will bet is dependent on the outcome. So, even though at each awakening she thinks probability of heads is 1⁄2, she knows if it’s tails she’ll have to bet many more times than if heads. We’re essentially just making her bet more money on a loss than on a win.
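The stakes mentioned in this exchange can be checked with a short break-even calculation. The sketch assumes "wins $60" means a gross payout of $60 against a stake of X, so a winning bet nets 60 − X and a losing bet costs X. It also assumes the number of bets placed on the tails branch is fixed in advance. The function name and parameters are illustrative.

```python
from fractions import Fraction as F

def break_even_stake(p_heads, bets_if_tails, payout=60):
    """Largest stake X with non-negative expected gain.

    Expected gain = p*(payout - X) - (1 - p)*bets_if_tails*X, set to zero.
    """
    p = F(p_heads)
    return p * payout / (p + (1 - p) * bets_if_tails)

# Credence 1/2 in heads, but the bet is placed twice when tails comes up.
print(break_even_stake(F(1, 2), 2))            # 20
# Credence 1/2, only one bet counted per branch.
print(break_even_stake(F(1, 2), 1))            # 30
# Credence 1/3, only one bet counted per branch.
print(break_even_stake(F(1, 3), 1))            # 20
# Credence 1/2, a thousand awakenings (and bets) on tails.
print(float(break_even_stake(F(1, 2), 1000)))  # about 0.06, i.e. 6 cents
```

The first and third lines give the same stake, which is why the per-awakening bet alone cannot distinguish "credence 1⁄3" from "credence 1⁄2 with twice the exposure on tails".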
In that case, what does it even mean to say “my subjective probability for heads is 1/2”? Subjective probability is often described in terms of bettings—see here.
Seems to me this is mostly a quarrel of definitions, and that when you say “people who believe the answer is 1⁄3 pictured counts in a 3 by 1 contingency table, and applied the wrong theory to it.”, you’re being unfair. They’re just using a different definition of “subjective probability”
Don’t you think so?
Based on my interaction with people here, I think we all are talking about the same thing when it comes to subjective probability.
I agree that you can use betting to describe subjective probability, but there are a lot of possible ways to bet.
“Subjective probability” is a basic term in decision theory and economics, though. If you want to roll your own metric, surely you should call it something else—to avoid much confusion.
That is why I’d rather talk in terms of bets than subjective probability—they don’t require precise technical definitions.
What is supposed to happen, in “Probabilistic Sleeping Beauty”, if the coin comes up heads and the die doesn’t come up k?
You’re woken with a big sign in front of you saying “the experiment is over now”, or however else you wish to allow sleeping beauty to distinguish the experimental wakings from being allowed to go about her normal life.
Failing that, you are never woken; it shouldn’t make any difference, as long as waking to leave is clearly distinguished from being woken for the experiment.
Wait, I didn’t catch this the first time:
“using the 1⁄3 answer and working back to try to find P(W) yields P(W) = 3⁄2, which is a strong indication that it is not the probability that matters”
No. It’s proof that your solution is wrong.
And I know exactly why your solution is wrong. You came up with P(Monday|W) using a ratio of expected counts, but you relied on an assumption that trials are independent. Here, the coin flips are independent but the counts are not. Even though you are using three counts, there is just one degree of freedom. Vladimir Nesov got it right, I think, when he said “(Tuesday, tails) is the same event as (Monday, tails)”.
The last update in my sleeping beauty post explains the problem in more detail.
Of course P(W) isn’t bound within [0,1]; W is one of any number of events, in this case 2: P(You will be woken for the first time) = 1; P(You will be woken a second time) = 1⁄2. The fact that natural language and the phrasing of the problem attempts to hide this as “you wake up” is not important. That is why P(W) is apparently broken; it double counts some futures, it is the expected number of wakings. This is why I split into conditioning on waking on Monday or Tuesday.
(Tuesday, tails) is not the same event as (Monday, tails). They are distinct queries to whatever decision algorithm you implement; there are any number of trivial means to distinguish them without altering the experiment (say, “we will keep you in a red room on one day and a blue one on the other, with the order to be determined by a random coin flip”).
They are strongly correlated events, granted. If either occurs, so will the other. That does not make them the same event. On your argumentation, you would assert confidently that the coin is fair beforehand, yet also assert that the conditional probability that you wake on Monday depends on the coin flip, when in either branch you are woken then with probability 1.
If P(H) and P(H|W) are probabilities, then it must be true that:
P(H)=P(H|W)P(W)+P(H|~W)P(~W), where ~W means not W (any other event), by the law of total probability
If P(H)=1/2 and P(H|W)=1/3, as you claim, then we have
1/2=1/3P(W)+P(H|~W)(1-P(W))
P(H|~W) should be 0, since we know she will be awakened if heads. But that leads to P(W)=3/2.
P(W) should be 1, but that leads to an equation 1/2=1/3
So, this is a big mess.
The reason it is a big mess is because the 1⁄3 solution was derived by treating one random variable as two.
I already addressed this elsewhere. The problem is that W is not a boolean, it’s a probability distribution over observer moments, so P(W) and P(~W) are undefined (type errors).
At one point in your post you said “For convenience let us say that the event W is being woken” and then later on you suggest W is something else, but I don’t see where you really defined it.
You’re saying W itself is a probability distribution. What probability distribution? Can you be specific?
P(H) and P(H|W) are probabilities. It’s unclear to me how those can be well defined, but the law of total probability doesn’t apply.
Suppose we write out SB as a world-program:
This notation is from decision theory; S is sleeping beauty’s chosen strategy, a function which takes as arguments all the observations, including memories, which sleeping beauty has access to at that point, and returns the value of any decision SB makes. (In this case, the scenario doesn’t actually do anything with SB’s answers, so the program ignores them.)
An observer-moment is a complete state of the program at a point where S is executed, including the arguments to S. Now, take all the possible observer-moments, weighted by the expected number of times that a given run of SleepingBeauty contains that observer moment. To condition on some information, take the subset of those observer-moments which match that information. So, P(coin=heads|I=”you just woke up”) means, of all the calls to S where I=”you just woke up”, weighted by probability of occurrence, what fraction of them are on the heads branch? This is 1⁄3. On the other hand, P(coin=heads|I=”the experiment’s over now”)=1/2.
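The world-program itself did not survive in this copy of the thread, so the following is a hypothetical stand-in rather than the original code. It lists, for each branch, the information strings that would be passed to S, and then conditions by weighting each call by the expected number of times it occurs. The names `sleeping_beauty_runs` and `p_coin_given_info` are assumptions.

```python
from fractions import Fraction as F

def sleeping_beauty_runs():
    """Yield (probability, coin, information passed to S at each call) per branch."""
    yield F(1, 2), "heads", ["you just woke up",
                             "the experiment's over now"]
    yield F(1, 2), "tails", ["you just woke up",
                             "you just woke up",
                             "the experiment's over now"]

def p_coin_given_info(coin, info):
    """Fraction of matching calls to S, weighted by expected count, on that branch."""
    matching = F(0)
    matching_on_branch = F(0)
    for prob, branch_coin, calls in sleeping_beauty_runs():
        weight = prob * calls.count(info)  # expected number of matching calls
        matching += weight
        if branch_coin == coin:
            matching_on_branch += weight
    return matching_on_branch / matching

print(p_coin_given_info("heads", "you just woke up"))           # 1/3
print(p_coin_given_info("heads", "the experiment's over now"))  # 1/2
```

Conditioning here is defined over calls to S, not over runs of the program, which is why the two answers differ.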
“Of course P(W) isn’t bound within [0,1]”
Of course! (?) You derived P(W) using probability laws, i.e., solving for it in this equation: P(H)=P(H|W)P(W), where P(H)=1/2 and P(H|W)=1/3. These are probabilities. And your 1⁄3 solution proves there is an error.
If two variables have correlation of 1, I think you could argue that they are the same (they contain the same quantitative information, at least).
No. You will wake on Monday with probability one. But, on a randomly selected awakening, it is more likely that it’s Monday&Heads than Monday&Tails, because you are on the Heads path on 50% of experiments
What is this random selection procedure you use in the last para?
(“I select an awakening, but I can’t tell which” is the same statement as “Each awakening has probability 1/3″ and describes SB’s epistemic situation.)
Random doesn’t necessarily mean uniform. When Beauty wakes up, she knows she is somewhere on the heads path with probability .5, and somewhere on the tails path with probability .5. If tails, she also knows it’s either Monday or Tuesday, and from her perspective, she should treat those days as equally likely (since she has no way of distinguishing). Thus, the distribution from which we would select an awakening at random has probabilities 0.5, 0.25 and 0.25.
This appears to be where you are getting confused. Your probability tree in your post was incorrect. It should look like this:
If you think about writing a program to simulate the experiment this should be obvious.
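One possible version of such a simulation is sketched below. It only counts tosses and interviews; the function name, the seed, and the number of runs are arbitrary choices. Whether the per-interview frequency is the quantity that "credence" refers to is the point under dispute, and the code does not decide that.

```python
import random

def simulate(n_experiments, seed=0):
    """Run the standard experiment repeatedly and count tosses and interviews."""
    rng = random.Random(seed)
    heads_tosses = 0
    heads_interviews = 0
    tails_interviews = 0
    for _ in range(n_experiments):
        if rng.random() < 0.5:
            heads_tosses += 1
            heads_interviews += 1   # Heads: one interview, on Monday
        else:
            tails_interviews += 2   # Tails: interviews on Monday and Tuesday
    return heads_tosses, heads_interviews, tails_interviews

n = 100_000
heads_tosses, heads_interviews, tails_interviews = simulate(n)
print("fraction of tosses that are heads:    ", heads_tosses / n)
print("fraction of interviews that are heads:",
      heads_interviews / (heads_interviews + tails_interviews))
```

With a fair coin the first fraction should come out near 1⁄2 and the second near 1⁄3, since each tails toss contributes two interviews.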
No, because my probability tree was meant to reflect how beauty should view the probabilities at the time of an awakening. From that perspective, your tree would be incorrect (as two awakenings cannot happen at one time)
After the 1000 experiments, you divided 500 by 2 - getting 250. You should have multiplied 500 by 2 - getting 1000 tails observations in total. It seems like a simple-enough math mistake.
No, that’s not what I did. I’ll assume that you are smart enough to understand what I did, and I just did a poor job of explaining it. So I don’t know if it’s worth trying again. But basically, my probability tree was meant to reflect how Beauty should view the state of the world on an awakening. It was not meant to reflect how data would be generated if we saw the experiment through to the end. I thought it would be useful. But you can scrap that whole thing and my other arguments hold.
Well you did divide 500 by 2 - getting 250. And you should have multiplied the 500 tails events by 2 (the number of interviews that were conducted after each “tails” event) - getting 1000 “tails” interviews in total. 250 has nothing to do with this problem.
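The interview count being argued over can be tallied directly. This is only a minimal sketch, assuming an idealised batch of 1000 runs that splits exactly 500/500; the variable names are illustrative. It counts interviews and nothing more; whether the resulting share is the right credence is the point under dispute in this thread.

```python
from fractions import Fraction

N_EXPERIMENTS = 1000                      # assumption: idealised exact 50/50 split
heads_runs = N_EXPERIMENTS // 2
tails_runs = N_EXPERIMENTS - heads_runs

heads_interviews = heads_runs * 1         # one interview (Monday) per heads run
tails_interviews = tails_runs * 2         # two interviews (Monday, Tuesday) per tails run
total_interviews = heads_interviews + tails_interviews

# Share of all interviews that take place after a heads flip.
heads_share = Fraction(heads_interviews, total_interviews)
print(heads_interviews, tails_interviews, total_interviews, heads_share)
```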
No, P(H)=P(H|W)P(W) is incorrect because the W in P(H|W) is different than the W in P(W): the former is a probability distribution over a set of three events, while the latter is a boolean. Using the former definition, as a probability distribution, P(W) isn’t meaningful at all, it’s just a type error.
It isn’t a probability; the only use of it was to note the method leading to a 1⁄2 solution and where I consider it to fail, specifically because the number of times you are woken is not bound in [0,1] and thus “P(W)” as used in the 1⁄2 conditioning is malformed, as it doesn’t keep track of when you’re actually woken up. In as much as it is anything, using the 1⁄2 argumentation, “P(W)” is the expected number of wakings.
Sorry, but if we’re randomly selecting a waking then it is not true that you’re on the heads path 50% of the time. In a pair of runs, one head, one tail, you are woken 3 times, twice on the tails path.
On a randomly selected run of the experiment, there is a 1⁄2 chance of being in either branch, but “choose a uniformly random waking in a uniformly chosen random run” is not the same as “choose a uniformly random waking”.
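The difference between the two selection procedures can be illustrated with a small Monte Carlo sketch. The function names, the seed and the sample sizes are assumptions chosen for illustration, not anything specified in the thread. It shows only that the two procedures sample different distributions, not which one Beauty ought to use.

```python
import random

def simulate_runs(n_runs, rng):
    """Return a list of runs; each run is the list of (coin, day) wakings in it."""
    runs = []
    for _ in range(n_runs):
        if rng.random() < 0.5:
            runs.append([("H", "Mon")])
        else:
            runs.append([("T", "Mon"), ("T", "Tue")])
    return runs

def heads_fraction_uniform_waking(runs, n_samples, rng):
    """Pool every waking from every run, then sample wakings uniformly."""
    wakings = [w for run in runs for w in run]
    hits = sum(rng.choice(wakings)[0] == "H" for _ in range(n_samples))
    return hits / n_samples

def heads_fraction_run_then_waking(runs, n_samples, rng):
    """Sample a run uniformly, then sample a waking uniformly within that run."""
    hits = sum(rng.choice(rng.choice(runs))[0] == "H" for _ in range(n_samples))
    return hits / n_samples

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
runs = simulate_runs(100_000, rng)
uniform_waking = heads_fraction_uniform_waking(runs, 100_000, rng)
run_then_waking = heads_fraction_run_then_waking(runs, 100_000, rng)
print(f"uniform over wakings:       {uniform_waking:.3f}")
print(f"uniform run, then a waking: {run_then_waking:.3f}")
```

The first figure should land near 1/3 and the second near 1/2, which are the two answers being argued over here.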
Why are you using the notation P(W) when you mean E(W)? And if you can get an expectation for it, you must know the probability of it.
Randomly selecting a waking does not imply a uniform distribution. On the contrary, we know the distribution is not uniform.
Nitpick:
Change the second 300 to 600.
EDIT: Oops. Brain malfunction. Never mind.
“Under these numbers, the 1000 observations made have required 500 heads and 250 tails, as each tail produces both an observation on Monday and Tuesday. ”
I must have been unclear in explaining my probability tree. The tree represents how Beauty should view things on an awakening. I thought it would be helpful. Apparently it just created more confusion (although some people got it).
“P(Monday|W) = 2/3”
Why? I believe it is 0.75. How did you come up with 2/3?
P(Monday ∩ H | W) = P(Monday ∩ T | W). Regardless of whether the coin came up heads or tails you will be woken on Monday precisely once.
P(Monday ∩ T | W) = P(Tuesday ∩ T | W), because if tails comes up you are surely woken on both Monday and Tuesday.
You still seem to be holding on to the claim that there are as many observations after a head as after a tail; this is clearly false. There isn’t a half measure of observation to spread across the tails branch of the experiment; this is made clearer in Sleeping Twins and the Probabilistic Sleeping Beauty problems.
Once Sleeping Beauty is normalised so that there is at most one observation per “individual” in the experiment, it seems far harder to justify the 1⁄2 answer. The fact of the matter is that your use of P(W) = 1 is causing grief, as on these problems you should consider E(#W) instead, because P(W) is not linear.
What is your credence in the Probabilistic Sleeping Beauty problem?
Probabilistic sleeping beauty
P(H|W)=1/21
Now, let’s change the problem slightly.
The experimenters fix m unique constants, k1,...,km, each in {1,2,..,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1,...,km. If the coin comes up heads and the D20 that comes up is in {k1,...,km}, they will wake you on day 1.
Here, P(H|W)=m/(20+m)
If m is 1 we get 1⁄21.
If m is 20 we get 1⁄2, which is the solution to the sleeping beauty problem.
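As a check on the arithmetic, the formula above follows from Bayes’ rule if W is read as “woken at least once during the run”. The sketch below assumes that reading; the function name and the `sides` parameter are illustrative.

```python
from fractions import Fraction

def p_heads_given_woken_at_least_once(m, sides=20):
    """Bayes' rule with W read as 'woken at least once in the run' (assumed reading)."""
    p_h = p_t = Fraction(1, 2)            # fair coin
    p_w_given_h = Fraction(m, sides)      # heads: woken only if the die hits one of the m constants
    p_w_given_t = Fraction(1)             # tails: always woken at least once
    return (p_w_given_h * p_h) / (p_w_given_h * p_h + p_w_given_t * p_t)

for m in (1, 2, 20):
    print(m, p_heads_given_woken_at_least_once(m))
```

For m = 1, 2 and 20 this gives 1/21, 1/11 and 1/2, matching m/(20+m).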
The point of the PSB problem is that the approach you’ve just outlined is indefensible.
You agree that for each single constant k_i P(H|W) = 1⁄21. Uncertainty over which constant k_i is used does not alter this.
So if I run PSB 20 times, you would assert in each run that P(H|W) = 1⁄21. So now I simply keep you sedated between experiments. Statistically, 20 runs yields you SB, and each time you answered with 1⁄21 as your credence. Does this not faze you at all?
You have a scenario A where you assert foo with credence P, and scenario B where you also assert foo with credence P, yet if I put you in scenario A and then scenario B, keeping you sedated in the meantime, you do not assert foo with credence P...
Jonathan,
In this problem:
Do you agree that P(H|W)=m/(20+m) ? If not, why not?
Do you also agree that when m=20 we have the sleeping beauty problem (with 20 wake ups instead of 2 for tails)? If not, why not?
No. I assert P(H|W) = 1⁄21 in this case.
Two ways of seeing this: Either calculate the expected number of wakings conditional on the coin flip (m/20 and m for H and T). [As in SB]
Alternatively consider this as m copies of the single constant game, with uncertainty on each waking as to which one you’re playing. All m single constant games are equally likely, and all have P(H|W) = 1⁄21. [The hoped for PSB intuition-pump]
I need more clarification. Sorry. I do think we’re getting somewhere...
The experimenters fix 2 unique constants, k1,k2, each in {1,2,..,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1 and k2. If the coin comes up heads and the D20 that comes up is in {k1,k2}, they will wake you on day 1.
Do you agree that P(H|W)=2/22 in this case?
I do.
No; P(H|W) = 1⁄21
Multiple ways to see this: 1) Under heads, I expect to be woken 1⁄10 of the time. Under tails, I expect to be woken twice. Hence on the average for every waking after a head I am woken 20 times after a tail. Ergo 1⁄21.
2) Internally split the game into 2 single constant games, one for k1 and one for k2. We can simply play them sequentially (with the same die roll). When I am woken I do not know which of the two games I am playing. We both agree that in the single constant game P(H|W) = 1⁄21.
It’s reasonably clear that playing two single constant games in series (with the same die roll and coin flip) reproduces the 2 constant game. The correlation between the roll and flip in the two games doesn’t affect the expectations, and since you have complete uncertainty over which game you’re in (c/o amnesia), the correlation of your current state with a state you have no information on is irrelevant.
P(H|W ∩ game i) = 1⁄21, so P(H|W) = 1⁄21, as the union over all i of (W ∩ game i) is W. At some level this is why I introduced PSB, it seems clearer that this should be the case when the number of wakings is bounded to 1.
3) Being woken implies either W1 or W2 (currently being woken for the first time or the second time) has occurred. In general note that the expected count of something is a probability (and vice versa) if the number of times the event occurs is in {0,1} (trivial using the frequentist def of probability; under the credence view it’s true for betting reasons).
P(W1 | H) = 1⁄10, P(W2 | H) = 0, P(W1 | T) = 1, P(W2 | T) = 1, from the experimental setup.
Hence P(H|W1) = 1⁄11, P(H|W2) = 0. You’re woken in 11⁄20 of experiments for the first time and in 1⁄2 of experiments for the second, so P(W1 | I am woken) = 11⁄21.
P(H | I am woken ) = P(H ∩ W1 | I am woken ) + P(H ∩ W2 | I am woken ) = P(H | W1 ∩ I am woken).P(W1 | I am woken) + 0 = 1⁄11 . 11⁄21 = 1⁄21.
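The W1/W2 decomposition can be checked with exact fractions. This sketch simply transcribes the conditional probabilities stated above for the two-constant game; the variable names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)                     # fair coin: P(H) = P(T) = 1/2

# Conditional probabilities of a first (W1) and second (W2) waking, from the setup.
p_w1_given_h, p_w2_given_h = Fraction(1, 10), Fraction(0)
p_w1_given_t, p_w2_given_t = Fraction(1), Fraction(1)

# Fraction of experiments in which each waking occurs.
freq_w1 = half * p_w1_given_h + half * p_w1_given_t    # 11/20
freq_w2 = half * p_w2_given_h + half * p_w2_given_t    # 1/2

# Bayes' rule within each waking, where the event occurs at most once per run.
p_h_given_w1 = half * p_w1_given_h / freq_w1           # 1/11
p_h_given_w2 = half * p_w2_given_h / freq_w2           # 0

# Weight of each waking among all wakings, then total probability.
p_w1_given_woken = freq_w1 / (freq_w1 + freq_w2)       # 11/21
p_w2_given_woken = 1 - p_w1_given_woken
p_h_given_woken = (p_h_given_w1 * p_w1_given_woken
                   + p_h_given_w2 * p_w2_given_woken)
print(p_h_given_w1, p_w1_given_woken, p_h_given_woken)
```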
The issues you’ve raised with this seem to be that you would either: set P(W1 | I am woken) = 1, or set P(W1 | T) = P(W2 | T) = 1⁄2 [ so P(H|W1) = 1⁄6 ], and set P(W1 | I am woken) = 6⁄11.
My problem with this is that if P(W1 | I am woken) =/= 11⁄21, you’re poorly calibrated. Your position appears to be that this is because you’re being “forced to make the bet twice in some circumstances but not others”. Hence what you’re doing is clipping the number of times a bet is made to {0,1}, at which point expectation counts of number of outcomes are probabilities of outcomes. I think such an approach is wrong, because the underlying problem is that the counts of event occurrences conditional on H or T aren’t constrained to be in {0,1} anymore. This is why I’m not concerned about the “probabilities” being over-unity. Indeed you’d expect them to be over-unity, because the long run number of wakings exceeds the long run number of experiments. In the limit you get well defined over unity probability, under the frequentist view. Betting odds aren’t constrained in [0,1] either, so again you wouldn’t expect credence to stay in [0,1]. It is bounded in [0,2] in SB or your experiment, because the maximum number of winning events in a branch is 2.
As I see it, the 1⁄21 answer (or 1⁄3 in SB) is the only plausible answer because it holds when we stack up multiple runs of the experiment in series or equivalently have uncertainty over which constant is being used in PSB. The 1⁄11 (equiv. 1⁄2) answer doesn’t have this property, as is seen from 1⁄21 going to 1⁄11 from nothing but running two experiments of identical expected behaviour in series...
Credence isn’t constrained to be in [0,1]???
It seems to me that you are working very hard to justify your solution. It’s a solution by argument/intuition. Why don’t you just do the math?
I just used Bayes rule. W is an awakening. We want to know P(H|W), because the question is about her subjective probability when (if) she is woken up.
To get P(H|W), we need the following:
P(W|H)=2/20 (if heads, wake up if D20 landed on k1 or k2)
P(H)=1/2 (fair coin)
P(W|T)=1 (if tails, woken up regardless of result of the D20 roll)
P(T)=1/2 (fair coin)
Using Bayes rule, we get:
P(H|W)=(2/20)(1/2) / [(2/20)(1/2)+(1)*(1/2)] = 1⁄11
With your approach, you avoid directly applying Bayes’ theorem, and you argue that it’s ok for credence to be outside of [0,1]. This suggests to me that you are trying to derive a solution that matches your intuition. My suggestion is to let the math speak, and then to figure out why your intuition is wrong.
You and I both agree on Bayes implying 1⁄21 in the single constant case. Considering the 2 constant game as 2 single constant games in series, with uncertainty over which one (k1 and k2 the mutually exclusive “this is the k1/k2 game”)
P(H | W) = P(H ∩ k1|W) + P(H ∩ k2|W) = P(H | k1 ∩ W)P(k1|W) + P(H|k2 ∩ W)P(k2|W) = 1⁄21 . 1⁄2 + 1⁄21 . 1⁄2 = 1⁄21
This is the logic that to me drives PSB to SB and the 1⁄3 solution. I worked it through in SB by conditioning on the day (slightly different but not substantially).
I have had a realisation. You work directly with W, I work with subsets of W that can only occur at most once in each branch and apply total probability.
Formally, I think what is going on is this: (Working with simple SB) We have a sample space S = {H,T}
“You have been woken” is not an event, in the sense of being a set of experimental outcomes. “You will be woken at least once” is, but these are not the same thing.
“You will be woken at least once” is a nice straightforward event, in the sense of being a set of experimental outcomes {H,T}. “You have been woken” should be considered formally as the multiset {H,T,T}. Formally just working through with multisets wherever sets are used as events in probability theory, we recover all of the standard theorems (including Bayes) without issue.
What changes is that since P(S) = 1, and there are multisets X such that X contains S, P(X) > 1.
Hence P({H,T,T}) = 3⁄2; P({H}|{H,T,T}) = 1⁄3.
In the 2 constant PSB setup you suggest, we have S = {H,T} x {1,..,20} W = {(H,k1),(H,k2), (T,1),(T,1),(T,2),(T,2),....,(T,20),(T,20)}
And P(H|W) = 1⁄21 without issue.
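The multiset bookkeeping can be made concrete with `collections.Counter`. This is a sketch of one way to implement the idea, assuming the measure of a multiset is the multiplicity-weighted sum of outcome probabilities; the helper names and the particular constants chosen for k1 and k2 are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def multiset_measure(multiset, outcome_prob):
    """Sum outcome probabilities weighted by multiplicity (may exceed 1)."""
    return sum(count * outcome_prob[o] for o, count in multiset.items())

def conditional(pred, multiset, outcome_prob):
    """Measure of the sub-multiset satisfying pred, divided by the whole measure."""
    restricted = Counter({o: c for o, c in multiset.items() if pred(o)})
    return multiset_measure(restricted, outcome_prob) / multiset_measure(multiset, outcome_prob)

# Simple Sleeping Beauty: S = {H, T}, W = {H, T, T}.
sb_prob = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
sb_wakings = Counter({"H": 1, "T": 2})
sb_measure = multiset_measure(sb_wakings, sb_prob)               # 3/2
sb_heads = conditional(lambda o: o == "H", sb_wakings, sb_prob)  # 1/3

# Two-constant PSB: S = {H, T} x {1..20}; K1, K2 are hypothetical example constants.
K1, K2 = 3, 7
psb_prob = {(c, d): Fraction(1, 40) for c, d in product("HT", range(1, 21))}
psb_wakings = Counter()
for k in (K1, K2):
    psb_wakings[("H", k)] += 1            # heads: one waking if the die hits k1 or k2
for d in range(1, 21):
    psb_wakings[("T", d)] += 2            # tails: two wakings whatever the die shows
psb_heads = conditional(lambda o: o[0] == "H", psb_wakings, psb_prob)  # 1/21
print(sb_measure, sb_heads, psb_heads)
```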
My statement is that this more accurately represents the experimental setup; when you wake, conditioned on all background information, you don’t know how many times you’ve been woken before, but this changes the conditional probabilities of H and T. If you merely use background knowledge of “You have been woken at least once”, and squash all of the events “You are woken for the nth time” into a single event by using union on the events, then you discard information.
This is closely related to my earlier (intuition) that the problem was something to do with linearity.
In sets, union and intersection are only linear when working on some collection of atomic sets, but are generally linear in multisets. [eg. (A ∪ B) \ B ≠ A in general in sets]
Observe that the approach I take of splitting “events” down to disjoint things that occur at most once is precisely taking a multiset event apart into well behaved events and then applying probability theory.
What was concerning me is that the true claim that P({H,T}|T) = 1 seemed to discard pertinent information (ie the potential for waking on the second day). With W as the multiset {H,T,T}, P(W|T) = 2. You can regard this as expectation number of times you see Tails, or the extension of probability to multisets.
The difference in approach is that you have to put the double counting of waking given tails in as a boost to payoffs given Tails, which seems odd as from the point of view of you having just been woken you are being offered immediate take-it-or-leave-it odds. This is made clearer by looking at the twins scenario; each person is offered at most one bet.
You just changed the problem. If you wake me up between runs of PSB, then P(H|W)=1/21 each time. If not, I have different information to condition on.
No; between sedation and amnesia you know nothing but the fact that you’ve been woken up, and that 20 runs of this experiment are to be performed.
Why would an earlier independent trial have any impact on you or your credences, when you can neither remember it nor be influenced by it?
I don’t know. It’s a much more complicated problem, because you have 20 coin flips (if I understand the problem correctly). I haven’t taken the time to work through the math yet. It’s not obvious to me, though, why this corresponds to the sleeping beauty problem. In fact, it seems pretty clear that it doesn’t.
The reason it corresponds to Sleeping Beauty is that in the limit of a large number of trials, we can consider blocks of 20 trials where heads was the flip and all values of the die roll occurred, and similar blocks for tails, and have some epsilon proportion left over. (WLLN)
Each of those blocks corresponds to Sleeping Beauty under heads/tails.
No. I never made that claim, so I cannot “hold on to it”. The number of observations after tails doesn’t matter here.
Imagine we repeat the sleeping beauty experiment many times. On half of the experiments, she’d be on the heads path. On half of the experiments, she’d be on the tails path. If she is on the tails path, it could be either Monday or Tuesday. Thus, on an awakening, Monday ∩ T is less likely than Monday ∩ H.
The claim is implied by your logic; the fact that you don’t engage with it does not prevent it from being a consequence that you need to deal with. Furthermore it appears to be the intuition by which you are constructing your models of Sleeping Beauty.
Granted; no contest
And assuredly she will be woken on both days in any given experimental run. She will be woken twice. Both events occur whenever tails comes up. P(You will be woken on Monday | Tails) = P(You will be woken on Tuesday | Tails) = 1
The arrangement that you are putting forward as a model is that Sleeping Beauty is to be woken once and only once regardless of the coin flip, and thus if she could wake on Tuesday given Tails occurred then that must reduce the chance of her waking on Monday given that Tails occurred. However in the Sleeping Beauty problem the number of wakings is not constant. This is the fundamental problem in your approach.
I think we could make faster progress if you started with the assumption that I have read and understood the problem. Yes, I know that she is woken up twice when tails.
You agree that
Given that she is awake right now, what should be her state of mind? Well, she knows if heads it’s Monday. She knows if tails it’s either Monday or Tuesday. The fact that she will be (or has been) woken up on both days doesn’t matter to her right now. It’s either Monday or Tuesday. Given that she cannot distinguish between the two, it would make sense for her to think of those as equally likely at this awakening (under tails). Thus, P(Monday|T,W)=1/2, P(T|W)=1/2, P(Monday ∩ T | W)=1/4.
The problem with your 1⁄3 solution is you treat the data as if they are counts in a 3 by 1 contingency table (the 3 cells being Monday&H, Monday&T, Tuesday&T). If the counts were the result of independent draws from a multinomial distribution, you would get p(H|W)=1/3. You have dependent draws though. You have 1 degree of freedom instead of the usual 2 degrees of freedom. That’s why your ratio is not a probability. That’s why your solution results in nonsense like p(W)=3/2.
As I see it, initially (as a prior, before considering that I’ve been woken up), both Heads and Tails are equally likely, and it is equally likely to be either day. Since I’ve been woken up, I know that it’s not (Tuesday ∩ Heads), but I gain no further information.
Hence the 3 remaining probabilities are renormalised to 1⁄3.
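The renormalisation step can be written out explicitly. This sketch assumes the uniform 1⁄4 prior over (day, coin) cells described above; the cell labels are illustrative.

```python
from fractions import Fraction

# Assumed prior: each (day, coin) cell is equally likely before conditioning on waking.
prior = {(day, coin): Fraction(1, 4)
         for day in ("Mon", "Tue") for coin in ("H", "T")}

# Being awake rules out (Tuesday, Heads); renormalise the remaining three cells.
surviving = {cell: p for cell, p in prior.items() if cell != ("Tue", "H")}
total = sum(surviving.values())
posterior = {cell: p / total for cell, p in surviving.items()}

p_heads = sum(p for (day, coin), p in posterior.items() if coin == "H")
p_monday = sum(p for (day, coin), p in posterior.items() if day == "Mon")
print(posterior, p_heads, p_monday)
```

Under this prior the posterior gives P(H|W) = 1/3 and P(Monday|W) = 2/3. The opposing view in this thread rejects the uniform prior over cells, so the sketch shows what follows from the assumption rather than settling it.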
Alternatively: I wake up; I know from the setup that I will be in this subjective state once under Heads and twice under Tails, and they are a priori equally likely. I have no data that can distinguish between the three states of identical subjective state, so my posterior is uniform over them.
If she knows it’s Tuesday then it’s Tails. If she knows it’s Monday then she learns nothing of the coin flip. If she knows the flip was Tails then she is indifferent to Monday and Tuesday. 1⁄3 drops out as the only consistent answer at that point.
It’s not equally likely to be either day. If I am awake, it’s more likely that it’s Monday, since that always occurs under heads, and will occur on half of tails awakenings.
Heads and tails are equally likely, a priori, yes. It is equally likely that you will be woken up twice as it is that you will be woken up once. Yes. That’s true. But we are talking about your state of mind on an awakening. It can’t be both Monday and Tuesday. So, what should your subjective probability be? Well, I know it’s tails and (Monday or Tuesday) with probability 0.5. I know it’s heads and Monday with probability 0.5.
Before I am woken up, my prior belief is that I spend 24 hours on Monday and 24 on Tuesday regardless of the coin flip. Hence before I condition on waking, my probabilities are 1⁄4 in each cell.
When I wake, one cell is driven to 0, and there is no information to distinguish the remaining 3. This is the point that the sleeping twins problem was intended to illuminate.
Given awakenings that I know to be on Monday, there are two histories with the same measure. They are equally likely. If I run the experiment and count the number of events Monday ∩ H and Monday ∩ T, I will get the same numbers (mod. epsilon errors). Your assertion that it’s H/T with probability 0.5 is false given that you have woken. Hence sleeping twins.
That is Beauty’s probability of which day it is AFTER considering that she has been woken up.