As Wei Dai has said, arguing about which probability is “right” is futile until you have fixed your decision theory and goals that will actually make use of those probabilities to act. In most use-cases of probability theory, such issues don’t come up.
In Sleeping Beauty, you are in a situation where such considerations do matter.
If we further specify that Sleeping Beauty can make a bet and (if she wins) will get the money straight away on Monday, be allowed to spend it straight away on eating a chocolate bar, and then (if the coin came up tails) be put to sleep again, woken up on Tuesday, given the same money again, and allowed to eat another chocolate bar, then she will do best by saying that the probability of tails is 2⁄3.
But if we specify that the money will be put into an account (and she will only be paid once) that she can spend after the experiment is over, which is next week, then she will find that 1⁄2 is the “right” answer.
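To make the two payout schemes concrete, here is a minimal simulation sketch (my own illustration, not part of the original comment). The stakes in each scheme are chosen so that the bet is fair at the probability claimed for it, and the long-run average profit comes out near zero in both cases:

```python
# Minimal sketch of the two payout schemes described above (illustrative only).
import random

def run(trials=100_000):
    scheme1 = 0.0  # paid (and spent) at every awakening
    scheme2 = 0.0  # a single payout per experiment, settled afterwards
    for _ in range(trials):
        tails = random.random() < 0.5
        if tails:
            scheme1 += 2 * 1.0   # two awakenings, she bets tails and wins $1 at each
            scheme2 += 1.0       # one even-odds $1 bet on tails, won once
        else:
            scheme1 -= 2.0       # one awakening, she loses the $2 stake
            scheme2 -= 1.0       # the single even-odds bet is lost
    print("per-awakening payouts (fair iff P(tails)=2/3):", scheme1 / trials)  # ~0
    print("single payout (fair iff P(tails)=1/2):", scheme2 / trials)          # ~0

run()
```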
In the sleeping beauty problem, whether the 2⁄3 or 1⁄2 is “right” is just a debate about words. The real issue is what kind of many-instance decision algorithm you are running.
EDIT: Another way of putting this would be to simply abandon the concept of probability altogether and use something like UDT. Probability theory doesn’t work in cases where you have multiple instances of your decision algorithm running.
A bet where she can immediately win, be paid, and consume her winnings seems to me far more directly connected to the probability of “what state am I in” than a bet where whether the bet is consummated and paid depends on what else happens in other situations that may exist later. It seems crazy to treat both of those as equally valid bets about what state she is in at the moment.
Re: “But if we specify that the money will be put into an account (and she will only be paid once) that she can spend after the experiment is over, which is next week, then she will find that 1⁄2 is the “right” answer”
That seems like a rather bizarre way to interpret: “What is your credence NOW for the proposition that our coin landed heads?” [emphasis added]
NOW. One bet.
Again, consider the scenario where at each awakening we offer a bet where she’d lose $1.50 if heads and win $1 if tails, and we tell her that we will only accept whichever bet she made on the final interview.
If her credence for heads on an awakening, on every awakening (she can’t distinguish between awakenings), really was 1⁄3, she would agree to accept the bet. But we all know accepting the bet would be irrational. Thus, her credence for heads on an awakening is not 1⁄3.
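To spell out the arithmetic (mine, not the commenter’s): since only the final interview’s bet is settled, each run of the experiment carries exactly one bet, so accepting has an expected value of 0.5·(−$1.50) + 0.5·(+$1.00) = −$0.25 per run, even though at a credence of 1⁄3 for heads each individual offer looks favorable: (2⁄3)·$1.00 − (1⁄3)·$1.50 ≈ +$0.17.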
So: you are debating what:
“What is your credence now for the proposition that our coin landed heads?”
...actually means. Personally, I think your position on that is indefensible.
This would make it clear exactly where the problem lies—if not for the fact that you also appear to be in a complete muddle about how many times Beauty awakens and is interviewed.
We both know what question is being asked. We both know how many times she awakens and is interviewed. I know what subjective probability is (I assume you do too). I showed you my math. I also explained why your ratio of expected frequencies does not correspond to the subjective probability that you think it does.
Does it not concern you even a little that the Wikipedia article you linked to quite clearly says you are wrong and explains why?
I started by reading the wikipedia page. At that point, the 1⁄3 solution made some sense to me, but I was bothered by the fact that you couldn’t derive it from probability laws. I then read articles by Bostrom and Radford. I spent a lot of time working on the problem, etc. Eventually, I figured out precisely why the 1⁄3 solution is wrong.
Is Wikipedia a stronger authority than me here? Probably. But I know where the argument there fails, so it’s not very convincing.
I think we are nearing the end here. Someone just wrote a whole post explaining why the correct answer is 1/3: http://lesswrong.com/lw/28u/conditioning_on_observers/
It’s fascinating to me that you won’t tell me which probability is wrong: P(H)=1/2 or P(Monday|H)=1.
It’s also interesting that you won’t defend your answer (other than saying I’m wrong). You are in a situation where the number of trials depends on outcome, but are using an estimator that is valid for independent trials. Show me that yours converges to a probability. Standard theory doesn’t hold here.
Probabilities are subjective. From Beauty’s POV, if she has just awakened to face an interview, then p(H)=1/3. If she has learned that it is Friday and the experiment is over (but she has not yet been told which side the coin came down), then she updates on that info, and then p(H)=1/2. So, the value of p(H) depends on who is being asked—and on what information they have at the time.
It’s the first one—P(H)=1/2 is wrong. Before going any further, we should adopt Jaynes’ habit of always labelling the prior knowledge in our probabilities, because there are in fact two probabilities that we care about: P(H|the experiment ran), and P(H|Sleeping Beauty has just been woken). These are 1⁄2 and 1⁄3, respectively. The first of these probabilities is given in the problem statement, but the second is what is asked for, and what should be used for calculating expected value in any betting, because any bets made occur twice if the coin was tails.
How can these things be different, P(H|the experiment ran) and P(H|Sleeping Beauty has just been woken)?
Yes, a bet would occur twice if tails, if you set the problem up that way. But the question has to do with her credence at an awakening.
The 1⁄3 calculation is derived from treating the 3 counts as if they arose from independent draws of a multinomial distribution. They are not independent draws. There is 1 degree of freedom, not 2. Thus, the ratio that led to the 1⁄3 value is not the probability that people seem to think it is. It’s not clear that the ratio is a probability at all.
What’s this about a multinomial distribution and degrees of freedom? I calculated P(H|W) as E(occurrences of H&&W)/E(occurrences of W) = (1/2)/(3/2) = 1⁄3.
Yes, exactly. That would be a valid probability if these were expected frequencies from independent draws of a multinomial distribution (it would have 2 degrees of freedom). Your ratio of expected values does not result in P(H|W).
It might become clear if you think about it this way. Your expected number of occurrences of W is greater than the largest possible value of occurrences of H&W. You don’t have a ratio of number of events to number of independent trials.
Picture a 3 by 1 contingency table, where we have counts in 3 cells: Monday&H, Monday&T, Tuesday&T. Typically, a 3 by 1 contingency table will have 2 degrees of freedom (the count in the 3rd cell is determined by the number of trials and the counts in the other cells). Standard statistical theory says you can estimate the probability for cell one by taking the cell one count and dividing by the total. That’s not the situation with the Sleeping Beauty problem. There is just one degree of freedom. If we know the number of coin flips and the count in one of the cells, we know the counts in the other two. Standard statistical theory does not apply. The ratio of the count for cell one to the total is not the probability for cell one.
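To make the counting concrete, here is a small Python sketch (my own, not from the discussion) that tallies the three cells over simulated coin flips. It shows both that the disputed ratio comes out near 1⁄3 and that, given the number of flips, a single cell count determines the other two:

```python
# Illustrative tally of the 3-cell table over n simulated experiments.
import random

n = 100_000
heads = sum(random.random() < 0.5 for _ in range(n))  # number of heads flips

mon_h = heads      # Monday & heads: one such awakening per heads experiment
mon_t = n - heads  # Monday & tails: one per tails experiment
tue_t = n - heads  # Tuesday & tails: one per tails experiment

# One degree of freedom: n plus any single cell determines the other two cells.
assert mon_t == n - mon_h and tue_t == n - mon_h

total_awakenings = mon_h + mon_t + tue_t   # = 2n - heads, not n
print(mon_h / total_awakenings)            # ~1/3: the disputed cell-to-total ratio
print(mon_h / n)                           # ~1/2: fraction of experiments with heads
```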
Occurrences of H&&W are a strict subset of occurrences of W, so, to use the terminology of events and trials, each waking is a trial, and each waking where the coin was heads is a positive result. That’s 1⁄3 of all trials, so a probability of 1⁄3.
If each waking is a trial, then you have a situation where the number of trials is outcome dependent. Your estimator would be valid if the number of trials was not outcome dependent. This is the heart of the matter. The ratio of cell counts here is just not a probability.
The number of trials being outcome dependent only matters if you are using the frequentist definition of probability, or if it causes you to collect fewer trials than you need to overcome noise. We’re computing with probabilities straight from the problem statement, so there’s no noise, and as a Bayesian, I don’t care about the frequentists’ broken definition.
This has nothing to do with Bayesian vs. Frequentist. We’re just calculating probabilities from the problem statement, like you said. From the problem, we know P(H)=1/2, P(Monday|H)=1, etc., which leads to P(H|Monday or Tuesday)=1/2. The 1⁄3 calculation is not from the problem statement, but rather from a misapplication of large sample theory. The outcome-dependent sampling biases your estimator.
And it’s strange that you don’t call your approach Frequentist, when you derived it from expected cell counts in repeated samples.
Don’t forget—around here ‘Bayesian’ is used normatively, and as part of some sort of group identification. “Bayesians” here will often use frequentist approaches in particular problems.
But that can be legitimate, as Bayesian calculations are a superset of frequentist calculations. Nothing bars a Bayesian from postulating that a limiting frequency exists in an unbounded number of trials in some hypothetical situation; but you won’t see one, e.g., accept R.A. Fisher’s argument for his use of p-values for statistical inference.
I adopted some frequentist terminology for purposes of this discussion because none of the other explanations I or others had posted seemed to be getting through, and I thought that might be the problem.
The reason I said that there’s a frequentist vs. Bayesian issue here is because the frequentist probability definition I’m most familiar with is P(f) = lim n->inf sum(f(i), i=1..n)/n, where f(i) is the outcome of the i’th repetition of an independent repeatable experiment, and that definition is hard to reconcile with SB sometimes being asked twice. I assumed that issue, or a rule justified by that issue, was behind your objection.
Not quite. The question of what we mean by probability in this case is valid, but probability shouldn’t be just about bets. Probability is bound to a specific model of the situation, with a sample space, a probability measure, and events. The concept of “probability” doesn’t just mean “the password you use to win bets to your satisfaction”. Of course this depends on your ontological assumptions, but usually we are safe with a “possible worlds” model.
I’d like to hear you and Wei Dai discuss that one further; I was taken with Wei’s insight that probability is for making decisions…
It is for making decisions, specifically for expressing preference under the expected utility axioms and where uniform distribution is suggested by indifference to moral value of a set of outcomes and absence of prior knowledge about the outcomes. Preference is usually expressed about sets of possible worlds, and I don’t see how you can construct a natural sample space out of possible worlds for the answer of 2⁄3.
The sample space would be the three-element set {monday-tails, monday-heads, tuesday-tails} of possible sleeping beauty experience-moments.
Of course that’s the obvious answer, but it also has some problems that don’t seem easily redeemable. The sample space has to reflect the outcome of one’s actions in the world on which preference is defined, which usually means the set of possible worlds. “Experience-moments” are not carved the right way (not mutually exclusive, can’t update on observations, etc.)
Experience moments are “mutually exclusive”, in the sense that every experience moment can be uniquely identified in theory, and any given agent at any given time is only having one specific observer moment. However there is the possibility of subjectively indistinguishable experiences. I don’t understand what you mean by “can’t update”.
By “can’t update” I refer to the problem with marking Thursday “impossible”, since you’ll encounter Thursday later.
It’s not a problem with the model of ontology and preference, it’s merely specifics of what kinds of observation events are expected.
If the goal is to identify an event corresponding to observations in the form of a set of possible worlds, and there are different-looking observations that could correspond to the same event (e.g. observed at different time in the same possible world), their difference is pure logical uncertainty. They differ, but only in the same sense as 2+2 and (7-5)*(9-7) differ, where you need but to compute denotation: the agent running on the described model doesn’t care about the difference, indeed wants to factor it out.
Sorry, I don’t know of this problem. I thought that the days in this example were Monday and Tuesday—what’s going on with Thursday?
I humbly apologize for my inability to read (may the Values of Less Wrong be merciful).
Ah, OK. But I still don’t understand this:
What happens when you’ve observed that “it’s not Tuesday”, and the next day it’s Tuesday? Have you encountered an event of zero probability?
Hmm, my argument is summarized in this phrase:
If you update on your knowledge that “it’s not Tuesday”, it means that you’ve thrown away the parts of your sample space that contain the territory corresponding to Tuesday, marked them impossible, no longer part of what you can think about, what you can expect to observe again (interpret as implied by observations). Assuming the model is honest, that you really do conceptualize the world through that model, your mind is now blind to the possibility of Tuesday. Come Tuesday, you’ll be able to understand your observations in any way but as implying that it’s Tuesday, or that the events you observe are the ones that could possibly occur on Tuesday.
This is not a way to treat your mind. (But then again, I’m probably being too direct in applying the consequences of really believing what is being suggested, as in the case of Pascal’s Wager, for it to reflect the problem statement you consider.)
I don’t see how this is related to the problem of observer-moments—the argument above holds for any event X: “What if you’ve observed ~X, and then you find that X”. What’s the connection?
In a probability space where you have distinct (non-intersecting) “Monday” and “Tuesday”, it is expected (in the informal sense, outside the broken model) that you’ll observe Tuesday after observing Monday, that upon observing Monday you rule out Tuesday, and that upon observing Tuesday you won’t be able to recognize it as such because it’s already ruled out. “Observer-moments” can be located on the same history, and a probability space that distinguishes them will tear down your understanding of the other observer-moments once you’ve observed one of them and excluded the rest. This model promises you a map disconnected from reality.
It is not the case with a probability space based on possible worlds that after concluding ~X, you expect (in the informal sense) to observe X after that. The possible worlds model is in accordance with this (informal) axiom. A sample space based on “observer-moments” is not.
This has nothing to do with semantics. If smart people are saying “2+2=5” and I point out it’s 4, would you say “what matters is why you want to know what 2+2 is”?
The question here is very well defined. There is only one right answer. The fact that even very smart people come up with the wrong answer has all kinds of implications about the types of errors we might make on a regular basis (and that can lead to bad theories, decisions, etc.).
If you mean something else by probability than “at what odds would you be indifferent to accepting a bet on this proposition” then you need to explain what you mean. You are just coming across as confused. You’ve already acknowledged that sleeping beauty would be wrong to turn down a 50:50 bet on tails. What proposition is being bet on when you would be correct to be indifferent at 50:50 odds?
There is a mismatch between the betting question and the original question about probability.
At an awakening, she has no more information about heads or tails than she had originally, but we’re forcing her to bet twice under tails. So, even if her credence for heads was a half, she still wouldn’t make the bet.
Suppose I am going to flip a coin and I tell you you win $1 if heads and lose $2 if tails. You could calculate that the p(H) would have to be 2⁄3 in order for this to be a fair bet (have 0 expectation). That doesn’t imply that the p(H) is actually 2⁄3. It’s a different question. This is a really important point, a point that I think has caused much confusion.
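Spelling out that calculation (my arithmetic, just to make the step explicit): a fair bet requires 0 = p(H)·(+$1) + (1 − p(H))·(−$2), which gives p(H) = 2⁄3.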
Do you think this analysis works for the fact that a well-calibrated Beauty answers “1/3”? Do you think there’s a problem with our methods of judging calibration?
You seem to agree she should take a 50:50 bet on tails. What would be the form of the bet where she should be indifferent to 50:50 odds? If you can answer this question and explain why you think it is a more relevant probability then you may be able to resolve the confusion.
Roko has already given an example of such a bet: where she only gets one payout in the tails case. Is this what you are claiming is the more relevant probability? If so, why is this probability more relevant in your estimation?
Yes, one payout is the relevant case. The reason is that we are asking about her credence at an awakening.
How does the former follow from the latter, exactly? I seem to need that spelled out.
The interviewer asks about her credence ‘right now’ (at an awakening). If we want to set up a betting problem based around that decision, why would it involve placing bets on possibly two different days?
If, at an awakening, Beauty really believes that it’s tails with credence 0.67, then she would gladly take a single bet of win $1 if tails and lose $1.50 if heads. If she wouldn’t take that bet, why should we believe that her credence for heads at an awakening is 1/3?
What do you think the word “credence” means? I am thinking that perhaps that is the cause of your problems.
I’m treating credence for heads as her confidence in heads, as expressed as a number between 0 and 1 (inclusive), given everything she knows at the time. I see it as the same thing as a posterior probability.
I don’t think disagreement is due to different uses of the word credence. It appears to me that we are all talking about the same thing.
I think that the difference between evaluating 2+2 and assigning probabilities (and the reason for the large amount of disagreement) is that 2+2 is a statement in a formal language, whereas what kind of anthropic principle to accept/how to interpret probability is a philosophical one.
Don’t be fooled by the simple Bayes’ theorem calculations—they are not the hard part of this question.
So the difficult question here is which probability space to set up, not how to compute conditional probabilities given that probability space.
(Posted as an antidote to misinterpretation of your comment I committed a moment before.)
A philosophical question, as opposed to a formal one, is a question that hasn’t been properly understood yet. It is a case of ignorance in the mind, not a case of fuzzy territory.
Yes. For example, let’s take a clearer mathematical statement, “3 is prime”. It seems that’s true whatever people say. However, if you come across some mathematicians who are having a discussion that assumes 3 is not prime, then you should think you’re missing some context rather than that they are bad at math.
I chose this example because I once constructed an integer-like system based on half-steps (the successor function adds .5). The system has a notion of primality, and 3 is not prime.
If you want a standard system where 3 is not prime consider Z[omega] where omega^3=1 and omega is not 1. That is, the set of numbers formed by taking all sums, differences, and products of 1 and omega.
What you should say when asked “What is 2+2?” is a separate question from what is 2+2. 2+2 is 4, but you should probably say something else if the situation calls for it. The circumstances that could force you to say something in response to a given question are unrelated to what the answer to that question really is. The truth of the answer to a question is implicit in the question, not in the question-answering situation, unless the question is about the question-answering situation.
I disagree. The correct answer to a question is exactly what you should answer to that question. It’s what “correct” and “should” mean.
“Should” refers to moral value of the outcome, and if someone is holding a gun to a puppy’s head and says “if you say that 2+2=4, the puppy will die!”, you shouldn’t answer “4” to the question, even though it’s correct that the answer is 4. Correctness is a concept separate from shouldness.
If someone asks you, “What do you get if you add 2 and 2”, and you are aware that if you answer “4” he’ll shoot the puppy and if you answer “5” then he’ll let you and the puppy go, then the correct answer is “5”.
You are disputing definitions. You seem to include “should” among the possible meanings of “correct”. When you say, “in this situation, the correct answer is 5”, you refer to the “correctness” of the answer “5”, not to the correctness of 2+2 being 5. Thus, we are talking about an action, not about the truth of 2+2. The action can, for example, be judged according to moral value of its outcome, which is what you seem to mean by “correct” [action].
Thus, in this terminology, “5” is the correct answer, while it’s also correct that the [true] answer is 4. When I say just “the answer is 4”, this is a shorthand for “the true answer is 4”, and doesn’t refer to the actual action, for which it’s true that “the [actual] answer is 5”.
Right, so for some arbitrary formal system, you can derive “4” from “2+2”, and for some other one, you can derive “5” from “2+2”, and in other situations, the correct response to “2+2” is “tacos”.
When you ask “What is 2+2?”, you mean a specific class of formal systems, not an “arbitrary formal system”. The subject matter is fixed by the question, the truth of its answer doesn’t refer to the circumstances of answering it, to situations where you decide what utterance to produce in response.
The truth might be a strategy conditional on the situation in which you answer it, one that could be correctly followed given the specific situation, but that strategy is itself fixed by the question.
For example, I might ask “What should you say when asked the value of 2+2, taking into account the possibility of being threatened with the puppy’s death if you say something other than 5?” The correct answer to that question is a strategy where you say “4” unless the puppy’s life is in danger, in which case you say “5”. Note that the strategy is still fixed by the question, even though your action differs with the situation in which you carry it out; your action correctly brings about the truth of the answer to the question.
Given that Beauty is being asked the question, the probability that heads had come up is 1⁄3. This doesn’t mean the probability of heads itself is 1⁄3. So I think this is a confusion about what the question is asking. Is the question asking what is the probability of heads, or what is the probability of heads given an awakening?
Bayes’ theorem:
x = # of times awakened after heads
y = # of times awakened after tails
p(heads | awakened) = n(heads and awakened) / n(awakened) = x / (x + y)
Yields 1⁄3 when x=1 and y=2.
Where is the probability of heads? Actually we already assumed in the calculation above that p(heads) = 0.5. For a general biased coin, the calculation is slightly more complex:
p(H) =probability of heads
p(T) = probability of tails
x = # of times awakened after heads
y = # of times awakened after tails
p(heads | awakened) = n(heads and awakened) / n(awakened) = p(H)x / (p(H)x + p(T)y)
Yields 1⁄3 when x=1 and y=2 and p(H)=p(T)=0.5.
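For anyone who wants to check the algebra by brute force, here is a small Python sketch (my own illustration; the bias p_H = 0.3 is just an example value). It counts the fraction of awakenings that follow heads, which is the quantity the formula above describes; whether that fraction is the right reading of Beauty’s credence is exactly what is disputed in this thread:

```python
# Brute-force count of awakenings for a biased coin (illustrative only).
import random

p_H = 0.3    # example bias, chosen arbitrarily for illustration
x, y = 1, 2  # awakenings after heads, awakenings after tails

trials = 200_000
heads_awakenings = 0
total_awakenings = 0
for _ in range(trials):
    if random.random() < p_H:
        heads_awakenings += x
        total_awakenings += x
    else:
        total_awakenings += y

print(heads_awakenings / total_awakenings)      # empirical fraction of awakenings after heads
print(p_H * x / (p_H * x + (1 - p_H) * y))      # formula above: 0.3/1.7 ≈ 0.176
```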
I’m leaving this comment because I think the equations help explain how the probability-of-heads and the probability-of-heads-given-awakening are inter-related but, obviously—I know you know this already—not the same thing.
To clarify, since the probability-of-heads and the probability-of-heads-given-single-awakening-event are different things, it is indeed a matter of semantics: if Beauty is asked about the probability of heads per event … what is the event? Is the event the flip of the coin (p=1/2) or an awakening (p=1/3)? In the post narrative, this remains unclear.
Which event is meant would become clear if it was a wager (and, generally, if anything whatsoever rested on the question). For example: if she is paid per coin flip for being correct (event=coin flip) then she should bet heads to be correct 1 out of 2 times; if she is paid per awakening for being correct (event=awakening) then she should bet tails to be correct 2 out of 3 times.
Actually .. arguing with myself now .. Beauty wasn’t asked about a probability, she was asked if she thought heads had been flipped, in the past. So this is clear after all—did she think heads was flipped, or not?
Viewing it this way, I see the isomorphism with the class of anthropic arguments that ask if you can deduce something about the longevity of humans given that you are an early human. (Being a human in a certain century is like awakening on a certain day.) I suppose then my solution should be the same. Waking up is not evidence either way that heads or tails was flipped. Since her subjective experience is the same however the coin is flipped (she wakes up) she cannot update upon awakening that it is more likely that tails was flipped. Not even if flipping tails means she wakes up 10 billion times more than if heads was flipped.
However, I will think longer about whether there are any significant differences between the two problems. Thoughts?
Why was this comment down-voted so low? (I rarely ask, but this time I can’t guess.) Is it too basic math? If people are going to argue whether 1⁄3 or 1⁄2, I think it is useful to know they’re debating two different probabilities: the probability of heads or the probability of heads given an awakening.
This is incorrect.
Given that Beauty is being asked the question, the probability that heads had come up is 1⁄2.
This is Bayes’ theorem:
P(H) = 1/2
P(awakened|H) = P(awakened|T) = 1
P(H|awakened) = P(awakened|H)P(H) / (P(awakened|H)P(H) + P(awakened|T)P(T))
which equals 1⁄2
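Spelling out the substitution with the values above: P(H|awakened) = 1·(1/2) / (1·(1/2) + 1·(1/2)) = 1⁄2.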
By “awakened” here you mean “awakened at all”. I think you’ve shown already that the probability that heads was flipped given that she was awakened at all is 1⁄2, since in both cases she’s awakened at all and the probability of heads is 1⁄2. I think your dispute is with people who don’t think “I was awakened at all” is all that Beauty knows when she wakes up.
Beauty also knows how many times she is likely to have been woken up when the coin lands heads—and the same for tails. She knew that from the start of the experiment.
OK, I see now why you are emphasizing being awoken at all. That is the relevant event, because that is exactly what she experiences and all that she has to base her decision upon.
(But keep in mind that people are just busy answering different questions, they’re not necessarily incorrect for answering a different question.)