Beauty quips, “I’d shut up and multiply!”
When it comes to probability, you should trust probability laws over your intuition. Many people got the Monty Hall problem wrong because their intuition was bad. You can get the solution to that problem using probability laws that you learned in Stats 101 -- it’s not a hard problem. Similarly, there has been a lot of debate about the Sleeping Beauty problem. Again, though, that’s because people are starting with their intuition instead of letting probability laws lead them to understanding.
The Sleeping Beauty Problem
On Sunday she is given a drug that sends her to sleep. A fair coin is then tossed just once in the course of the experiment to determine which experimental procedure is undertaken. If the coin comes up heads, Beauty is awakened and interviewed on Monday, and then the experiment ends. If the coin comes up tails, she is awakened and interviewed on Monday, given a second dose of the sleeping drug, and awakened and interviewed again on Tuesday. The experiment then ends on Tuesday, without flipping the coin again. The sleeping drug induces a mild amnesia, so that she cannot remember any previous awakenings during the course of the experiment (if any). During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.
Each interview consists of one question, “What is your credence now for the proposition that our coin landed heads?”
Two popular solutions have been proposed: 1⁄3 and 1⁄2.
The 1⁄3 solution
From Wikipedia:
Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1⁄3.
Yes, it’s true that only in a third of cases would heads precede her awakening.
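The counting in the quoted argument can be reproduced with a small simulation. This is only a sketch: the function name `count_awakenings`, the number of runs, and the fixed seed are illustrative choices, not part of the original argument. It tallies every awakening across repeated runs of the experiment.

```python
import random

def count_awakenings(n_experiments, seed=0):
    """Tally awakenings by (coin, day) over repeated runs of the experiment."""
    rng = random.Random(seed)
    counts = {("heads", "Mon"): 0, ("tails", "Mon"): 0, ("tails", "Tue"): 0}
    for _ in range(n_experiments):
        if rng.random() < 0.5:          # fair coin lands heads: one awakening
            counts[("heads", "Mon")] += 1
        else:                           # tails: two awakenings
            counts[("tails", "Mon")] += 1
            counts[("tails", "Tue")] += 1
    return counts

counts = count_awakenings(100_000)
total = sum(counts.values())
heads_share = counts[("heads", "Mon")] / total
print(counts, round(heads_share, 3))
```

The share of awakenings that follow heads comes out near 1⁄3. That frequency is not in dispute; the question is whether it is Beauty's credence at an awakening.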
Radford Neal (a statistician!) argues that 1⁄3 is the correct solution.
This [the 1⁄3] view can be reinforced by supposing that on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads. (We suppose that Beauty knows such a bet will always be offered.) Beauty would not accept this bet if she assigns probability 1⁄2 to Heads. If she assigns a probability of 1⁄3 to Heads, however, her expected gain is 2 × (2/3) − 3 × (1/3) = 1⁄3, so she will accept, and if the experiment is repeated many times, she will come out ahead.
Neal is correct (about the gambling problem).
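The arithmetic in Neal's bet can be checked directly. The sketch below uses hypothetical helper names (`gain_per_awakening`, `gain_per_experiment`) and exact fractions. It computes the expected gain of a single bet for a given credence in heads, and the expected gain per run of the experiment when the bet is accepted at every awakening.

```python
from fractions import Fraction

def gain_per_awakening(p_heads, win_tails=2, lose_heads=3):
    """Expected gain of one bet for an agent whose credence in heads is p_heads."""
    return win_tails * (1 - p_heads) - lose_heads * p_heads

def gain_per_experiment(win_tails=2, lose_heads=3):
    """Expected gain per run if the bet is accepted at every awakening."""
    half = Fraction(1, 2)
    # Heads: one losing bet. Tails: two winning bets (Monday and Tuesday).
    return half * (-lose_heads) + half * (2 * win_tails)

third_view = gain_per_awakening(Fraction(1, 3))
half_view = gain_per_awakening(Fraction(1, 2))
per_run = gain_per_experiment()
print(third_view, half_view, per_run)
```

The per-run gain is positive because the tails payout is collected twice, whatever credence Beauty reports.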
These two arguments for the 1⁄3 solution appeal to intuition and make no obvious mathematical errors. So why are they wrong?
Let’s first start with probability laws and show why the 1⁄2 solution is correct. Just like with the Monty Hall problem, once you understand the solution, the wrong answer will no longer appeal to your intuition.
The 1⁄2 solution
P(Beauty woken up at least once| heads)=P(Beauty woken up at least once | tails)=1. Because of the amnesia, all Beauty knows when she is woken up is that she has woken up at least once. That event had the same probability of occurring under either coin outcome. Thus, P(heads | Beauty woken up at least once)=1/2. You can use Bayes’ rule to see this if it’s unclear.
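Written out, the Bayes' rule step looks as follows. The symbols are shorthand introduced here, not notation from the original post: H for heads, T for tails, and W for "Beauty is woken up at least once".

```latex
\[
P(H \mid W)
  = \frac{P(W \mid H)\,P(H)}{P(W \mid H)\,P(H) + P(W \mid T)\,P(T)}
  = \frac{1 \cdot \tfrac{1}{2}}{1 \cdot \tfrac{1}{2} + 1 \cdot \tfrac{1}{2}}
  = \tfrac{1}{2}
\]
```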
Here’s another way to look at it:
If it landed heads then Beauty is woken up on Monday with probability 1.
If it landed tails then Beauty is woken up on Monday and Tuesday. From her perspective, these days are indistinguishable. She doesn’t know if she was woken up the day before, and she doesn’t know if she’ll be woken up the next day. Thus, we can view Monday and Tuesday as exchangeable here.
A probability tree can help with the intuition (this is a probability tree corresponding to an arbitrary wake up day):
If Beauty was told the coin came up heads, then she’d know it was Monday. If she was told the coin came up tails, then she’d think there is a 50% chance it’s Monday and a 50% chance it’s Tuesday. Of course, when Beauty is woken up she is not told the result of the flip, but she can calculate the probability of each.
When she is woken up, she’s somewhere on the second set of branches. We have the following joint probabilities: P(heads, Monday)=1/2; P(heads, not Monday)=0; P(tails, Monday)=1/4; P(tails, Tuesday)=1/4; P(tails, not Monday or Tuesday)=0. Thus, P(heads)=1/2.
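The joint probabilities above can be rebuilt from the two levels of the tree. This is a minimal sketch under the post's own assumption that, given tails, the awakening is equally likely to be Monday or Tuesday; the dictionary names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)
# First branch: the coin. Second branch: which day this awakening is, given the coin.
p_coin = {"heads": half, "tails": half}
p_day_given_coin = {
    "heads": {"Monday": Fraction(1), "Tuesday": Fraction(0)},
    "tails": {"Monday": half, "Tuesday": half},
}

# Multiply along each path of the tree to get the joint probabilities.
joint = {
    (coin, day): p_coin[coin] * p_day
    for coin, days in p_day_given_coin.items()
    for day, p_day in days.items()
}
p_heads = sum(p for (coin, _), p in joint.items() if coin == "heads")
p_monday = sum(p for (_, day), p in joint.items() if day == "Monday")
print(joint, p_heads, p_monday)
```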
Where the 1⁄3 arguments fail
The 1⁄3 argument says with heads there is 1 interview, with tails there are 2 interviews, and therefore the probability of heads is 1⁄3. However, the argument would only hold if all 3 interview days were equally likely. That’s not the case here: on a wake-up day, heads & Monday is more likely than tails & Monday, for example.
Neal’s argument fails because he changed the problem. “on each awakening Beauty is offered a bet in which she wins 2 dollars if the coin lands Tails and loses 3 dollars if it lands Heads.” In this scenario, she would make the bet twice if tails came up and once if heads came up. That has nothing to do with probability about the event at a particular awakening. The fact that she should take the bet doesn’t imply that heads is less likely. Beauty just knows that she’ll win the bet twice if tails landed. We double count for tails.
Imagine I said “if you guess heads and you’re wrong nothing will happen, but if you guess tails and you’re wrong I’ll punch you in the stomach.” In that case, you will probably guess heads. That doesn’t mean your credence for heads is 1 -- it just means I added a greater penalty to the other option.
Consider changing the problem to something more extreme. Here, we start with heads having probability 0.99 and tails having probability 0.01. If heads comes up, we wake Beauty up once. If tails, we wake her up 100 times. Thirder logic would go like this: if we repeated the experiment 1000 times, we’d expect her to be woken 990 times after heads on Monday, 10 times after tails on Monday (day 1), 10 times after tails on Tuesday (day 2), ..., and 10 times after tails on day 100. In other words, in ~50% of the cases heads would precede her awakening. So the right answer for her to give is 1⁄2.
Of course, this would be absurd reasoning. Beauty knows heads has a 99% chance initially. But when she wakes up (which she was guaranteed to do regardless of whether heads or tails came up), she suddenly thinks they’re equally likely? What if we made it even more extreme and woke her up even more times on tails?
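The counting argument can be written as a function of the coin bias and the number of awakenings, which makes it easy to push the example further. The function name and the third, more extreme case (10,000 awakenings on tails) are illustrative assumptions.

```python
from fractions import Fraction

def awakening_share_heads(p_heads, wakes_if_heads, wakes_if_tails):
    """Expected share of awakenings that follow heads (the counting argument)."""
    heads = p_heads * wakes_if_heads
    tails = (1 - p_heads) * wakes_if_tails
    return heads / (heads + tails)

original = awakening_share_heads(Fraction(1, 2), 1, 2)
extreme = awakening_share_heads(Fraction(99, 100), 1, 100)
more_extreme = awakening_share_heads(Fraction(99, 100), 1, 10_000)
print(original, extreme, float(more_extreme))
```

With a 0.99 coin and 100 tails awakenings the share is 99⁄199, just under one half. It keeps shrinking as more tails awakenings are added.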
Implausible consequence of 1⁄2 solution?
Nick Bostrom presents the Extreme Sleeping Beauty problem:
This is like the original problem, except that here, if the coin falls tails, Beauty will be awakened on a million subsequent days. As before, she will be given an amnesia drug each time she is put to sleep that makes her forget any previous awakenings. When she awakes on Monday, what should be her credence in HEADS?
He argues:
The adherent of the 1⁄2 view will maintain that Beauty, upon awakening, should retain her credence of 1⁄2 in HEADS, but also that, upon being informed that it is Monday, she should become extremely confident in HEADS:
P+(HEADS) = 1,000,001⁄1,000,002
This consequence is itself quite implausible. It is, after all, rather gutsy to have credence 0.999999% in the proposition that an unobserved fair coin will fall heads.
It’s correct that, upon awakening on Monday (and not knowing it’s Monday), she should retain her credence of 1⁄2 in heads.
However, if she is informed it’s Monday, it’s unclear what she should conclude. Why was she informed it was Monday? Consider two alternatives.
Disclosure process 1: regardless of the result of the coin toss she will be informed it’s Monday on Monday with probability 1
Under disclosure process 1, her credence of heads on Monday is still 1⁄2.
Disclosure process 2: if heads she’ll be woken up and informed that it’s Monday. If tails, she’ll be woken up on Monday and one million subsequent days, and only be told the specific day on one randomly selected day.
Under disclosure process 2, if she’s informed it’s Monday, her credence of heads is 1,000,001⁄1,000,002. However, this is not implausible at all. It’s correct. This statement is misleading: “It is, after all, rather gutsy to have credence 0.999999% in the proposition that an unobserved fair coin will fall heads.” Beauty isn’t predicting what will happen on the flip of a coin, she’s predicting what did happen after receiving strong evidence that it’s heads.
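Both disclosure processes can be run through the same Bayes' rule calculation. The sketch below models each process only by the probability that the "it is Monday" announcement happens under each coin outcome, which is how the two cases are framed above. The function name and this way of encoding the processes are assumptions.

```python
from fractions import Fraction

def posterior_heads(p_told_if_heads, p_told_if_tails, prior_heads=Fraction(1, 2)):
    """Bayes' rule for heads, given that Beauty is told 'it is Monday'."""
    numerator = p_told_if_heads * prior_heads
    return numerator / (numerator + p_told_if_tails * (1 - prior_heads))

days_if_tails = 1_000_001  # Monday plus one million subsequent days

# Process 1: the Monday announcement is certain under either outcome.
process_1 = posterior_heads(Fraction(1), Fraction(1))
# Process 2: under tails, Monday is the announced day with chance 1/days_if_tails.
process_2 = posterior_heads(Fraction(1), Fraction(1, days_if_tails))
print(process_1, process_2)
```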
ETA (5/9/2010 5:38AM)
If we want to replicate the situation 1000 times, we shouldn’t end up with 1500 observations. The correct way to replicate the awakening decision is to use the probability tree I included above. You’d end up with expected cell counts of 500, 250, 250, instead of 500, 500, 500.
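One way to read this replication scheme is to draw a single awakening per experiment by walking the tree. The sketch below does that with a seeded generator. The function name and sample size are illustrative, and the even Monday/Tuesday split under tails is the assumption taken from the tree.

```python
import random
from collections import Counter

def sample_one_awakening(rng):
    """Draw a single (coin, day) pair by walking the two-level tree."""
    if rng.random() < 0.5:
        return ("heads", "Mon")
    # Tails: this awakening is Monday or Tuesday with equal probability.
    return ("tails", "Mon" if rng.random() < 0.5 else "Tue")

rng = random.Random(0)
n = 100_000
cells = Counter(sample_one_awakening(rng) for _ in range(n))
shares = {cell: count / n for cell, count in cells.items()}
print(shares)
```

Here the total number of observations equals the number of experiments, and the shares land near 1⁄2, 1⁄4 and 1⁄4.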
Suppose at each awakening, we offer Beauty the following wager: she’d lose $1.50 if heads but win $1 if tails. She is asked for a decision on that wager at every awakening, but we only accept her last decision. Thus, if tails we’ll accept her Tuesday decision (but won’t tell her it’s Tuesday). If her credence of heads is 1⁄3 at each awakening, then she should take the bet. If her credence of heads is 1⁄2 at each awakening, she shouldn’t take the bet. If we repeat the experiment many times, she’d be expected to lose money if she accepts the bet every time.
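The expected value of this wager depends on how many times it is settled under tails. A minimal sketch, with illustrative names, comparing "only the last decision counts" against "every awakening is settled":

```python
from fractions import Fraction

HALF = Fraction(1, 2)
LOSS_IF_HEADS = Fraction(3, 2)  # $1.50
WIN_IF_TAILS = Fraction(1)      # $1.00

def expected_gain(settlements_if_tails):
    """Expected gain per experiment when Beauty always accepts the wager."""
    return HALF * (-LOSS_IF_HEADS) + HALF * settlements_if_tails * WIN_IF_TAILS

last_decision_only = expected_gain(settlements_if_tails=1)
every_awakening = expected_gain(settlements_if_tails=2)
print(last_decision_only, every_awakening)
```

Accepting loses 25 cents per experiment on average when only the last decision is honored. It gains 25 cents when both tails awakenings are paid out.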
The problem with the logic that leads to the 1⁄3 solution is it counts twice under tails, but the question was about her credence at an awakening (interview).
ETA (5/10/2010 10:18PM ET)
Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1⁄3.
Another way to look at it: the denominator is not a sum of mutually exclusive events. Typically we use counts to estimate probabilities as follows: the numerator is the number of times the event of interest occurred, and the denominator is the number of times that event could have occurred.
For example, suppose Y can take values 1, 2 or 3 and follows a multinomial distribution with probabilities p1, p2 and p3=1-p1-p2, respectively. If we generate n values of Y, we could estimate p1 by taking the ratio of #{Y=1}/(#{Y=1}+#{Y=2}+#{Y=3}). As n goes to infinity, the ratio will converge to p1. Notice the events in the denominator are mutually exclusive and exhaustive. The denominator is determined by n.
The thirder solution to the Sleeping Beauty problem has as its denominator sums of events that are not mutually exclusive. The denominator is not determined by n. For example, if we repeat it 1000 times, and we get 400 heads, our denominator would be 400+600+600=1600 (even though it was not possible to get 1600 heads!). If we instead got 550 heads, our denominator would be 550+450+450=1450. Our denominator is outcome dependent, where here the outcome is the occurrence of heads. What does this ratio converge to as n goes to infinity? I surely don’t know. But I do know it’s not the posterior probability of heads.
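The dependence of the denominator on the outcome is easy to tabulate. In this sketch the helper name `thirder_ratio` is hypothetical, and the 500-heads row is added for comparison with the two cases above.

```python
def thirder_ratio(n_heads, n_experiments):
    """Heads awakenings divided by all awakenings; returns (ratio, denominator)."""
    n_tails = n_experiments - n_heads
    denominator = n_heads + 2 * n_tails  # one awakening per heads, two per tails
    return n_heads / denominator, denominator

for n_heads in (400, 500, 550):
    ratio, denominator = thirder_ratio(n_heads, 1000)
    print(n_heads, denominator, round(ratio, 3))
```

The denominator moves from 1600 to 1450 as the number of heads changes, unlike the fixed n in the multinomial example.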
As Wei Dai has said, arguing about which probability is “right” is futile until you have fixed your decision theory and the goals that will actually make use of those probabilities to act. In most use-cases of probability theory, such issues don’t come up.
In sleeping beauty, you are in a situation where such considerations do matter.
If we further specify that sleeping beauty can make a bet and (if she wins) will get the money straight away on Monday, and be allowed to spend it straight away on eating a chocolate bar, and will then be put to sleep again (if the coin came up tails), woken up on Tuesday and be given the same money again and allowed to eat another chocolate bar, then she will do best by saying that the probability of tails is 2⁄3.
But if we specify that the money will be put into an account (and she will only be paid one winning) that she can spend after the experiment is over, which is next week, then she will find that 1⁄2 is the “right” answer.
In the sleeping beauty problem, whether the 2⁄3 or 1⁄2 is “right” is just a debate about words. The real issue is what kind of many-instance decision algorithm you are running.
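The two payout schemes can be compared by asking at which stated probability of tails a bet on tails breaks even per experiment. This is a sketch under stated assumptions: a bet priced at probability q wins (1 - q) on tails and loses q on heads, and the function name is hypothetical.

```python
from fractions import Fraction

def break_even_p_tails(settlements_if_tails, settlements_if_heads=1):
    """P(tails) at which a bet on tails has zero expected gain per experiment.

    A bet priced at probability q wins (1 - q) on tails and loses q on heads.
    With a fair coin, 0.5*k_t*(1 - q) - 0.5*k_h*q = 0 gives q = k_t / (k_t + k_h).
    """
    k_t, k_h = settlements_if_tails, settlements_if_heads
    return Fraction(k_t, k_t + k_h)

paid_each_awakening = break_even_p_tails(2)  # chocolate bar on Monday and Tuesday
paid_once = break_even_p_tails(1)            # single payment into an account
print(paid_each_awakening, paid_once)
```

The break-even price is 2⁄3 when each awakening pays out and 1⁄2 when only one payment is made, matching the two answers above.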
EDIT: Another way of putting this would be to simply abandon the concept of probability altogether and use something like UDT. Probability theory doesn’t work in cases where you have multiple instances of your decision algorithm running.
A bet where she can immediately win, be paid, and consume her winnings seems to me far more directly connected to the probability of “what state am I in” than a bet where whether the bet is consummated and the bet paid depends on what else happens in other situations that may exist later. It seems crazy to treat both of those as equally valid bets about what state she is in at the moment.
Re: “But if we specify that the money will be put into an account (and she will only be paid one winning) that she can spend after the experiment is over, which is next week, then she will find that 1⁄2 is the “right” answer”
That seems like a rather bizarre way to interpret: “What is your credence NOW for the proposition that our coin landed heads?” [emphasis added]
NOW. One bet.
Again, consider the scenario where at each awakening we offer a bet where she’d lose $1.50 if heads and win $1 if tails, and we tell her that we will only accept whichever bet she made on the final interview.
If her credence for heads on an awakening, on every awakening (she can’t distinguish between awakenings), really was 1⁄3, she would agree to accept the bet. But we all know accepting the bet would be irrational. Thus, her credence for heads on an awakening is not 1⁄3.
So: you are debating what:
“What is your credence now for the proposition that our coin landed heads?”
...actually means. Personally, I think your position on that is indefensible.
This would make it clear exactly where the problem lies—if not for the fact that you also appear to be in a complete muddle about how many times Beauty awakens and is interviewed.
We both know what question is being asked. We both know how many times she awakens and is interviewed. I know what subjective probability is (I assume you do too). I showed you my math. I also explained why your ratio of expected frequencies does not correspond to the subjective probability that you think it does.
Does it not concern you even a little that the Wikipedia article you linked to quite clearly says you are wrong and explains why?
I started by reading the Wikipedia page. At that point, the 1⁄3 solution made some sense to me, but I was bothered by the fact that you couldn’t derive it from probability laws. I then read articles by Bostrom and Radford Neal. I spent a lot of time working on the problem, etc. Eventually, I figured out precisely why the 1⁄3 solution is wrong.
Is Wikipedia a stronger authority than me here? Probably. But I know where the argument there fails, so it’s not very convincing.
I think we are nearing the end here. Someone just wrote a whole post explaining why the correct answer is 1/3: http://lesswrong.com/lw/28u/conditioning_on_observers/
It’s fascinating to me that you won’t tell me which probability is wrong, p(H)=1/2, P(monday|H)=1
It’s also interesting that you won’t defend your answer (other than saying I’m wrong). You are in a situation where the number of trials depends on outcome, but are using an estimator that is valid for independent trials. Show me that yours converges to a probability. Standard theory doesn’t hold here.
Probabilities are subjective. From Beauty’s POV, if she has just awakened to face an interview, then p(H)=1/3. If she has learned that it is Friday and the experiment is over (but she has not yet been told which side the coin came down), then she updates on that info, and then p(H)=1/2. So, the value of p(H) depends on who is being asked, and on what information they have at the time.
It’s the first one—P(H)=1/2 is wrong. Before going any further, we should adopt Jaynes’ habit of always labelling the prior knowledge in our probabilities, because there are in fact two probabilities that we care about: P(H|the experiment ran), and P(H|Sleeping Beauty has just been woken). These are 1⁄2 and 1⁄3, respectively. The first of these probabilities is given in the problem statement, but the second is what is asked for, and what should be used for calculating expected value in any betting, because any bets made occur twice if the coin was tails.
How can these things be different, P(H|the experiment ran) and P(H|Sleeping Beauty has just been woken)?
Yes, a bet would occur twice if tails, if you set the problem up that way. But the question has to do with her credence at an awakening.
The 1⁄3 calculation is derived from treating the 3 counts as if they arose from independent draws of a multinomial distribution. They are not independent draws. There is 1 degree of freedom, not 2. Thus, the ratio that led to the 1⁄3 value is not the probability that people seem to think it is. It’s not clear that the ratio is a probability at all.
What’s this about a multinomial distribution and degrees of freedom? I calculated P(H|W) as E(occurrences of H&&W)/E(occurrences of W) = (1/2)/(3/2) = 1⁄3.
Yes, exactly. That would be a valid probability if these were expected frequencies from independent draws of a multinomial distribution (it would have 2 degrees of freedom). Your ratio of expected values does not result in P(H|W).
It might become clear if you think about it this way. Your expected number of occurrences of W is greater than the largest possible value of occurrences of H&W. You don’t have a ratio of number of events to number of independent trials.
Picture a 3 by 1 contingency table, where we have counts in 3 cells: Monday&H, Monday&T, Tuesday&T. Typically, a 3 by 1 contingency table will have 2 degrees of freedom (the count in the 3rd cell is determined by the number of trials and the counts in the other cells). Standard statistical theory says you can estimate the probability for cell one by taking the cell one count and dividing by the total. That’s not the situation with the sleeping beauty problem. There is just one degree of freedom. If we know the number of coin flips and the count in one of the cells, we know the counts in the other two. Standard statistical theory does not apply. The ratio of count for cell one to the total is not the probability for cell one.
Occurrences of H&&W are a strict subset of occurrences of W, so to use the terminology of events and trials, each waking is a trial, and each waking where the coin was heads is a positive result. That’s 1⁄3 of all trials, so a probability of 1⁄3.
If each waking is a trial, then you have a situation where the number of trials is outcome dependent. Your estimator would be valid if the number of trials was not outcome dependent. This is the heart of the matter. The ratio of cell counts here is just not a probability.
The number of trials being outcome dependent only matters if you are using the frequentist definition of probability, or if it causes you to collect fewer trials than you need to overcome noise. We’re computing with probabilities straight from the problem statement, so there’s no noise, and as a Bayesian, I don’t care about the frequentists’ broken definition.
This has nothing to do with Bayesian vs. Frequentist. We’re just calculating probabilities from the problem statement, like you said. From the problem, we know P(H)=1/2, P(Monday|H)=1, etc, which leads to P(H|Monday or Tuesday)=1/2. The 1⁄3 calculation is not from the problem statement, but rather from a misapplication of large sample theory. The outcome-dependent sampling biases your estimator.
And it’s strange that you don’t call your approach Frequentist, when you derived it from expected cell counts in repeated samples.
Don’t forget—around here ‘Bayesian’ is used normatively, and as part of some sort of group identification. “Bayesians” here will often use frequentist approaches in particular problems.
But that can be legitimate, as Bayesian calculations are a superset of frequentist calculations. Nothing bars a Bayesian from postulating that a limiting frequency exists in an unbounded number of trials in some hypothetical situation; but you won’t see one, e.g., accept R.A. Fisher’s argument for his use of p-values for statistical inference.
I adopted some frequentist terminology for purposes of this discussion because none of the other explanations I or others had posted seemed to be getting through, and I thought that might be the problem.
The reason I said that there’s a frequentist vs. Bayesian issue here is because the frequentist probability definition I’m most familiar with is P(f) = lim n->inf sum(f(i), i=1..n)/n, where f(x) is the x’th repetition of an independent repeatable experiment, and that definition is hard to reconcile with SB sometimes being asked twice. I assumed that issue, or a rule justified by that issue, was behind your objection.
Not quite. The question of what do we mean by probability in this case is valid, but probability shouldn’t be just about bets. Probability is bound to a specific model of the situation, with sample space, probability measure, and events. The concept of “probability” doesn’t just mean “the password you use to win bets to your satisfaction”. Of course this depends on your ontological assumptions, but usually we are safe with a “possible worlds” model.
I’d like to hear you and Wei Dai discuss that one further; I was taken with Wei’s insight that probability is for making decisions.…
It is for making decisions, specifically for expressing preference under the expected utility axioms and where uniform distribution is suggested by indifference to moral value of a set of outcomes and absence of prior knowledge about the outcomes. Preference is usually expressed about sets of possible worlds, and I don’t see how you can construct a natural sample space out of possible worlds for the answer of 2⁄3.
The sample space would be the three-element set {monday-tails, monday-heads, tuesday-tails} of possible sleeping beauty experience-moments.
Of course that’s the obvious answer, but it also has some problems that don’t seem easily redeemable. The sample space has to reflect the outcome of one’s actions in the world on which preference is defined, which usually means the set of possible worlds. “Experience-moments” are not carved the right way (not mutually exclusive, can’t update on observations, etc.)
Experience moments are “mutually exclusive”, in the sense that every experience moment can be uniquely identified in theory, and any given agent at any given time is only having one specific observer moment. However there is the possibility of subjectively indistinguishable experiences. I don’t understand what you mean by “can’t update”.
By “can’t update” I refer to the problem with marking Thursday “impossible”, since you’ll encounter Thursday later.
It’s not a problem with the model of ontology and preference, it’s merely specifics of what kinds of observation events are expected.
If the goal is to identify an event corresponding to observations in the form of a set of possible worlds, and there are different-looking observations that could correspond to the same event (e.g. observed at different time in the same possible world), their difference is pure logical uncertainty. They differ, but only in the same sense as 2+2 and (7-5)*(9-7) differ, where you need but to compute denotation: the agent running on the described model doesn’t care about the difference, indeed wants to factor it out.
Sorry, I don’t know of this problem. I thought that the days in this example were Monday and Tuesday—what’s going on with Thursday?
I humbly apologize for my inability to read (may the Values of Less Wrong be merciful).
Ah, OK. But I still don’t understand this:
Hmm, my argument is summarized in this phrase:
If you update on your knowledge that “it’s not Tuesday”, it means that you’ve thrown away the parts of your sample space that contain the territory corresponding to Tuesday, marked them impossible, no longer part of what you can think about, what you can expect to observe again (interpret as implied by observations). Assuming the model is honest, that you really do conceptualize the world through that model, your mind is now blind to the possibility of Tuesday. Come Tuesday, you’ll be able to understand your observations in any way but as implying that it’s Tuesday, or that the events you observe are the ones that could possibly occur on Tuesday.
This is not a way to treat your mind. (But then again, I’m probably being too direct in applying the consequences of really believing what is being suggested, as in the case of Pascal’s Wager, for it to reflect the problem statement you consider.)
I don’t see how this is related to the problem of observer-moments—the argument above holds for any event X: “What if you’ve observed ~X, and then you find that X”. What’s the connection?
In a probability space where you have distinct (non-intersecting) “Monday” and “Tuesday”, it is expected (in the informal sense, outside the broken model) that you’ll observe Tuesday after observing Monday, that upon observing Monday you rule out Tuesday, and that upon observing Tuesday you won’t be able to recognize it as such because it’s already ruled out. “Observer-moments” can be located on the same history, and a probability space that distinguishes them will tear down your understanding of the other observer-moments once you’ve observed one of them and excluded the rest. This model promises you a map disconnected from reality.
It is not the case with a probability space based on possible worlds that after concluding ~X, you expect (in the informal sense) to observe X after that. Possible worlds model is in accordance with this (informal) axiom. Sample space based on “observer-moments” is not.
This has nothing to do with semantics. If smart people are saying “2+2=5” and I point out it’s 4, would you say “what matters is why you want to know what 2+2 is”?
The question here is very well defined. There is only one right answer. The fact that even very smart people come up with the wrong answer has all kinds of implications about the type of errors we might make on a regular basis (and lead to bad theories, decisions, etc).
If you mean something else by probability than “at what odds would you be indifferent to accepting a bet on this proposition” then you need to explain what you mean. You are just coming across as confused. You’ve already acknowledged that sleeping beauty would be wrong to turn down a 50:50 bet on tails. What proposition is being bet on when you would be correct to be indifferent at 50:50 odds?
There is a mismatch between the betting question and the original question about probability.
At an awakening, she has no more information about heads or tails than she had originally, but we’re forcing her to bet twice under tails. So, even if her credence for heads was a half, she still wouldn’t make the bet.
Suppose I am going to flip a coin and I tell you you win $1 if heads and lose $2 if tails. You could calculate that the p(H) would have to be 2⁄3 in order for this to be a fair bet (have 0 expectation). That doesn’t imply that the p(H) is actually 2⁄3. It’s a different question. This is a really important point, a point that I think has caused much confusion.
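For reference, the break-even calculation behind the 2⁄3 figure, writing p for the hypothetical probability of heads that would make the bet fair:

```latex
\[
p \cdot (+1) + (1 - p) \cdot (-2) = 0
\;\Longrightarrow\; 3p = 2
\;\Longrightarrow\; p = \tfrac{2}{3}
\]
```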
Do you think this analysis works for the fact that a well-calibrated Beauty answers “1/3”? Do you think there’s a problem with our methods of judging calibration?
You seem to agree she should take a 50:50 bet on tails. What would be the form of the bet where she should be indifferent to 50:50 odds? If you can answer this question and explain why you think it is a more relevant probability then you may be able to resolve the confusion.
Roko has already given an example of such a bet: where she only gets one pay out in the tails case. Is this what you are claiming is the more relevant probability? If so, why is this probability more relevant in your estimation?
Yes, one pay out is the relevant case. The reason is because we are asking about her credence at an awakening.
How does the former follow from the latter, exactly? I seem to need that spelled out.
The interviewer asks about her credence ‘right now’ (at an awakening). If we want to set up a betting problem based around that decision, why would it involve placing bets on possibly two different days?
If, at an awakening, Beauty really believes that it’s tails with credence 0.67, then she would gladly take a single bet of win $1 if tails and lose $1.50 if heads. If she wouldn’t take that bet, why should we believe that her credence for heads at an awakening is 1/3?
What do you think the word “credence” means? I am thinking that perhaps that is the cause of your problems.
I’m treating credence for heads as her confidence in heads, as expressed as a number between 0 and 1 (inclusive), given everything she knows at the time. I see it as the same things as a posterior probability.
I don’t think disagreement is due to different uses of the word credence. It appears to me that we are all talking about the same thing.
I think that the difference between evaluating 2+2 and assigning probabilities (and the reason for the large amount of disagreement) is that 2+2 is a statement in a formal language, whereas what kind of anthropic principle to accept/how to interpret probability is a philosophical one.
Don’t be fooled by the simple Bayes’ theorem calculations—they are not the hard part of this question.
So the difficult question here is which probability space to set up, not how to compute conditional probabilities given that probability space.
(Posted as an antidote to misinterpretation of your comment I committed a moment before.)
A philosophical question, as opposed to a formal one, is a question that hasn’t been properly understood yet. It is a case of ignorance in the mind, not a case of fuzzy territory.
Yes. For example, let’s take a clearer mathematical statement, “3 is prime”. It seems that’s true whatever people say. However, if you come across some mathematicians who are having a discussion that assumes 3 is not prime, then you should think you’re missing some context rather than that they are bad at math.
I chose this example because I once constructed an integer-like system based on half-steps (the successor function adds .5). The system has a notion of primality, and 3 is not prime.
If you want a standard system where 3 is not prime consider Z[omega] where omega^3=1 and omega is not 1. That is, the set of numbers formed by taking all sums, differences, and products of 1 and omega.
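A small check of this claim, as a sketch: elements of Z[omega] are represented here as integer pairs (a, b) meaning a + b*omega, using the fact that omega^2 = -1 - omega (which follows from omega^3 = 1 and omega not equal to 1). The helper names are illustrative.

```python
def mul(x, y):
    """Multiply a + b*w and c + d*w in Z[w], using w**2 = -1 - w."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)

def norm(x):
    """Multiplicative norm a^2 - a*b + b^2; units are exactly the norm-1 elements."""
    a, b = x
    return a * a - a * b + b * b

left = (1, -1)   # 1 - w
right = (2, 1)   # 2 + w
product = mul(left, right)
print(product, norm(left), norm(right))
```

The product is 3, and both factors have norm 3 rather than 1, so neither is a unit and 3 factors nontrivially in this ring.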
What you should say when asked “What is 2+2?” is a separate question from what is 2+2. 2+2 is 4, but you should probably say something else if the situation calls for that. The circumstances that could force you to say something in response to a given question are unrelated to what the answer to that question really is. The truth of the answer to a question is implicit in the question, not in the question-answering situation, unless the question is about the question-answering situation.
I disagree. The correct answer to a question is exactly what you should answer to that question. It’s what “correct” and “should” mean.
“Should” refers to moral value of the outcome, and if someone is holding a gun to a puppy’s head and says “if you say that 2+2=4, the puppy will die!”, you shouldn’t answer “4” to the question, even though it’s correct that the answer is 4. Correctness is a concept separate from shouldness.
If someone asks you, “What do you get if you add 2 and 2”, and you are aware that if you answer “4” he’ll shoot the puppy and if you answer “5” then he’ll let you and the puppy go, then the correct answer is “5”.
You are disputing definitions. You seem to include “should” among the possible meanings of “correct”. When you say, “in this situation, the correct answer is 5”, you refer to the “correctness” of the answer “5”, not to the correctness of 2+2 being 5. Thus, we are talking about an action, not about the truth of 2+2. The action can, for example, be judged according to moral value of its outcome, which is what you seem to mean by “correct” [action].
Thus, in this terminology, “5” is the correct answer, while it’s also correct that the [true] answer is 4. When I say just “the answer is 4”, this is a shorthand for “the true answer is 4”, and doesn’t refer to the actual action, for which it’s true that “the [actual] answer is 5”.
Right, so for some arbitrary formal system, you can derive “4” from “2+2”, and for some other one, you can derive “5” from “2+2”, and in other situations, the correct response to “2+2” is “tacos”.
When you ask “What is 2+2?”, you mean a specific class of formal systems, not an “arbitrary formal system”. The subject matter is fixed by the question, the truth of its answer doesn’t refer to the circumstances of answering it, to situations where you decide what utterance to produce in response.
The truth might be a strategy conditional on the situation in which you answer it, one that could be correctly followed given the specific situation, but that strategy is itself fixed by the question.
For example, I might ask “What should you say when asked the value of 2+2, taking into account the possibility of being threatened with the puppy’s death if you say something other than 5?” The correct answer to that question is a strategy where you say “4” unless the puppy’s life is in danger, in which case you say “5”. Note that the strategy is still fixed by the question, even though your action differs with the situation in which you carry it out; your action correctly brings about the truth of the answer to the question.
Given that Beauty is being asked the question, the probability that heads had come up is 1⁄3. This doesn’t mean the probability of heads itself is 1⁄3. So I think this is a confusion about what the question is asking. Is the question asking what is the probability of heads, or what is the probability of heads given an awakening?
Bayes theorem:
x = # of times awakened after heads
y = # of times awakened after tails
p(heads | awakened) = n(heads and awakened) / n(awakened) = x / (x+y)
Yields 1⁄3 when x=1 and y=2.
Where is the probability of heads? Actually we already assumed in the calculation above that p(heads) = 0.5. For a general biased coin, the calculation is slightly more complex:
p(H) = probability of heads
p(T) = probability of tails
x = # of times awakened after heads
y = # of times awakened after tails
p(heads | awakened) = n(heads and awakened) / n(awakened) = p(H)x / (p(H)x + p(T)y)
Yields 1⁄3 when x=1 and y=2 and p(H)=p(T)=0.5.
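The biased-coin formula can be checked with exact fractions. This is a minimal sketch: the function name `p_heads_given_awakening` is invented for illustration, and it encodes only the per-awakening counting argument of this comment, not a settled answer to the problem.

```python
from fractions import Fraction

def p_heads_given_awakening(p_heads, x, y):
    """Per-awakening weight of heads: p(H)x / (p(H)x + p(T)y).

    x = number of awakenings after heads, y = number after tails.
    """
    p_heads = Fraction(p_heads)
    p_tails = 1 - p_heads
    return p_heads * x / (p_heads * x + p_tails * y)

fair = p_heads_given_awakening(Fraction(1, 2), x=1, y=2)
print(fair)  # 1/3
```

Setting x = y recovers p(H) itself, which shows how the two quantities are related without being the same.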
I’m leaving this comment because I think the equations help explain how the probability-of-heads and the probability-of-heads-given-awakening are inter-related but, obviously—I know you know this already—not the same thing.
To clarify, since the probability-of-heads and the probability-of-heads-given-single-awakening-event are different things, it is indeed a matter of semantics: if Beauty is asked about the probability of heads per event … what is the event? Is the event the flip of the coin (p=1/2) or an awakening (p=1/3)? In the post narrative, this remains unclear.
Which event is meant would become clear if it was a wager (and, generally, if anything whatsoever rested on the question). For example: if she is paid per coin flip for being correct (event=coin flip) then she should bet heads to be correct 1 out of 2 times; if she is paid per awakening for being correct (event=awakening) then she should bet tails to be correct 2 out of 3 times.
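The two ways of counting can be compared in a small simulation. This is a rough sketch under the assumptions of the problem statement (fair coin, one awakening after heads, two after tails); the function name `simulate` and the seed are arbitrary choices for illustration.

```python
import random

def simulate(trials, seed=0):
    """Return (share of coin flips that are heads,
               share of awakenings that follow heads)."""
    rng = random.Random(seed)
    heads_flips = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        awakenings = 1 if heads else 2  # Monday only vs. Monday and Tuesday
        total_awakenings += awakenings
        if heads:
            heads_flips += 1
            heads_awakenings += 1
    return heads_flips / trials, heads_awakenings / total_awakenings

per_flip, per_awakening = simulate(100_000)
print(f"heads per coin flip: {per_flip:.3f}")
print(f"heads per awakening: {per_awakening:.3f}")
```

The first ratio should come out near 1/2 and the second near 1/3, matching the "paid per coin flip" and "paid per awakening" cases respectively.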
Actually .. arguing with myself now .. Beauty wasn’t asked about a probability, she was asked if she thought heads had been flipped, in the past. So this is clear after all—did she think heads was flipped, or not?
Viewing it this way, I see the isomorphism with the class of anthropic arguments that ask if you can deduce something about the longevity of humans given that you are an early human. (Being a human in a certain century is like awakening on a certain day.) I suppose then my solution should be the same. Waking up is not evidence either way that heads or tails was flipped. Since her subjective experience is the same however the coin is flipped (she wakes up) she cannot update upon awakening that it is more likely that tails was flipped. Not even if flipping tails means she wakes up 10 billion times more than if heads was flipped.
However, I will think longer if there are any significant differences between the two problems. Thoughts?
Why was this comment down-voted so low? (I rarely ask, but this time I can’t guess.) Is it too basic math? If people are going to argue whether 1⁄3 or 1⁄2, I think it is useful to know they’re debating two different probabilities: the probability of heads or the probability of heads given an awakening.
This is incorrect.
Given that Beauty is being asked the question, the probability that heads had come up is 1⁄2.
This is Bayes’ theorem:
p(H) = 1/2
p(awakened|H) = p(awakened|T) = 1
p(H|awakened) = p(awakened|H)p(H) / (p(awakened|H)p(H) + p(awakened|T)p(T))
which equals 1⁄2
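As a quick check of that arithmetic, here is a minimal sketch of the same Bayes update with exact fractions. The function name `posterior_heads` is hypothetical, and the likelihoods of 1 reflect this comment's reading of "awakened" as "awakened at all".

```python
from fractions import Fraction

def posterior_heads(prior_heads, like_heads, like_tails):
    """Bayes' theorem for p(H | awakened) with a two-outcome coin."""
    prior_tails = 1 - prior_heads
    numerator = like_heads * prior_heads
    return numerator / (numerator + like_tails * prior_tails)

# p(awakened | H) = p(awakened | T) = 1, as assumed in the comment above.
result = posterior_heads(Fraction(1, 2), like_heads=1, like_tails=1)
print(result)  # 1/2
```

Because both likelihoods are equal, the posterior equals the prior; any other answer requires a different choice of conditioning event.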
By “awakened” here you mean “awakened at all”. I think you’ve shown already that the probability that heads was flipped given that she was awakened at all is 1⁄2, since in both cases she’s awakened at all and the probability of heads is 1⁄2. I think your dispute is with people who don’t think “I was awakened at all” is all that Beauty knows when she wakes up.
Beauty also knows how many times she is likely to have been woken up when the coin lands heads—and the same for tails. She knew that from the start of the experiment.
OK, I see now why you are emphasizing being awoken at all. That is the relevant event, because that is exactly what she experiences and all that she has to base her decision upon.
(But keep in mind that people are just busy answering different questions, they’re not necessarily incorrect for answering a different question.)
Add a payoff and the answer becomes clear, and it also becomes clear that the answer depends entirely on how the payoff works.
Without a payoff, this is a semantics problem revolving around the ill-defined concept of expectation and will continue to circle it endlessly.
The problem posed is, p(heads | Sleeping Beauty is awake). There is no payoff involved. Introducing a payoff only confuses matters. For instance, Roko wrote:
This is true; but that would be the answer to “What is the probability that the coin was heads, given that Sleeping Beauty was woken up at least once after being put to sleep?” That isn’t the problem posed. If that were the problem posed, we could eliminate her forgetfulness from the problem statement.
If you agree that the forgetfulness is necessary to the story, then 1⁄2 is the wrong answer, and 1⁄3 is the right answer. If you don’t agree it’s necessary, then its presence suggests that the speaker intended a different semantics than you’re using to interpret it.
ADDED: This is depressing. Here we have a collection of people who have studied probability problems and anthropic reasoning and all the relevant issues for years. And we have a question that is, on the scale of questions in the project of preparing for AGI, a small, simple one. It isn’t a tricky semantic or philosophical issue; it actually has an answer. And the LW community is doing worse than random at it.
In fact, this isn’t the first time. My brief survey of recent posts indicates that the LessWrong community’s track record when tackling controversial problems that actually have an answer is random at best.
I define subjective probability in terms of what wagers I would be willing to make. I think a good rule of thumb is that if you can’t figure out how to turn the problem into a wager you don’t know what you’re asking. And, in fact, when we introduce payoffs to this problem it becomes extremely clear why we get two answers. The debate then becomes a definition debate over what wager we mean by the sentence “what credence should the patient assign...”
As I just explained, the fact that the original author of the story wrote amnesia into it tells you which definition the author of the story was using.
And that’s a good argument you’ve got there, but I don’t think that is totally obvious on the first read of the problem. It’s a weird feature of a probability problem for the relevant wager to be offered once under some circumstances and twice under others. So people get confused. It is a little tricky. But, far from confusing things, that entire issue can be avoided if we specify exactly how the payoff works when we state the problem! So I don’t know why you’re freaking out about Less Wrong’s ability to answer these problems when it seems pretty clear that people interpret the question differently, not that they can’t think through the issues.
(Not my downvote, btw)
Re: “Introducing a payoff only confuses matters.”
Personally, I think it clarifies things—though at the expense of introducing complication. People disagree over which bet the problem represents. Describing those bets highlights this area of difference.
I see what you mean. But some comments have said, “I can set up a payoff scheme that gives this answer; therefore, this is an equally-valid answer.” The correct response is to state the payoff scheme that gives your answer, and then admit your answer is not addressing the problem if you can’t find justification for that payoff scheme in the problem statement.
Indeed—that would be bad—and confusing.
It is both bad and confusing that people are defending the idea that this problem is not clearly-stated enough to answer.
I suspect this happens because people don’t like criticising the views of others. They would rather just say ‘you are both right’ - since then no egos get bruised, and a costly fight is avoided. So, nonsense goes uncriticised, and the innocent come to believe it—because nobody has the guts to knock it down.
No, it has an unasking.
IMO, there’s no problem with the form of this question. It is not ambiguous. The only way to make it so is with some pretty torturous misinterpretations.
I am confused: it is certain that beauty will be woken at least once. Why are you conditioning on it?
If you don’t need to condition on it, why is it in the story?
The question asked in the story is “Sleeping Beauty, what is p(heads | you are awake now)?”
Someone is going to complain that you can’t ask about p(heads) when it’s already either true or false. Well, you can. That’s how we use probabilities. If you are a determinist, you believe that everything is already either true or false; yet determinists still use probabilities.
“On Sunday she is given a drug” is also in the story. Does it follow that it is imperative to explicitly condition on that as well?
“ADDED: This is depressing. Here we have a collection of people who have studied probability problems and anthropic reasoning and all the relevant issues for years. And we have a question that is, on the scale of questions in the project of preparing for AGI, a small, simple one. It isn’t a tricky semantic or philosophical issue; it actually has an answer. And the LW community is doing worse than random at it.”
That’s why I posted this to begin with. It is interesting that we can’t come to an agreement on the solution to this problem, even though it involves very straightforward probability. Heck, I got heavily down voted after making statements that were correct. People are getting thrown off by doing the wrong kind of frequency counting.
--
However, I should note that the event ‘sleeping beauty is awake’ is equivalent to ‘sleeping beauty has been woken up at least once’ because of the amnesia. The forgetfulness aspect of the problem is why the solution is 1⁄2.
I’d like to see a model of how a group of people is supposed to improve their initial distribution of beliefs in a problem with a true/false answer.
Distressingly few people have publicly changed their mind on this thread. Various people show great persistence in believing the wrong answer—even when the problem has been explained. Perhaps overconfidence is involved.
I changed my mind from “1/3 is the right answer” to “The answer is obviously 1⁄2 or 1⁄3 once you’ve gotten clear on what question is being asked”. I’m not sure if I did so publicly. It seems to me that other folks have changed their minds similarly. I think I see an isomorphism to POAT here, as well as any classic Internet debate amongst intelligent people.
I’m not sure whether this is legitimate or a joke, but if the question is unclear about whether 1⁄2 or 1⁄3 is better, maybe 5⁄12 is a good answer.
I’m also not sure if you’re serious, but if you assign a 50% probability to the relevant question being the one with the correct answer of ‘1/2’ and a 50% probability to the relevant question being the one with the correct answer of ‘1/3’ then ‘5/12’ should maximize your payoff over multiple such cases if you’re well-calibrated.
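The 5/12 figure is just the 50/50 mixture of the two candidate answers. A minimal sketch of the arithmetic, with variable names chosen for illustration; it checks the number only and makes no claim about which scoring rule would make this the payoff-maximizing answer.

```python
from fractions import Fraction

weight_half_question = Fraction(1, 2)   # chance the intended question has answer 1/2
weight_third_question = Fraction(1, 2)  # chance the intended question has answer 1/3

mixture = (weight_half_question * Fraction(1, 2)
           + weight_third_question * Fraction(1, 3))
print(mixture)  # 5/12
```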
Phil and I seem to think the problem is sufficiently clearly specified to give an answer to. If you think 1⁄2 is a defensible answer, how would you reply to Robin Hanson’s comment?
FWIW, on POAT I am inclined towards “Whoever asked this question is an idiot”.
Actually I think it would make more sense to reply to my own comment in response to this. link
I am not sure that is going anywhere.
Personally, I think I pretty-much nailed what was wrong with the claim that the problem was ambiguous here.
I think that we’ve established the following:
there are some problems similar to this one for which the answer is 1⁄2
there are some problems similar to this one for which the answer is 1⁄3
people seem to be disagreeing which sort of problem this is
all debate has devolved to debate over the meanings of words (in the problem statement and elsewhere)
Given this, I think it’s obvious that the problem is ambiguous, and arguing whether the problem is ambiguous is counterproductive as compared to just sorting out which sort of problem you’re responding to and what the right answer is.
IMHO, different people giving different answers to problems does not mean it is ambiguous. Nor does people disagreeing over the meanings of words. Words do have commonly-accepted meanings—that is how people communicate.
I’m coming around to the 1⁄2 point of view, from an initial intuition that 1⁄3 made most sense, but that it mostly depended on what you took “credence” to mean.
My main new insight is that the description of the set-up deliberately introduces confusion, it makes it seem as if there are two very different situations of “background knowledge”, X being “a coin flip” and X’ being “a coin flip plus drugs and amnesia”. So that P(heads|X) may not equal P(heads|X’).
This comment makes the strongest case I’ve seen that the difference is one that makes no difference. Yes, the setup description strongly steers us in the direction of taking “credence” to refer to the number of times my guess about the event is right. If Beauty got a candy bar each time she guessed right she’d want to guess tails. But on reflection what seems to matter in terms of being well-calibrated on the original question is how many distinct events I’m right about.
Take away the drug and amnesia, and suppose instead that Beauty is just absent-minded. On Tuesday when you ask her, she says: “Oh crap, you asked me that yesterday, and I said 1⁄2. But I totally forget if you were going to ask me twice on tails or on heads. You’d think with all they wrote about this setup I’d remember it. I’ve no idea really, I’ll have to go with 1⁄2 again. Should be 1 for one or the other, but what can I say, I just forget.”
I’m less than impressed with the signal-to-noise ratio in the recent discussion, in particular the back-and-forth between neq1 and timtyler. As a general observation backed by experience in other fora, the more people are responding in real time to a controversial topic, the less likely they are to be contributing useful insights.
I’m not ruling out changing my mind again. :)
I’ve been thinking 1⁄2 as well (though I’m also definitely in the “problem is underdefined” camp).
Here is how I would describe the appropriate payoff scheme. Prior to the experiment (but after learning the details) Beauty makes a wager with the Prince. If the coin comes up heads the Prince will pay Beauty $10. If it comes up tails Beauty will pay $10. Even odds. This wager represents Beauty’s prior belief that the coin is fair and heads/tails have equal probability: her credence that heads will or did come up. At any point before Beauty learns what day of the week it is she is free to alter the bet such that she takes tails, but she must pay $10 more to do so (making the odds 2:1).
Beauty should at no point (before learning what day of the week it is) alter the wager. Which means when she is asked what her credence is that the coin came up heads she should continue to say 1⁄2.
This seems at least as good a payoff interpretation as a new bet every time Beauty is asked about her credence.
You don’t measure an agent’s subjective probability like that, though—not least because in many cases it would be bad experimental methodology. Bets made which are intended to represent the subject’s probability at a particular moment should pay out—and not be totally ignored. Otherwise there may not be any motivation for the subject making the bet to give an answer that represents what they really think. If the subject knows that they won’t get paid on a particular bet, that can easily defeat the purpose of offering them a bet in the first place.
This doesn’t make any sense to me. Or at least the sense it does make doesn’t sound like sufficient reason to reject the interpretation.
If Beauty forgets what is going on—or can’t add up—her subjective probability could potentially be all over the shop.
However, the problem description states explicitly that: “During the experiment, she has no access to anything that would give a clue as to the day of the week. However, she knows all the details of the experiment.”
This seems to me to weigh pretty heavily against the hypothesis that she may have forgotten the details of the experiment.
In the case where she remembers what’s going on, when you ask her on Tuesday what her credence is in Heads, she says “Well, since you asked me yesterday, the coin must have come up Tails; therefore I’m updating my credence in Heads to 0.”
The setup makes her absent-minded (in a different way than I suggest above). It erases information she would normally have. If you told her “It’s Monday”, she’d say 1⁄2. If you told her “It’s Tuesday”, she’d say 0. The amnesia prevents Beauty from conditioning on what day it is when she’s asked.
Prior to the experiment, Beauty has credence 1⁄2 in either Heads or Tails. To argue that she updates that credence to 1⁄3, she must be taking into account some new information, but we’ve established that it can’t be the day, as that gets erased. So what is it?
Jonathan_Lee’s post suggests that Beauty is “conditioning on observers”. I don’t really understand what that means. The first analogy he makes is to an identical-copy experiment, but we’ve been over that already, and I’ve come to the conclusion that the answer in that case is “it depends”.
Re: “Prior to the experiment, Beauty has credence 1⁄2 in either Heads or Tails.”
IMO, we’ve been over that adequately here. Your comment there seemed to indicate that you understood exactly when Beauty updates.
Yes. I noted then that the description of the setup could make a difference, in that it represents different background knowledge.
It does not follow that it does make a difference.
When I say “prior to the experiment”, I mean chronologically, i.e. if you ask Beauty on Sunday, what her credence is then in the proposition “the coin will come up heads”, she will answer 1⁄2.
Once Beauty wakes up and is asked the question, she conditions on the fact that the experiment is now ongoing. But what information does that bring, exactly?
When Beauty knows she will be the subject of the experiment (and its design), she will know she is more likely to be observing tails. Since the experiment involves administering Beauty drugs, it seems fairly likely that she knew she would be the subject of the experiment before it started—and so she is likely to have updated her expectations of observing heads back then.
The question is
Your claim is that Beauty answers “1/3” before the experiment even begins?
(?!?!!)
If she is asked: “if you wake up with amnesia in this experiment, what odds of the coin being heads will you give”, then yes. She doesn’t learn anything to make her change her mind about the odds she will give after the experiment has started.
That isn’t a symmetrical question. We’re not asking for her belief about what odds she will give. We’re asking what her odds are for a particular event (namely a coin flip at time t1 being heads).
The question “What is your credence now for the proposition that our coin landed heads?” doesn’t appear to make very much sense before the coin is flipped. Remember that we are told in the description that the coin is only flipped once—and that it happens after Beauty is given a drug that sends her to sleep.
Beauty should probably clarify with the experimenters which previous coin is being discussed, and then, based on what she is told about the circumstances surrounding that coin flip, she should use her priors to answer.
The English language doesn’t have a timeless tense. So we can’t actually phrase the question without putting the speaker into some time relative to the event we’re speaking of. But that doesn’t mean we can’t recognize that the question being asked is a timeless one. We have a coordinate system that lets us refer to objects and events throughout space and time… it doesn’t matter when the agent is: the probability of the event occurring can be estimated before, after and during just as easily (easy mathematically, not practically). That is why I used the phrasing “the coin flip at time t1 being heads”. The coin flip at t1 can be heads or tails. Since we know it is a fair coin toss we start with P=1/2 for heads. If you want the final answer to be something other than 1⁄2 you need to show when and how Beauty gets additional information about the coin toss.
The question asked in the actual problem has the word “now” in it. You said I didn’t answer a “symmetrical” question—but it seems as though the question you wanted me to answer is not very “symmetrical” either.
If Beauty is asked before the experiment the probability she expects the coin to show heads at the end of the experiment, she will answer 1⁄2. However, in the actual problem she is not asked that.
We’re supposed to be Bayesians. It doesn’t matter whether the question asks “now”, “in 500 B.C.E.”, or “at the heat death of the universe”: unless our information has changed, the time the prediction is made is irrelevant.
(ETA: Okay, I guess at the heat death of the universe the information would have changed. But you get my point :-)
If you are locked in a lead-lined box, the answer to question “is it night time outside now” varies over time—even though you learn nothing new.
Similarly with Beauty, as she moves through the experimental procedure.
But here you’ve put the time-indexical “now” into your description of the event. You’re asking for P(it is night, now). In the Beauty case the question asked is what is P(heads), now. In the first case every moment that goes by we’re talking about a temporally distinct event. You’re actually asking about a different event every moment, so it isn’t surprising that the answer changes from moment to moment. The Sleeping Beauty problem is always about the same event.
The coin flip doesn’t change—but Beauty does. She goes in one end of the experiment and comes out the other side, and she knows roughly where she is on that timeline. Probabilities are subjective—and in this example we are asked for Beauty’s “credence”—i.e. her subjective probability at a particular point in time. That’s a function of the observer, not just the observed.
Yes. But subjective probability is a function of the information someone has not where they are on the time-line. Which is why people keep asking what information Beauty is updating on. We’re covering 101 stuff at this point.
...and going round in circles, I might note. We did already discuss the issue of exactly when Beauty updates close by—here.
Also, we already know where we differ. We consider “subjective probability” to refer to different things. Given your notion of “subjective probability”, your position makes perfect sense, IMO. I just don’t think that is how scientists generally use the term.
Well you tried to answer the question. I suggested your answer was ridiculous and explained why and I have been rebutting your responses since then. So no, we’re not going in circles. I’m objecting to your answer to the updating question and rebutting your responses to my objection.
Here is what happened in this thread.
You suggested that Beauty would have estimated heads at 1⁄3 prior to the experiment.
I said ‘Wha?!?’
You tried to make Beauty’s pre-experiment estimation about what she was going to say when she woke up.
I pointed out that that question was about a different event (the saying) than the question “What is your credence now for the proposition that our coin landed heads?” is about (the coin flip)
You claimed that it didn’t make sense to ask that question (about the coin having landed heads) before the coin flip happens.
I showed how even though English requires us to use tense we can make the question time symmetrical by inventing temporal coordinates (t1) and speaking of subjective probability of heads at t1 at any time Beauty exists.
You claimed that the probability of heads at the end of the experiment was somehow different from the probability of heads at some other time (presumably when she is asked).
I pointed out that time is irrelevant and what matters is her information- an elementary point which I shouldn’t have to make to someone who was last night trashing the OP for supposedly not knowing anything about probability (and I’m a philosopher not a math guy!).
In conclusion: My claim is that for Beauty to answer 1⁄3 for the probability of the time invariant event “coin toss by experimenter at time t1 being heads” she needs to get new information since the prior for that event is obviously 1⁄2. No one has ever pointed to what new information she gets. You tried to claim that Beauty updates as soon as she gets the details of the experiment: but that can’t be right. The details of the experiment can’t alter the outcome of a fair coin toss. So where is the updating?!
It’s hard to tell but I’m not sure your notion of “subjective probability” is coherent- specifically because you keep talking about different events depending on what time you’re in. That sounds like a recipe for disaster. But alright.
Does this mean we can just agree to specify payouts in our probability problems from now on? Or must we now investigate which one of us is using the term the way scientists do? Unfortunately this disagreement suggests to me that scientists may not know exactly what they mean by subjective probability.
Subjective probability is a basic concept in decision theory. Scientists have certainly tried hard to say exactly what they mean by the term. E.g. see this one, from 1963:
“A Definition of Subjective Probability”—F. J. Anscombe; R. J. Aumann
http://www.econ.ucsb.edu/~tedb/Courses/GraduateTheoryUCSB/anscombeaumann.pdf
Sure. I don’t see anything in there to suggest that subjective probability isn’t time symmetrical (by which I mean that a subjective probability regarding an event can be held at any time and there is no reason for the probability to change unless the person’s evidence changes). Can you do a better job formalizing what your alternative is?
Except she doesn’t. She’ll give the same answer on Monday as she will on Tuesday, because she doesn’t learn anything by waking up.
Yes, this is very alarming, considering this is a forum for aspiring rationalists.
I disagree; but I’ve already given my reasons.
Which of your down-voted statements were correct?
Well, I got −6 for this statement: “P(monday and heads)=1/2. P(monday and tails)=1/4. P(tuesday and tails)=1/4. Remember, these have to add to 1.”
Initially there is a 50% chance for heads and 50% chance for tails. Given heads, it’s monday with certainty. So, P(heads)=1/2, p(monday | heads)=1.
Do you dispute either of those?
Similarly, p(tails)=1/2, p(monday | tails)=1/2. p(tuesday | tails)=1/2.
Do you dispute either of those?
The above are all of the probabilities you need to know. From them, you can derive anything that is of interest here.
For example, on an awakening p(monday)=p(monday|tails)p(tails) + p(monday|heads) p(heads)=1/4+1/2=3/4
p(monday and heads)=p(heads)*p(monday|heads)=1/2
etc.
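The derivation above can be reproduced mechanically. This is a minimal sketch that takes the stated inputs (p(heads) = 1/2, p(monday | heads) = 1, p(monday | tails) = p(tuesday | tails) = 1/2) as given; those inputs are this comment's assumptions and are exactly what the replies dispute. The dictionary layout is an illustrative choice.

```python
from fractions import Fraction

p_coin = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
p_day_given_coin = {
    ("monday", "heads"): Fraction(1),
    ("monday", "tails"): Fraction(1, 2),
    ("tuesday", "tails"): Fraction(1, 2),
}

# Joint probability of each (day, coin) cell: p(coin) * p(day | coin).
joint = {
    (day, coin): p_coin[coin] * p
    for (day, coin), p in p_day_given_coin.items()
}
p_monday = sum(p for (day, _), p in joint.items() if day == "monday")

for cell, p in joint.items():
    print(cell, p)
print("total:", sum(joint.values()))  # total: 1
print("p(monday):", p_monday)         # p(monday): 3/4
```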
Re: “P(monday and heads)=1/2. P(monday and tails)=1/4. P(tuesday and tails)=1/4. Remember, these have to add to 1.”
Yes, but those Ps are wrong—they should all be 1⁄3.
My assumptions and use of probability laws are clearly stated above. Tell me where I made a mistake, otherwise just saying “you’re wrong” is not going to move things forward.
Well, the correct sum is this one:
“Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday. In other words, only in a third of the cases would heads precede her awakening. So the right answer for her to give is 1⁄3. This is the correct answer from Beauty’s perspective.”
That gives:
P(monday and heads)=500/1500. P(monday and tails)=500/1500. P(tuesday and tails)=500/1500.
You appear to have gone wrong by giving a different answer—based on a misinterpretation of the meaning of the interview question, it appears.
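For comparison, the frequency count in the quoted passage can be written out as follows. This is a sketch using the expected counts from the quote (500 heads, 500 tails in 1,000 repetitions); it computes shares of awakenings, and whether those shares are the right probabilities is the point under dispute in this thread.

```python
from fractions import Fraction

repetitions = 1000
heads_runs = repetitions // 2          # expected number of heads
tails_runs = repetitions - heads_runs  # expected number of tails

# One awakening per heads run; two (Monday and Tuesday) per tails run.
awakenings = {
    ("monday", "heads"): heads_runs,
    ("monday", "tails"): tails_runs,
    ("tuesday", "tails"): tails_runs,
}
total = sum(awakenings.values())
shares = {cell: Fraction(n, total) for cell, n in awakenings.items()}

print(total)  # 1500
for cell, share in shares.items():
    print(cell, share)  # each share is 1/3
```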
So you are not willing to tell me where I made a mistake?
P(heads)=1/2, p(monday | heads)=1. Which one of these is wrong?
You’re using expected frequencies to estimate a probability, apparently. But you’re counting the wrong thing. What you are calling P(monday and heads) is not that. There is a problem with your denominator. Think about it. Your numerator has a maximum value of 1000 (if the experiment was repeated 1000 times). Your denominator has a maximum value of 2000. If the maximum possible values of the numerator and denominator do not match, there is a problem. You have an outcome-dependent denominator. Try taking expectation of that. You won’t get what you think you’ll get.
Re: “If the maximum possible values of the numerator and denominator do not match, there is a problem.”
The total possible number of awakenings is 2000.
That represents all tails—e.g.:
P(monday and heads) = 0/2000; P(monday and tails) = 1000/2000; P(tuesday and tails) = 1000/2000;
These values add up to 1 - i.e. the total numerators add up to the common denominator. That is the actual constraint. The maximum possible value of the numerator in each individual fraction is permitted to be smaller than the common denominator; that is not indicative of a problem.
Oh, it is a huge problem. It proves that your ratio isn’t of the form # of events divided by # of trials. Your ratio is something else. The burden is on you to prove that it actually converges to a probability as the number of trials goes to infinity.
Using cell counts and taking a ratio leads to a probability as the number of trials goes to infinity if you have independent draws. You don’t. You have a strange dependence in there that messes things up. Standard theory doesn’t hold. Your thing there is estimating something, you just don’t know what it is.
The total number of events (statements by Beauty) adds up to the total number of trials (interviews).
You should not expect the number of statements by Beauty on Monday to add up to the total number of interviews altogether. It adds up to the number of interviews on Monday. This is not very complicated.
Do you have to make a condescending remark every time you respond? You told me things that I already know, and then said “This is not very complicated.” Great, but nothing accomplished.
You are using an estimator that is valid when you have counts from independent trials. Coin flips are independent here, but interviews are not. You need to take that into account.
It is the plain truth. I don’t know why you are asking such silly questions in public. Maybe you have a weak background in this sort of maths. Or maybe you just don’t like admitting that you posted a whole bunch of inaccurate nonsense—and so keep digging yourself deeper in.
You show no sign of being able to understand your problems—so it seems to me as though there is little point in continuing to point them out. You can’t say I didn’t try to help you sort yourself out.
Well, I have a phd in biostatistics and teach Bayesian data analysis at the University of Pennsylvania, so I either have background in such matters or Penn isn’t real careful on who they hire.
The fact that I am very careful about these kinds of problems is what led me to discover the flaw in the 1⁄3 argument. It wasn’t obvious to me at first.
If true, a good job you haven’t supplied your real name, then—or your friends and colleagues might come across this thread.
Do you find that people generally think more clearly after they’ve been insulted?
Do you find that you think more clearly after you’ve been insulted?
Hi, Nancy! I haven’t researched this issue. I imagine the results would depend on the details of the situation, the relative status of the participants, etc. I recommend you consult a social psychologist—if you are sincerely looking for answers.
NancyLebovitz was being oblique. I believe her point was that your remarks were not useful for the purpose of improving neq1’s state of knowledge.
I would add that they were also not useful for the purpose of entertaining the lurkers, if the downvotes are anything to go by.
Uh, improving neq1’s state of knowledge was not the intended purpose of that post.
I have already written literally dozens of posts attempting to improve the state of knowledge of other participants on this thread. That post was publicly explaining why I am now likely to stop, just so there is no subsequent confusion about the issue.
I think
would have gone over better.
Right—but I call a spade a spade, don’t beat about the bush, say what I think—etc.
Insulating others from what I think in order to protect their egos is not my style. If I did that people would always be wondering if I meant what I said—or whether I was shielding them from my true opinions in order to protect their egos. In the long run, it is best to just speak the truth, as I see it, IMO. At least then, others know where I stand.
There are a lot of approaches one can take when interacting with other people. Your approach leads me to not want to make your acquaintance. The same isn’t true for most of the other people here, even the ones who disagree with me.
In that case, I would recommend you shut up as soon as possible. If, as you said,
then stop wasting your time.
Thanks for your proposal about how to optimise my time management.
I think you are best leaving that issue to me, though—I have more relevant information about the topic than you do.
In the spirit of experimentation, I’m going to try giving up being oblique.
My primary motivation is not to do you a favor. My purpose is also not to protect neq1.
It is to convey that the purpose of this website is to work on thinking clearly, to some extent to further the creation of FAI, and also to improve skills at living. There’s also a little pleasant socializing.
However, insulting people (or having an atmosphere where insults are accepted) does not further any of the purposes of the site.
I believe that people generally think less well when they’ve been insulted. Also, I’ve been online for a long time. Insults are pretty much similar to each other; in other words, they’re noise, not signal, so far as anything about the world generally is concerned. They’re signal about emotional state and/or attempted dominance, but (as should be clear from the conversation so far), not a terribly clear signal.
What’s worse, insults are likely to lead to more insults.
I’m not a moderator, but I’m asking you not to dump hostility here.
Nancy, your beliefs about the average effect of insults on people do not seem to me to be a good reason to avoid bluntly telling people when they are behaving badly. IMO, you are not properly considering the positive effects of pointing out such bad behaviour. If someone behaves badly, and you don’t tell them, they don’t learn. Others might think their behaviour is acceptable. Still others might think you approve of their behaviour—and so on. It is not as though I had not tried all manner of rational argument first. Yes, people might be insulted or offended by someone else pointing out what is going on—if it reflects badly on them, but that is—ultimately—their business.
I’ve told you rather bluntly that I don’t approve of your behavior, though I think I’ve managed to avoid insulting your intelligence or character.
Do you think the world is a better place as a result?
Not especially—and certainly not from my point of view. Alas, I found responding to your comments to be a waste of my time and energy. Especially so with your “oblique” comments. So, overall, I would rather you had not bothered commenting in the first place.
I apologize.
But there are probably an infinite number of propositions that you actually believe, and even an infinite number of relevant propositions that you actually believe. You choose which things that you think to actually say (I’m just assuming that everything you think ‘out loud’ doesn’t get posted to Less Wrong, since I assume you have more thoughts than I’ve observed comments from you). As long as you’re leaving out an infinite amount of information, you might as well also leave out insulting language.
I said he was asking “silly questions”. However, that is true—and was not “insulting language”. If you think I was using “insulting language”, you will have to be more specific about what you mean.
As to the possibility of you making a more general point, IMO, systematically not speaking truths that might cause offense would have bad results—especially for truth-seekers:
“Unfortunately some people take offense more easily than others. Also, some people are offended by true statements.”
http://timtyler.org/political_correctness/
It is the same with pressuring other people to not speak truths that might cause offense. That too, would have—and has had—seriously unpleasant long-term effects.
You didn’t mention that you were likely to stop posting in the thread.
That was implied information: “it seems to me as though there is little point in continuing to point them out. You can’t say I didn’t try to help you sort yourself out.”
Your continued involvement in the conversation was stronger information.
“Likely to stop” is a probabilistic statement. I am still likely to stop posting on this thread soon. I have done my bit to promote the correct answer to this problem. A top level post explains the correct answer in some detail. I feel as though my work here is done.
Were someone else exhibiting similar posting behavior, would you draw the same conclusion? You may sincerely desire to terminate your conversation with neq1, but you appear to cesire* to continue it.
* A “cesire” is a motivator to action that works like a desire even when accompanied by a conflicting desire—much like an alief can induce emotional reactions in the same way as beliefs even in the presence of a contrary belief.
Your question seems vague: Similar posting behavior to what posting behaviour? - and by whom? Would I draw the same conclusion—as which conclusion?
The conclusion that you feel your work is done. Such a state removes the desire to continue responding to neq1, and—as such a desire is the only apparent reason to respond to neq1 - leads to a cessation of posts in the associated thread(s). This has not occurred.
I haven’t argued about the topic of this post for a little while now—and certainly not since writing “I feel as though my work here is done”.
Rather I am here defending my reputation against assaults from people who don’t like my posting style—and seem keen to let everyone else know of their disapproval. I’ll probably give up with that too, soon enough.
Actually, I have supplied my real name (in a previous post I linked to my blog, which has my name). I’m confident my colleagues would be in agreement with me.
Or they all should be 1⁄2.
Impossible—if they are to add up to 1.
For Jack’s bookie, I agree, you have to use 1⁄3 – but if you want to calculate a distribution on how much cash Beauty has after the experiment given different betting behavior, it no longer works to treat Monday and Tuesday as mutually exclusive.
This is one of those cases where we need to disentangle the dispute over definitions (1), forget about the notion of subjective anticipation (2), list the well-defined questions and ask which we mean.
If by the probability we mean the fraction of waking moments, the answer is 1⁄3.
If by the probability we mean the fraction of branches, the answer is 1⁄2.
http://lesswrong.com/lw/np/disputing_definitions/
http://lesswrong.com/lw/208/the_iless_eye/
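The two well-defined questions listed in this comment can be written out as two different counts over the same experiment. This is a minimal sketch; the `branches` structure and variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Each branch of the experiment: (coin result, awakenings in that branch).
branches = [
    ("heads", ["monday"]),
    ("tails", ["monday", "tuesday"]),
]

# Fraction of branches in which the coin landed heads
# (the two branches are equally likely for a fair coin).
heads_branch_fraction = Fraction(
    sum(1 for coin, _ in branches if coin == "heads"), len(branches)
)

# Fraction of waking moments preceded by heads, counting every awakening once.
total_wakings = sum(len(days) for _, days in branches)
heads_wakings = sum(len(days) for coin, days in branches if coin == "heads")
heads_waking_fraction = Fraction(heads_wakings, total_wakings)

print(heads_branch_fraction)  # 1/2
print(heads_waking_fraction)  # 1/3
```

Both numbers are correct answers to their own question; the code only makes explicit which numerator and denominator each one uses.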
It’s hard to make a sensible notion of probability out of “fraction of waking moments”. Two subsequent states of a given dynamical system make for poor distinct elements of a sample space: when we’ve observed that the first moment of a given dynamical trajectory is not the second, what are we going to do when we encounter the second one? It’s already ruled “impossible”! Thus, Monday and Tuesday under the same circumstances shouldn’t be modeled as two different elements of a sample space.
As Wei Dai and Roko have observed, that depends on why you’re asking in the first place. Probability estimates should pay rent in correct decisions. If you’re making a bet that will pay off once at the end of the experiment, you should count the fraction of branches. If you’re making a bet that will pay off once per wake-up call, you should count the fraction of wake-up calls.
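The dependence on the payoff scheme can be made concrete with a small expected-value calculation. The sketch below assumes a hypothetical ticket that costs `price` and pays 1 if the coin landed heads, bought either once per experiment or once per awakening; the function `expected_profit` is illustrative only.

```python
from fractions import Fraction

def expected_profit(price, per_awakening):
    """Expected profit per experiment from buying 'pays 1 if heads' tickets."""
    half = Fraction(1, 2)
    tickets_if_heads = 1
    # Under tails Beauty is awake twice, so a per-awakening bet is placed twice.
    tickets_if_tails = 2 if per_awakening else 1
    profit_if_heads = tickets_if_heads * (1 - price)
    profit_if_tails = tickets_if_tails * (-price)
    return half * profit_if_heads + half * profit_if_tails

# Break-even prices under the two settlement schemes.
print(expected_profit(Fraction(1, 2), per_awakening=False))  # 0
print(expected_profit(Fraction(1, 3), per_awakening=True))   # 0
```

A price of 1/2 is fair when the bet settles once per experiment, and a price of 1/3 is fair when it settles at every awakening, which matches the branch-versus-wake-up distinction drawn above.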
That’s the wrong way to look at it. A certain bet may be the “correct” action to perform, or even a certain ritual of cognition may pay its rent, but it won’t be about the concept of probability. Circumstances may make it preferable to do or say anything, but that won’t influence the meaning of fixed concepts. You can’t argue that 2+2 is in fact 5 on the grounds that saying that saves puppies. You may say that 2+2 is 5, or think that “probability of Tuesday” is 1⁄3 or 1⁄4 in order to win, but that won’t make it so, it will merely make you win.
Subjective probability is not a well-defined concept in the general case. Fractions are well-defined, but only after you’ve decided where you are getting the numerator and denominator from.
That fractions are well-defined doesn’t make them probabilities.
Let us not sacrifice effectiveness of our concepts in order to make them mathematically elegant. If reality gives you problems where you win by reasoning anthropically, but ordinary probability theory is not up to the job of facilitating, then invent UDT and use that instead.
The winning thing might be better than the probability thing, but it won’t be a probability thing just because it’s winning. Also, UDT weakly relies on the same framework of expected utility and probability spaces, defined exactly as I discuss them in the comments to this post.
Not all of the waking moments have the same probability of occurring. If you estimate the probability of heads by the proportion of waking moments that were preceded by heads, you’d be throwing out information. Again, on a random waking moment, Monday preceded by heads is more likely than Monday preceded by tails.
On a random waking moment, Monday preceded by heads is equally likely as Monday preceded by tails.
I think you’re thinking of a similar problem that we discussed last year, which involves a forgetful driver who is driving past 1 to n intersections, and needs to turn left at at least one of them. That problem is different, because it’s asking about the probability of turning left at least once over the course of his drive.
The absent-minded driver is essentially the same problem, but it’s easier to analyze because explicit payoff specification prompts you to estimate expected value of possible strategies. In estimating those strategies, we use the same probability model that would say “1/2” in the Beauty problem.
Nope. P(monday and heads)=1/2. P(monday and tails)=1/4. P(tuesday and tails)=1/4. Remember, these have to add to 1.
How come P(monday and heads) and P(monday and tails) are not the same? This is an ordinary unbiased coin, yes?
How come P(monday and tails) and P(tuesday and tails) are not the same? Nothing happens in the interim, yes?
Before she wakes, the probabilities SB would assign if she were conscious are P(monday and heads) = P(monday and tails) = p(tuesday and heads) = p(tuesday and tails) = 1⁄4.
After waking, she would update to p(tuesday and heads) = 0 and P(monday and heads) = P(monday and tails) = p(tuesday and tails) = 1⁄3, since p(wakes up | tuesday and heads) = 0 and p(wakes up | monday and heads) = p(wakes up | monday and tails) = p(wakes up | tuesday and tails) = 1.
Ugh. That makes no sense. Can you explain why she would update in such a manner?
SB starts out with four equally likely possibilities. On observing that she wakes up, she eliminates one of them, but does not distinguish between the remaining possibilities. Renormalizing the probabilities gives probability 1⁄3 to each of the remaining possibilities.
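The update described here is an ordinary eliminate-and-renormalize step, which can be sketched as follows. The uniform prior over the four (day, coin) cells is the assumption made in this comment (and the one neq1 disputes); the dictionary names are illustrative.

```python
from fractions import Fraction

days = ("monday", "tuesday")
coins = ("heads", "tails")

# Assumed prior: four equally likely (day, coin) possibilities.
prior = {(day, coin): Fraction(1, 4) for day in days for coin in coins}

# Likelihood of being awake in each cell: Beauty is never awake
# on Tuesday after heads, and always awake in the other three cells.
likelihood_awake = {
    cell: Fraction(0) if cell == ("tuesday", "heads") else Fraction(1)
    for cell in prior
}

# Bayes' rule: multiply prior by likelihood, then renormalize.
unnormalized = {cell: prior[cell] * likelihood_awake[cell] for cell in prior}
evidence = sum(unnormalized.values())
posterior = {cell: weight / evidence for cell, weight in unnormalized.items()}

for cell, p in posterior.items():
    print(cell, p)
```

The three surviving cells each receive 1/3. The result depends entirely on the uniform four-cell prior, so the sketch shows that the update is internally consistent, not that the prior is the right one.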
I agree, but don’t see how this works as a reply to Phil’s comment.
The coverage on http://en.wikipedia.org/wiki/Sleeping_Beauty_problem seems much less confused than this post.
I disagree. That’s why I quoted from that site and explained where I think the errors are.
Alas, that site is correct—and your whole post is totally wrong.
Except I showed why it’s wrong. I understand both the 1⁄3 and 1⁄2 solutions. I showed where 1⁄3 reasoning fails.
You don’t need a monetary reward for this reasoning to work. It’s a funny ambiguity, I think, in what ‘credence’ means. Intuitively, a well-calibrated person A should assign a probability of P% to X iff X happens on P% of the occasions where A assigned a P% probability to X.
If we accept this, then clearly 1⁄3 is correct. If we run this experiment multiple times and Beauty guessed 1⁄3 for heads, then we’d find heads actually came up 1⁄3 of the times she said “1/3”. Therefore, a well-calibrated Beauty guesses “1/3”.
On the other hand...
Here we’re still left with “occasions”. Should a well-calibrated person be right half of the times they are asked, or about half of the events? If (on many trials) Beauty guesses “tails” every time, then she’s correct 2⁄3 of the times she’s asked. However, she’s correct 1⁄2 of the times that the coin is flipped.
If I ask you for the probability of ‘heads’ on a fair coin, you’ll come up with something like ‘1/2’. If I ask you a million times before flipping, flip once, and it comes up tails, and then ask you once more before flipping, flip once, and it comes up heads, then you should not count that as a million cases of ‘tails’ being the correct answer and one of ‘heads’, even though a guess of ‘tails’ would have made you correct on a million occasions of being asked the question.
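The two ways of scoring a constant "tails" guess can be compared in a short simulation. This is a sketch under the assumption that Beauty guesses tails at every interview; the function name `tails_guess_accuracy` and the fixed seed are illustrative.

```python
import random

def tails_guess_accuracy(n_experiments, seed=0):
    """Return (accuracy per asking, accuracy per coin flip) for an always-tails guess."""
    rng = random.Random(seed)
    correct_askings = 0
    total_askings = 0
    correct_flips = 0
    for _ in range(n_experiments):
        tails = rng.random() < 0.5
        # One interview under heads, two under tails.
        askings = 2 if tails else 1
        total_askings += askings
        if tails:
            correct_askings += askings
            correct_flips += 1
    return correct_askings / total_askings, correct_flips / n_experiments

per_asking, per_flip = tails_guess_accuracy(100_000)
print(round(per_asking, 3), round(per_flip, 3))
```

The per-asking accuracy comes out near 2/3 and the per-flip accuracy near 1/2. Which of the two defines "well-calibrated" is exactly the ambiguity raised in this comment.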
Well, the question was:
“What is your credence now for the proposition that our coin landed heads?”
No mention of “occasions”. Your comment doesn’t seem to be addressing that question, but some other ones, which are not mentioned in the problem description.
This explains why you can defend the “wrong” answer: you are not addressing the original question.
I did not claim that the problem statement used the word “occasions”.
Beauty should answer whatever probability she would answer if she was well-calibrated. So does a well-calibrated Beauty answer ‘1/2’ or ‘1/3’? Does Laplace let her into Heaven or not?
By the way, do you happen to remember the name or location of the article in which Eliezer proposed the idea of being graded for your beliefs (by Laplace or whoever), by something like cross-entropy or K-L divergence, such that if you ever said about something true that it had probability 0, you’d be infinitely wrong?
A Technical Explanation of Technical Explanation
What Nick said. Laplace is also mentioned jokingly in a different context in An Intuitive Explanation of Bayes’ Theorem.
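The grading scheme asked about a few comments up has the property that assigning probability 0 to something true is infinitely bad. A logarithmic scoring rule is one standard scheme with that property; the sketch below assumes that rule for illustration, and `log_score` is a hypothetical helper rather than anything defined in the linked articles.

```python
import math

def log_score(p_true):
    """Logarithmic score: log of the probability assigned to what actually happened."""
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # math.log(0) raises an error, so the limiting value is returned explicitly.
    return math.log(p_true) if p_true > 0 else float("-inf")

print(log_score(1.0))   # 0.0
print(log_score(0.0))   # -inf
```

Scores are never positive, a confident correct answer scores 0, and a probability of 0 on the true outcome scores negative infinity.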
Well, 1⁄3. I thought you were supposed to be defending the plausibility of the “1/2” answer here—not asking others which answer is right.
We know she will have the same credence on monday as she does on tuesday (if awakened), because of the amnesia. There is no reason to double count those. Under the experiment, you should think of there being one occasion under heads and one occasion under tails. From that perspective, a well-calibrated person A will assign 1⁄2 for heads. I think that is the correct way to view this problem. If there was a way for her to distinguish the days, things would be different.
Well, she does say it twice. That seems like at least a potential reason to count it as two answers.
You could say that 1⁄3 of the times the question is asked, the coin came up heads. You could also say that 1⁄2 of the beauties are asked about a coin that came up heads.
To me, this reinforces my doubt that probabilities and beliefs are the same thing.
EDIT: reworded for clarity
Why?
It illustrates fairly clearly how probabilities are defined in terms of the payoff structure (which things will have payoffs assigned to them and which things are considered “the same” for the purposes of assigning payoffs).
I’ve felt for a while that probabilities are more tied to the payoff structure than beliefs, and this discussion underlined that for me. I guess you could say that using beliefs (instead of probabilities) to make decisions is a heuristic that ignores, or at least downplays, the payoff structure.
I agree that probabilities are defined through wagers. I also think beliefs (or really, degrees of belief) are defined through wagers. That’s the way Bayesian epistemologists usually define degree of belief. So I believe X will occur with P = .5 iff a wager on X and a wager on a fair coin flip are equally preferable to me.
That’s fine. I guess I’m just not a Bayesian epistemologist.
If Sleeping Beauty is a Bayesian epistemologist, does that mean she refuses to answer the question as asked?
I’m not sure I have an official position of Bayesian epistemology but I find the problem very confusing until you tell me what the payoff is. One might make an educated guess at the kind of payoff system the experiment designers would have had in mind, as many in this thread have done. (ETA: actually, you probably have to weigh your answer according to your degree of belief in the interpretation you’ve chosen. Which is of course ridiculous. Let’s just include the payoff scheme in the experiment.)
I agree that more information would help the beauty, but I’m more interested in the issue of whether or not the question, as stated, is ill-posed.
One of the Bayesian vs. frequentist examples that I found most interesting was the case of the coin with unknown bias—a Bayesian would say it has 50% chance of coming up heads, but a frequentist would refuse to assign a probability. I was wondering if perhaps this is an analogous case for Bayesians.
That wouldn’t necessarily mean anything is wrong with Bayesianism. Everyone has to draw the line somewhere, and it’s good to know where.
I can understand that, but the fact that a wager has been offered distorts the probabilities under a lot of circumstances.
How do you mean?
I just flipped a coin. Are you willing to offer me a wager on the outcome I have already seen? Yet tradition would suggest you have a degree of belief in the most probable possibilities.
The offering of the wager itself can act as useful information. Some people wager to win.
I see what you mean. Yes, actual, literal, wagers are messier than beliefs. Another example is a bet that the world is going to end: which you should obviously always bet against at any odds even if you believe the last days are upon us. The equivalence between degree of belief and fair betting odds is a more abstract equivalence with an idealized bookie who offers bets on everything, doesn’t take a cut for himself and pays out even if you’re dead.
Actually, I like that metaphor! Let me work this out:
The bookie would see heads and tails with equal probability. However, the bookie would also see twice as many bets when tails comes up. In order to make the vig zero, the bookie should pay out as much as comes in for whichever bet comes up, and that works out to 1:2 on heads and 2:1 on tails! Thus, the bookie sets the probability for Beauty at 1⁄3.
To make an end of the world bet, person A who believes the world is not about to end will give some money to person B who believes the world is about to end. If after an agreed upon time, it is observed that the world has not ended, person B then gives a larger amount of money to person A.
It is harder to recover probabilities from the bets of this form that people are willing to make, because interest rates are a confounding factor.
Bets with money assume fairly constant and universal utility/$ rate. But that can’t be assumed in this case since money isn’t worth nearly as much if the world is about to end.
So you’d have to adjust for that. And of course even if you can figure out a fair wager given this issue it won’t be equivalent to the right degree of belief.
It isn’t that hard, is it? We just find the interest rate on the amount B got to begin with, right?
But if person B is right she only gets to enjoy the money until the world ends. It seems to me that money is less valuable when you can only derive utility from it for a small, finite period of time. You can’t get your money’s worth buying a house, for example. Plus if belief in the end of the world is widespread the economy will get distorted in a bunch of ways (in particular, the best ways to spend money with two weeks left to live would get really expensive) making it really hard to figure out what the fair bet would be.
‘Credence’ is not probability.
It means: “subjective probability”:
“In probability theory, credence means a subjective estimate of probability, as in Bayesian probability.”
http://en.wikipedia.org/wiki/Credence
An estimate of a thing is not the same thing as that thing. And Bayesian probability is probability, not an estimate of probability.
Or—to put it another way—for a Bayesian their estimated probability is the same as their subjective probability.
The concept of “estimated probability” doesn’t make sense (in the way you use it).
? You can certainly estimate a probability—just like Wikipedia says.
Say you have a coin. You might estimate the probability of it coming down heads after a good flip on a flat horizontal surface as being 0.5. If you had more knowledge about the coin, you might then revise your estimate to be 0.497. You can consider your subjective probability to be an estimate of the probability that an expert might use.
You don’t seem to understand the concept of Bayesian probability. Subjective probability is not estimation of “real probability”, there is no “real probability”. When you revise subjective probability, it’s not because you found out how to approximate “real probability” better, it’s because you are following the logic of subjective probability.
Really? Someone who’s been posting around these parts for years, and your best hypothesis is “doesn’t understand Bayesian probability”? How would you rank it compared to “Someone hijacked your LW account” or “I’m not understanding you” or “You said something that would have made sense except for a fairly improbable typo”?
This seems a reasonable hypothesis specifically because it’s Tim Tyler. It would be much less probable for most other old-timers (another salient exception that comes to mind is Phil Goetz, though I don’t remember what he understands about probability in particular).
You seem to have to misattribute the phrase “real probability” to me in order to make this claim. What I actually said was “the probability that an expert might use”.
I recommend you exercise caution with those quote marks when attributing silly positions to me: some people might be misled into thinking you were actually quoting me—rather than attacking some nonsense of your own creation.
A reasonable idea for this and other problems that don’t seem to suffer from ugly asymptotics would be simply to test it mechanically.
That is to say, it may be more efficient, requiring less brain power, to believe the results of repeated simulations. Having gone through the Monty Hall tree and statistics with people who couldn’t really understand either, and who then ended up believing the results of a simulation whose code is straightforward to read, I advocate this method: empirical verification over intuition or mathematics that are fallible (because you yourself are fallible in your understanding, not because they contain a contradiction).
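As an example of the kind of straightforward simulation meant here, the following is a minimal Monty Hall sketch. It assumes the standard rules (the host always opens a goat door that is not the contestant's pick); the function name `monty_hall_win_rate` and the fixed seed are illustrative choices.

```python
import random

def monty_hall_win_rate(switch, n_trials=100_000, seed=0):
    """Estimate the win rate for always switching or always staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a door that is neither the contestant's pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Switch to the one remaining unopened door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / n_trials

print("switch:", round(monty_hall_win_rate(True), 3))
print("stay:  ", round(monty_hall_win_rate(False), 3))
```

Switching wins about two thirds of the time and staying about one third. For Sleeping Beauty the harder step is deciding what to count, since per-awakening and per-experiment tallies of the same simulation give different numbers.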
This is an interesting idea, that appeals to me owing to my earlier angle of attack on intuitions about “subjective anticipation”.
The question then becomes, how would we program a robot to answer the kind of question that was asked of Sleeping Beauty?
This comment suggests one concrete way of operationalizing the term “credence”. It could be a wrong way, but at least it is a concrete suggestion, something I think is lacking in other parts of this discussion. What is our criterion for judging either answer a “wrong” answer? More specifically still, how do we distinguish between a robot correctly programmed to answer this kind of question, and one that is buggy?
As in the robot-and-copying example, I suspect that which of 1⁄2 or 1⁄3 is the “correct” answer in fact depends on what (heretofore implicit) goals, epistemic or instrumental, we decide to program the robot to have.
And I think this is roughly equivalent to the suggestion that the payoff matters.
Depending on what you’re testing and a decent level of maths ability, empirics doesn’t help you here.
For my own benefit, i’ll try to explain my thinking on this problem, in my own words, because the discussions here are making my head spin. Then the rest of you can tell me whether i understand. The following is what i reasoned out before looking at neq1’s explanations.
Firstly, before the experiment begins, i’d expect a 50% chance of heads and a 50% chance of tails. Simple enough.
If it lands on heads, then i wake up only once, on Monday. If it lands on tails, then i wake up once on Monday, and a second time on Tuesday.
So, upon waking with amnesia, i’d expect a 50% chance of it being my first-and-only interview on Monday. I’d expect a 25% chance of it being my first-of-two interviews on Monday, and a 25% chance of it being my second-of-two interviews on Tuesday.
And due to the amnesia, and my having no indication of what day it is, i’d basically have no new information to act on after i wake up. So my probability estimates would remain the same after waking as they were before.
So, upon waking, i’d say:
50% chance that the coin landed on heads, and it’s currently Monday.
25% chance that the coin landed on tails, and it’s currently Monday.
25% chance that the coin landed on tails, and it’s currently Tuesday.
In other words, neq1’s probability-tree picture turned out to most clearly match my own reasoning on the problem. Does this make sense?
This was also my understanding of the problem. Are we missing something?
On awakening, I would give:
p(heads) and p(tails) on Monday should be equal (a fair coin was flipped). p(tails) on Monday and p(tails) on Tuesday should also be equal (nothing important changes in the interim).
Even though you knew ahead of time that there was a 50% chance you’d be on the heads path, and a 50% chance you’d be on the tails path, you’d shift those around without probability law justification?
I also think you are not careful with your wording. What does p(heads) on Monday mean? Is it a joint or conditional probability? p(heads | monday) = p(tails | monday), yes, but Beauty can’t condition on Monday since she doesn’t know the day. If you are talking about joint probabilities, p(heads and monday) does not equal p(tails and monday).
Re: a 50% chance you’d be on the heads path, and a 50% chance you’d be on the tails path.
Those are not the probabilities in advance of the experiment being performed. Once the experimental procedure is known the subjective probabilities for Beauty on awakening are 33% for heads and 67% for tails. These probabilities do not change during the experiment, since Beauty learns nothing.
“Once the experimental procedure is known the subjective probabilities for Beauty on awakening are 33% for heads and 67% for tails.”
Suppose 50% of the population has some asymptomatic form of cancer. We randomly select someone and do a diagnostic test. If they have cancer (we don’t tell them), we wake them up 9 times and ask their credence for cancer (administering amnesia-inducing drug each time). If they don’t have cancer, we wake them up once.
The person selected for this experiment knows there is a 50% chance they have cancer. And they decide ahead of time that, upon awakening, they’ll be 90% sure they have cancer. And this makes sense to you.
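To put numbers on the two quantities this thought experiment contrasts, here is a minimal Python sketch. The function name and the per-awakening counting rule are assumptions made for illustration; the counting rule is the one the 1⁄3 position relies on, and the point of the example above is precisely to question whether it should be read as a credence.

```python
from fractions import Fraction

def cancer_credences(p_cancer=Fraction(1, 2), wakes_if_cancer=9, wakes_if_healthy=1):
    """Return (per-person probability, per-awakening frequency) of cancer."""
    # Expected number of awakenings contributed by each branch.
    cancer_wakes = p_cancer * wakes_if_cancer
    healthy_wakes = (1 - p_cancer) * wakes_if_healthy
    # Share of all awakenings that happen to someone with cancer.
    per_awakening = cancer_wakes / (cancer_wakes + healthy_wakes)
    return p_cancer, per_awakening

per_person, per_awakening = cancer_credences()
print(per_person, per_awakening)  # 1/2 9/10
```

The 50% is a frequency over selected people and the 90% is a frequency over awakenings; the disagreement in this thread is about which of the two "credence" should track.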
Re: “but Beauty can’t condition on Monday since she doesn’t know the day.”
She could make a bet. You do not have to know what day of the week it is in order to make a bet that it is Monday.
Re: “If you are talking about joint probabilities, p(heads and monday) does not equal p(tails and monday).”
Sure it does—if a fair coin was flipped!
Maybe instead of just saying it’s true, you could look at my proof and show me where I made a mistake. I’ve done that with yours.
I think you already clarified that here.
You interpreted:
“What is your credence now for the proposition that our coin landed heads?”
...as being equivalent to a bet along these lines:
“the scenario where at each awakening we offer a bet where she’d lose $1.50 if heads and win $1 if tails, and we tell her that we will only accept whichever bet she made on the final interview.”
...which is a tortured interpretation.
The question says “now”. I think the correct corresponding wager is for Beauty to make a bet which is judged according to its truth value there and then—not for it to be interpreted later and the payout modified or cancelled as a result of other subsequent events.
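The difference between the two wagers can be worked out directly. The sketch below is only an illustration: the function name and the assumption that Beauty accepts the bet at every awakening are mine, and the $1.50 and $1 stakes are the ones quoted above.

```python
from fractions import Fraction

def expected_profit(loss_if_heads, win_if_tails, settle_every_awakening):
    """Expected profit per run of the experiment if Beauty always accepts the bet."""
    half = Fraction(1, 2)
    # Heads: one awakening, so the bet is settled once under either rule.
    heads_profit = -loss_if_heads
    # Tails: two awakenings; the bet pays twice only if every awakening is settled.
    tails_bets = 2 if settle_every_awakening else 1
    tails_profit = tails_bets * win_if_tails
    return half * heads_profit + half * tails_profit

loss, win = Fraction(3, 2), Fraction(1)
print(expected_profit(loss, win, settle_every_awakening=True))   # 1/4
print(expected_profit(loss, win, settle_every_awakening=False))  # -1/4
```

Under these assumptions the same bet is favourable when each awakening is settled on the spot and unfavourable when only the final interview counts. The break-even loss for a $1 win is $2 in the first case (odds matching 1⁄3) and $1 in the second (odds matching 1⁄2).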
Yes, this is correct.
My reasoning was a bit simpler.
Prior to the experiment, the probability of heads was 50%, tails 50%. Upon waking, she learns no new information. She knew in advance she was going to wake up, and they tell her nothing.
So how could her beliefs possibly change?
One of the major take-aways I got from actually reading Jaynes was how he is always careful to write probabilities as conditioned on all prior knowledge: P(A|X) where X is our “background knowledge”.
This is useful in the present case since we can distinguish X, Beauty’s background knowledge about which way a given coin might land, and X’, which represents X plus the description of the experimental setup, including the number of awakenings in each case.
That difference between X and X’ is the new information that Beauty learns and which might make P(heads|X’) different from P(heads|X).
She knew from the start that she is twice as likely to be asked when it is tails. So, her estimate of the chances of her being awakened facing tails should be bigger from the beginning.
Thank you, your explanation for the 1⁄3 answer makes sense to me. I’m still a bit confused about it, but i think i feel like i might be changing my mind.
I’ll try to figure out what would happen if SB makes a bet on the coin flip at each interview. Suppose she guesses heads each time, then:
Given that the result was heads, then she is interviewed once, and she is right once.
Given that the result was tails, then she is interviewed twice, and she is wrong twice.
… meaning that if the experiment is repeated several times, the guess “heads” will be correct for one out of three guesses. Just like you said.
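For anyone who would rather check this empirically, here is a small simulation sketch in Python. The function name, the seed, and the number of runs are arbitrary choices for illustration; it assumes Beauty guesses "heads" at every interview, as described above.

```python
import random

def heads_guess_frequencies(n_experiments, seed=0):
    """Return (fraction of interviews, fraction of experiments) where 'heads' is right."""
    rng = random.Random(seed)
    interviews = 0
    correct_interviews = 0
    heads_experiments = 0
    for _ in range(n_experiments):
        heads = rng.random() < 0.5
        n_interviews = 1 if heads else 2  # heads: Monday only; tails: Monday and Tuesday
        interviews += n_interviews
        if heads:
            correct_interviews += n_interviews
            heads_experiments += 1
    return correct_interviews / interviews, heads_experiments / n_experiments

per_interview, per_experiment = heads_guess_frequencies(100_000)
print(per_interview, per_experiment)
```

The first number should come out close to 1⁄3 and the second close to 1⁄2. The simulation does not say which of the two deserves the name "credence"; it only shows that both frequencies are real features of the setup.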
(Perhaps it’s important to realize that, if the coin lands on tails, then she’s guaranteed to wake up once on Monday, and also guaranteed to wake up once on Tuesday. Now that i read your other comment again, i see your meaning when you say that p(heads) and p(tails) for each day is the same.)
I couldn’t decide exactly what you meant by “twice as likely to be asked (woken) when it’s tails” either. I’m going to guess that you’re averaging evenly over Monday and Tuesday, in which case I agree. After marginalizing over M/T, P(wake|heads)=1/2 and P(wake|tails)=1.
“She knew from the start that she is twice as likely to be asked when it is tails. ”
The probability that she would be asked is 1, regardless of the outcome of the coin. Her estimate of the chances of her being awakened should have been 1.
Yes: her estimate of the chances of her being awakened is indeed 1.
Please insert a section break near the start of this post, so the whole thing doesn’t show up on “NEW”.
Um… why? There are the same number of heads&Monday as tails&Monday; why would heads&Monday be more likely?
The smoke and mirrors with that solution is that the hypothetical repeated sampling is done in the wrong way. Think about one single awakening, which is when the question is asked. If you want to think about doing 1000 replications of the experiment, it should go like this: coin is flipped. if heads, it’s monday. if tails, it’s monday with prob .5 and tuesday with prob .5. repeat 1000 times. We’d expect 500 heads&monday, 250 tails&monday, 250 tails&tuesday. It should add up to 1000, which it does. If you do 1000 repeated trials and get more than 1000 outcomes, something is wrong. It’s a very subtle issue here. (see my probability tree)
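The two ways of counting can be set side by side. The following Python sketch is illustrative only; the function name and the `count_every_awakening` flag are assumptions introduced here. With the flag off, each trial contributes exactly one awakening, sampled as described in the comment above; with the flag on, every awakening in every run is tallied, as in the Wikipedia passage.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def expected_counts(n_trials, count_every_awakening):
    """Expected tallies of (coin, day) outcomes over n_trials coin flips."""
    # Heads always yields exactly one awakening, on Monday.
    counts = {("heads", "monday"): n_trials * HALF}
    if count_every_awakening:
        # Tails yields a Monday and a Tuesday awakening, and both are counted.
        weight = Fraction(1)
    else:
        # Tails yields one sampled awakening: Monday or Tuesday, each with prob 1/2.
        weight = HALF
    counts[("tails", "monday")] = n_trials * HALF * weight
    counts[("tails", "tuesday")] = n_trials * HALF * weight
    return counts

for every in (False, True):
    counts = expected_counts(1000, every)
    print([int(v) for v in counts.values()], int(sum(counts.values())))
# [500, 250, 250] 1000
# [500, 500, 500] 1500
```

Both tallies are arithmetically fine; they answer different sampling questions, which is where the disagreement lies.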
Another way to look at it: Beauty knows there’s a 50% chance she’s somewhere along the heads awakening sequence (which happens to be a sequence of 1 day) and a 50% chance she’s somewhere along the tails awakening sequence (which is 2 days in the sleeping beauty problem or 1,000,000 days in the extreme problem). Once she’s along one of these paths, she can’t distinguish. So prior=posterior here.
I make it: 500 heads & Monday … 500 tails & Monday … 500 tails & Tuesday.
You are arguing with http://en.wikipedia.org/wiki/Sleeping_Beauty_problem about the problem—and are making math errors in the process.
Interesting. You want to replicate an awakening 1000 times, and you end up with 1500 awakenings. I’d be concerned about that if I were you.
In 1000 replications of the experiment, there will be an average of 1500 awakenings: 1000 on Monday, and 500 on Tuesday.
“Suppose this experiment were repeated 1,000 times. We would expect to get 500 heads and 500 tails. So Beauty would be awoken 500 times after heads on Monday, 500 times after tails on Monday, and 500 times after tails on Tuesday.”
http://en.wikipedia.org/wiki/Sleeping_Beauty_problem
What is it about this that you are not getting?
Complete replications of the entire experiment are not the right approach, because the outcome of interest occurs at a single awakening. We need 1000 replications of the process that leads to an awakening.
What you said further up this branch of the thread was:
“if you want to think about doing 1000 replications of the experiment, it should go like this”.
Now you seem to be trying to shift the context retrospectively—now that you have found out that all the answers you gave to this were wrong.
You know that’s not true. I didn’t just discover the ‘500 500 500’ answer; I quoted it from wikipedia and showed why it was wrong.
I should have made it clear what I meant by experiment, but you know what I meant now, so why take it as an opportunity to insult?
I don’t know what you mean by “experiment”.
“The process that leads to an awakening” refers not to one physical process, but potentially to multiple partly-overlapping physical processes per actual physical experiment.
You mean to run the physical experiment around 666 times, resulting in 1000 awakenings in total: around 667 on Monday, and around 333 on Tuesday? Rather obviously that doesn’t support your maths either.
I have yet to find a sum that gives 500:250:250 as originally claimed. There is no 250 involved. Your supplied “probability tree” image is just nonsense—a wrong analysis of the problem, irrespective of what bet you think the question corresponds to.
I don’t think it is accurate to describe my post as “insulting”.
So, I’m still working on this in my plodding, newbie-at-probability-math fashion.
What I took away from my exchanges with AlephNeil is that I get the clearest picture if I think in terms of a joint probability distribution, and attempt to justify mathematically each step of my building the table, as well as the operations of conditioning and marginalizing.
In the original Sleeping Beauty problem, we have three variables: x is how the coin came up {heads, tails}, y is the day of the week {monday, tuesday}, and z is whether I am asked for my credence (i.e. woken) {wake, sleep}.
P(x,y,z)=P(x)P(y|x)P(z|x,y) and unlike in the “revival” case x and y aren’t clearly independent. In fact the answer very much seems to hinge on what we take the probability of it being Tuesday, given that the coin came up heads, to be.
The relevant possible outcomes are: (H,M,W) (H,T,W) (T,M,W) (T,T,W) (H,M,S) (H,T,S) (T,M,S) (T,T,S) - eight in all.
Conditioning on z=W consists of deleting the part of the table that has z=S, summing up all the remaining values, and renormalizing by dividing every cell in the table by the total.
The rules for filling the table are: the values must add up to 1; the “heads” and “tails” branches must receive equal probability mass from P(x); and P(z|x,y) must reflect the experimental rules. So we must have the following:
P(H,M,W) - see below
P(H,T,W)=0
P(T,M,W)=1/4
P(T,T,W)=1/4
P(H,M,S)=0
P(H,T,S) - see below
P(T,M,S)=0
P(T,T,S)=0
The ambiguity seems to arise in allocating probability mass to the outcomes: “the coin comes up heads; it is Monday; I get woken up”, and “the coin comes up heads; it is Tuesday; I do not get woken up”. That is, I’m not sure what the correct conditional distribution P(y|x) should be.
The 1⁄2 answer corresponds to allocating all of the available 1⁄2 probability mass to the first of these outcomes in the joint table, saying P(y=M|x=H)=1 and P(y=T|x=H)=0. Or verbally, “it’s certain that I get woken up on Monday if the coin comes up heads, and after that the experiment is over”. The “not woken up” half of the table receives no probability mass at all.
The 1⁄3 answer corresponds to distributing that probability mass among the two outcomes, saying P(y=M|x)=P(y=T|x)=1/2. Verbally: “however the coin comes up, it could be either Monday or Tuesday”. Here 1⁄4 of the total probability mass is in the “not woken up” half of the table and gets deleted when we condition on being woken.
(ETA: Where does the amnesia appear in this formalization? It doesn’t, but neither does it need to. Its only practical consequence is to outlaw conditioning on the day, so working out the distribution P(x|z) conforms to the amnesia.)
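The two allocations can be checked mechanically. Here is a minimal Python sketch of the joint table and the conditioning step; the function name and the single parameter (the value chosen for P(y=M|x=H)) are my own framing, and cells that are zero under both allocations are simply left out of the table.

```python
from fractions import Fraction

def p_heads_given_awake(p_monday_given_heads):
    """P(x=H | z=W) for a chosen value of P(y=M | x=H)."""
    half = Fraction(1, 2)
    p_tuesday_given_heads = 1 - p_monday_given_heads
    # Joint table P(x, y, z) = P(x) P(y|x) P(z|x,y), keyed by (coin, day, state).
    joint = {
        ("H", "Mon", "W"): half * p_monday_given_heads,   # heads: woken on Monday
        ("H", "Tue", "S"): half * p_tuesday_given_heads,  # heads: asleep on Tuesday
        ("T", "Mon", "W"): half * half,                   # tails: woken on Monday
        ("T", "Tue", "W"): half * half,                   # tails: woken on Tuesday
    }
    assert sum(joint.values()) == 1
    # Condition on z=W: drop the "asleep" cells and renormalize.
    awake = {k: v for k, v in joint.items() if k[2] == "W"}
    total = sum(awake.values())
    return sum(v for k, v in awake.items() if k[0] == "H") / total

print(p_heads_given_awake(Fraction(1)))     # 1/2
print(p_heads_given_awake(Fraction(1, 2)))  # 1/3
```

So the whole dispute, in this formalization, reduces to the single number P(y=M|x=H).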
OK, this seems quite helpful.
I think the question we now have to ask to resolve the remaining confusion is—what, exactly, is it that Beauty is uncertain about, and at what time?
The variables we are considering only seem to make sense if Beauty has woken up as part of the experiment. That is, assuming x means “the coin came up heads or tails”, y means “it is Monday or Tuesday”, and z means “I am awake or asleep”; i.e., we’re dealing with uncertainty about facts that are already fixed, just unknown. Then these do not make sense outside that context.
Using that interpretation, then, and sticking to that context, we get the answer of 1⁄2, as if Beauty has just been woken up, she cannot allocate any probability mass to the possibility that she is asleep.
What other interpretations could there be? Perhaps the coin has not yet been flipped, and x is “the coin will come up heads (tails)”, y is “it will be Monday (Tuesday) when I wake up”, z is “I will be awake (asleep) when I wake up” (!). Of course, if the coin has not yet been flipped, I think we can agree 1⁄2 has to be the right answer. (Which then leads to the argument that it has to be 1⁄2 as she hasn’t gained any information, but I guess that’s been gone over before.) But the problem is that this y doesn’t seem well-defined, as she might be woken up more than once. (Hm, this is sounding familiar as well...) We could perhaps introduce separate variables for being woken up on each day; from the pre-flip point of view, that makes more sense. But it still gets you an answer of 1⁄2.
This is all I can come up with; I’m not seeing what other interpretations there could be. Could someone explain just what ‘x’, ‘y’, and ‘z’ correspond to—if they do correspond to anything well-defined rather than having to be thrown out—in the interpretations that get you 1/3? I don’t see any way for the probabilities to represent her uncertainty at the time of waking, while still having her assign nonzero probability to the possibility that she’s asleep.
“At what time” doesn’t matter in this formalism. You can be uncertain about future events or about past events, all that matters is how you update your uncertainty upon receiving new information.
So a triplet (x,y,z) represents, in the abstract, a conceivable configuration of the component uncertainties in the experimental setup. The coin could have come up heads or tails; it could be Monday or Tuesday; Beauty can be woken up on that day, or left asleep.
The joint probability P(x,y,z) is the plausibility we assign—in a timeless manner—to the corresponding propositions. Strictly speaking, it should be P(x,y,z|B) where B is our background information about the experiment: the rules, the fact that the coin is unbiased (or not known to be biased), and so on.
Our background information directs how we allocate probability mass to the various points in the sample space: P(T,T,S) corresponds to “the coin comes up tails, the day is Tuesday, Beauty is asleep”. The rules of the experiment require that this be zero.
On the other hand, P(H,T,S) corresponds to “the coin comes up heads, the day is Tuesday, Beauty is asleep”, and this can be non-zero.
When you learn (“condition on”) some new information, the probability distribution is altered: you only keep the points which correspond to this particular variable having the value(s) you learned, and you renormalize so that the total probability is 1. So, on learning “heads” you keep only the points having x=H. On learning what day it is you keep only the points having that value for y.
When Beauty wakes up, she learns the value of z, so she can condition on z. That means she throws away the part of the joint distribution where she was supposed to be asleep. If that part of the joint distribution did contain some probability mass (as I’ve argued above it can), then that can make P(x|z=W) something other than 1⁄2.
Hm. Should “S” be representing “Beauty is asleep or the experiment is over”? Seeing as how the experiment ends after one day if heads comes up. But then, we can just modify the problem to say she’s put back to sleep for the rest of Tuesday in the case of heads; that shouldn’t change anything.
It seems to me that if we make the experiment last three days instead of two, that ambiguity goes away: then it becomes clear that Beauty must assign non-zero probability mass to (H,T,S). (Or does it?)
However, that means I’d have to change my mind once again, and decide that the correct answer is in fact 1⁄3.
Here is a Google spreadsheet showing my reasoning. Any feedback welcome.
Can you explain what the three day version means in English, I’m having a little trouble parsing the spreadsheet.
See here and its grandparent.
The three day version goes: “Beauty is explained the rules on Sunday and put to sleep, then a coin is flipped. If it comes up heads, Beauty is awakened on Monday and sleeps through Tuesday and Wednesday. If it comes up tails, Beauty is awakened on Monday, Tuesday and Wednesday. On all awakenings (with the previous day’s memories erased by the sleeping drug) she is asked for her credence in Heads.”
This differs from the original which says “the experiment ends on Monday if the coin comes up heads”. But Beauty would have the same uncertainty if you decided, in the original version, to wake Beauty on Tuesday in the event of heads, rather than Monday.
BTW the Google spreadsheet has a chat area, if you’d like to discuss this live.
Variation Alpha:
10 people. If heads, one of the ten is randomly selected to be revived. If tails, all ten are revived. (If you like, suppose that the ten are revived one at a time on consecutive days—but it doesn’t make any difference.)
Variation Beta:
Same as Alpha except the 10 people are clones of yours, with mental state identical to your own.
Variation Gamma:
Same as Beta except the cloning is done after you fall asleep.
Variation Delta:
Same as Gamma except that the clones are not created all at once. Rather, successive clones are created on subsequent days by erasing one day’s worth of memory of the previous clone.
It seems clear to me that in variation Alpha, 1⁄11 is the answer and not 1⁄2. And clearly variation Delta is isomorphic to the Sleeping Beauty problem (except with 10 days rather than 2). And clearly each step from Alpha to Delta doesn’t change anything essential.
Right?
Nice way of formulating the problem.
In variation Alpha we know beforehand of a particular event that will happen with P=1 if tails and P=1/10 if heads. Call this event “Jack wakes up and thinks a thought”. So when we see that event we can conclude 1⁄11.
But in Beta and the remaining variations there is no such event. A clone can’t tell which clone it is. Going into the experiment, my anticipated experience does not differ based on whether or not heads comes up. Either one of me will be woken up or 10 identical copies who don’t know about each other will be woken up. “Jack wakes up and thinks a thought” happens at the same probability for heads and tails. At no point does any copy of me get new information to revise from 1⁄2.
What is it that makes that clear to you?
Your variation Alpha strikes me as somewhat under-specified. Here is how I’m tempted to fill in:
It seems to me that if the patient has no other relevant information (such as how many patients were revived), their answer ought to be 1⁄2, no matter how many revivals occur on tails. This looks a lot more like Stuart Armstrong’s “proof of the SIA” than like SB, though, so I might have to reread that post.
The background information X’=(coin flip, revival with questionnaire) is different from the background information X=(coin flip), but not necessarily enough to alter the answer to the question—unless for some reason each patient is interested in maximizing the number of patients who would get the right answer if they were asked straight out how the coin came up. (Which is how some participants in the discussion have interpreted “credence”, I now believe. Under some assumptions, such as having a payout involved, e.g. getting a candy bar for calling the coin correctly, this is even a legitimate interpretation.)
If you take “credence” to mean “your prior, updated with whatever information you’ve gained that has bearing on how the coin might have come up”, and your prior for the coin is the 50⁄50 distribution, then it seems to me that you have nothing to update on, and that the answer is still 1⁄2.
Your filling in is not quite what I had in mind: When I said “one is randomly selected to be revived” I meant to imply “none of the others are revived”.
Also, you may suppose that before entering hibernation, each patient knows that there’s going to be a coin flip and what will happen in each case.
Deducing 1⁄11 is now just a matter of applying Bayes’ theorem. This may be easier to comprehend if we introduce:
Variation Alpha’:
Same as Variation Alpha except that one of the 10 people is (secretly) designated beforehand to be revived in the event of heads.
How do the variations you suggest make a difference? Do you agree with my conclusions in my own variant?
Well, as I’m sure you’ve guessed my aim is to present the “1/2”-er with a ‘smooth spectrum’ of scenarios beginning with something that’s obviously 1⁄3 (or in this case 1⁄11) and ending with something isomorphic to the Sleeping Beauty puzzle, and challenging them to say where along this spectrum the “1/3”-er’s argument breaks down.
In the case of Variation Morendil… hmm, I think the Bayesian reasoning for Variation Alpha goes through just the same, and the answer is 1⁄11. Doesn’t it? (Does it make a difference if the patients know about the scenario beforehand, rather than being told about it only in the questionnaire? I don’t think so. So pretend they are told beforehand...)
Effectively, either variant comes down to being told: “A fair coin has been flipped, and depending on the result of that flip you are either one of a group of 10 people or a lone subject, what credence do you have in being on the small-group branch?”
It doesn’t seem obvious to me why, in such a situation, I should answer other than 1⁄2, so I’m still interested in what makes it obvious to you.
OK, well let’s start with Variation Alpha’. Consider that there are 20 equally likely possibilities, which we can label (x, y) where x belongs to {heads, tails} and y belongs to {1, …, 10}. Being in possibility (x, y) means “x is the result of the coin toss and y denotes the person we selected beforehand to be revived in the event of heads.”
Suppose that (like Patrick McGoohan) you are number 6. Then out of the 20 possibilities, there are 11 in which you are revived, namely (heads, 6) and (tails, 1) to (tails, 10). Therefore, applying Bayes’ theorem, given that you are revived, the probability of heads is 1⁄11.
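The count is easy to reproduce by brute force. This Python sketch enumerates the (x, y) possibilities described above; the function name and its parameters are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def count_alpha_prime(n_people=10, me=6):
    """Return (number of possibilities, number where I am revived, P(heads | revived))."""
    # Each (x, y) pair is equally likely: x is the coin, y the pre-designated person.
    outcomes = list(product(("heads", "tails"), range(1, n_people + 1)))
    # I am revived on tails regardless of y, and on heads only if y == me.
    revived = [(x, y) for x, y in outcomes if x == "tails" or y == me]
    heads_and_revived = [(x, y) for x, y in revived if x == "heads"]
    return len(outcomes), len(revived), Fraction(len(heads_and_revived), len(revived))

print(count_alpha_prime())  # (20, 11, Fraction(1, 11))
```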
OK. I have a quibble with your formalization but I get a similar result when working it out formally: if my background information consists of the Alpha procedure, then updating on being revived does give me 1⁄11.
The quibble is that I only know, algebraically, to condition on something that is a variable, so to work out the joint probability distribution at issue I had to introduce the variable z, with values {revived, not revived}. The triplet (H,3,NR) codes for “the coin comes up heads, person 3 gets picked to be revived in the event of heads, and I don’t get revived”. (Clearly this entails that I’m not person 3.)
The joint probability distribution P(x,y,z) factors out, per the product rule, into P(x)P(y)P(z|x,y) since x and y are independent.
Let’s use N=3 for the number of subjects involved, as I want to write out the full joint distribution (in case someone disagrees with that step) and N=10 makes it tedious. Arbitrarily I consider things from the perspective of Two.
(H,1,R)=0
(H,2,R)=1/6
(H,3,R)=0
(H,1,NR)=1/6
(H,2,NR)=0
(H,3,NR)=1/6
(T,1,R)=1/6
(T,2,R)=1/6
(T,3,R)=1/6
(T,1,NR)=0
(T,2,NR)=0
(T,3,NR)=0
This seems to check out: the marginal distribution for x is the expected 50⁄50, the marginal distribution for y is uniform, it all sums up to 1, it reproduces the setup as described. The conditional distribution P(x,y|z=R) is then:
(H,1)=0
(H,2)=1/4
(H,3)=0
(T,1)=1/4
(T,2)=1/4
(T,3)=1/4
Resulting in P(H|z=R)=1/4.
So I agree here that “I have been revived” is proper to update on, and yields 1/(N+1) credence for the coin having come up heads. (It wasn’t obvious to me to start out, and I still don’t rule out having made a mistake somewhere.)
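As a check on the table above and on the 1/(N+1) pattern, here is a Python sketch that builds the full joint distribution for any number of subjects. The function names are hypothetical; the construction follows the factorization P(x)P(y)P(z|x,y) used above.

```python
from fractions import Fraction

def alpha_joint(n, me):
    """Joint P(x, y, z) from the perspective of subject `me` among n subjects."""
    cell = Fraction(1, 2 * n)  # P(x) * P(y) = 1/2 * 1/n
    joint = {}
    for x in ("H", "T"):
        for y in range(1, n + 1):
            # On tails everyone is revived; on heads only the designated subject y.
            revived = (x == "T") or (y == me)
            joint[(x, y, "R")] = cell if revived else Fraction(0)
            joint[(x, y, "NR")] = Fraction(0) if revived else cell
    return joint

def p_heads_given_revived_from_table(n, me):
    joint = alpha_joint(n, me)
    revived_mass = sum(p for (x, y, z), p in joint.items() if z == "R")
    heads_mass = sum(p for (x, y, z), p in joint.items() if z == "R" and x == "H")
    return heads_mass / revived_mass

print(p_heads_given_revived_from_table(3, me=2))   # 1/4
print(p_heads_given_revived_from_table(10, me=6))  # 1/11
```

The table sums to 1 and has a 50⁄50 marginal for x by construction, so the only thing doing any work is the conditioning on z=R.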
I can see how this works out as equivalent to the variant I described, with z meaning “got the questionnaire” and y meaning “the label of the person picked to receive the questionnaire in the event of heads”. It shouldn’t matter, either, when we learn about the procedure.
Variations Beta and Gamma don’t seem to introduce anything that should matter, because nothing in the original formulation hinges crucially on particular differences in the memories of the N people involved.
I’m not quite sure what Delta means. My interpretation of Delta would be:
The triplet (H,3,NR) codes for… um… “the coin came up heads, day 3 was picked to awaken the original me in the event of tails, I (someone other than the person to be awakened in the case of heads) was not revived”. Best I can do.
Something seems to have gone awry somewhere: Delta is not formally equivalent to the previous formulations.
Also, any interpretation of Delta has a big difference with Sleeping Beauty: it ends up with N distinct clones of me, whereas SB ends up with a single Beauty.
My description of Delta wasn’t great, to be fair. So I’ll clarify (and change it slightly) like this:
If (x, y) where x is in {H, T} and y is in {1,2,3} then:
If H then you are not cloned and wake up on day y. If T then a clone of you is created just before the beginning of day 1. Either you or the clone (doesn’t matter which) is woken for day 1 while the other is kept in storage. Then the one that was kept in storage is cloned just before the beginning of day 2. Etc.
The idea of moving from Gamma to (my new) Delta is “it shouldn’t matter whether the clones are created right away (and possibly never used) or ‘just in time’”.
Anyway, the following idea has occurred to me, for defending 1⁄3 as the answer to the original Sleeping Beauty problem: Imagine that there is a clock on the wall and that on any day when SB is woken, the time of day of her awakening is chosen randomly (from a uniform distribution). Then the information that SB gets on awakening is not simply “I was awakened at least once” but “I was awakened at least once at time x”...
...and I’ll leave you guys to do the calculation, but you get 1⁄3, not 1⁄2.
We still have the same problem: there is no value of z that corresponds to “I am a non-special member of the initial set of N people, and I happen to get unlucky and not be revived”. That makes Delta not equivalent to the other variants. It does very much matter whether “not revived” is subjectively possible!
It feels as if this might be the same point that neq1 made earlier in answer to one of the defenses of 1⁄3, so I’d urge you to press on with the formalization and calculation.
My take-away from the discussion (and the two occasions where I changed my mind so far) is that it confirms intuitions aren’t reliable and need to be backed by detailed formalization.
The calculation is a little bit awkward because seemingly one has to condition on an event of zero probability (which entails division by zero). But we can proceed as follows:
Suppose the number of moments in a day is finite but ‘very large’, call it N.
Let’s list all of the possible outcomes:
If x = heads then SB is woken on Monday, and there are now N equally likely possibilities for when this will be.
If x = tails then SB is woken on Monday and again on Tuesday. There are N^2 equally likely possibilities for the two waking times.
Suppose SB wakes at time t0. Then she can reason thusly: If the coin toss was heads, then the probability of me seeing a clock show t0 was 1/N. Or if the coin toss was tails: Out of the N^2 possibilities, there are N where I see t0 on Monday and N where I see t0 on Tuesday, but I’ve double counted the case where I see t0 on both Monday and Tuesday, so in fact there are 2N-1 equally likely ways this could have happened. Note that (2N-1)/N^2 is roughly equal to 2/N.
So let H be the event “coin is heads” and let T0 be the event “SB sees clock pointing to t0”.
We have: P(T0 | H) = 1/N and P(T0 | ~H) = about 2/N
From Bayes’ theorem: P(H | T0) / P(~H | T0) = (P(H) / P(~H)) × (P(T0 | H) / P(T0 | ~H)) = ((1/2) / (1/2)) × ((1/N) / (2/N)) = 1 × 1⁄2 = 1⁄2 (roughly)
So the posterior probabilities for H and ~H must be (about) 1⁄3 and 2⁄3 respectively.
The posterior probabilities converge to 1⁄3, 2⁄3 as N goes to infinity.
(Note: The reason for the discrepancy (i.e. the fact that P(H | T0) is not exactly 1⁄3) is that SB’s reasoning about ‘double-counting’ the instance when she is woken at t0 both times is actually invalid, and this possibility ought to be double counted. But the entire dispute centers around showing why it has to work this way in the case N = 1, so I think I’m entitled to pretend that the anti-double-counting argument is valid in order to show the contrary.)
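For small N the calculation can be done exactly by enumeration. The Python sketch below is an illustration under the assumptions stated in the argument: waking times are uniform and independent, and the event conditioned on is "SB sees t0 at least once" (that is, with the anti-double-counting rule applied). The function name and the choice t0=0 are arbitrary.

```python
from fractions import Fraction
from itertools import product

def p_heads_given_clock(n_moments, t0=0):
    """Exact P(H | T0) when a day has n_moments equally likely waking times."""
    times = range(n_moments)
    half = Fraction(1, 2)
    # Heads: one waking time on Monday; T0 happens if it equals t0.
    p_t0_heads = Fraction(sum(1 for m in times if m == t0), n_moments)
    # Tails: independent waking times (Monday, Tuesday); T0 happens if either equals t0.
    hits = sum(1 for m, t in product(times, times) if m == t0 or t == t0)
    p_t0_tails = Fraction(hits, n_moments ** 2)
    return half * p_t0_heads / (half * p_t0_heads + half * p_t0_tails)

for n in (1, 2, 10, 1000):
    print(n, p_heads_given_clock(n))
```

The exact value works out to N/(3N-1), which is 1⁄2 at N = 1 and tends to 1⁄3 as N grows. That matches the remark in the note: with this way of counting, the N = 1 case gives the 1⁄2 answer.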
I can reformulate the argument above much more straightforwardly:
Consider the original Sleeping Beauty problem.
Suppose we fix a pair of symbols {alpha, beta} and say that with probability 1⁄2, alpha = “Monday” and beta = “Tuesday”, and with probability 1⁄2 alpha = “Tuesday” and beta = “Monday”. (These events are independent of the ‘coin toss’ described in the original problem.)
Sleeping Beauty doesn’t know which symbol corresponds to which day. Whenever she is woken, she is shown the symbol corresponding to which day it is. Suppose she sees alpha; then she can reason as follows:
If the coin was heads then my probability of being woken on day alpha was 1⁄2. If the coin was tails then my probability of being woken on day alpha was 1. I know that I have been woken on day alpha (and this is my only new information). Therefore, by Bayes’ theorem, the probability that the coin was heads is 1⁄3.
(And then the final step in the argument is to say “of course it couldn’t possibly make any difference whether an ‘alpha or beta’ symbol was visible in the room.”)
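The Bayes step in this version can be spelled out by listing the four equally likely worlds. This Python sketch is only an illustration; the function name is hypothetical, and it takes the evidence to be "I am woken on the day labelled alpha at some point", which is the reading the argument uses and exactly the step a 1⁄2-er might contest.

```python
from fractions import Fraction
from itertools import product

def p_heads_given_sees_alpha():
    # Four equally likely worlds: coin result crossed with which day carries the label alpha.
    worlds = list(product(("heads", "tails"), ("Monday", "Tuesday")))

    def sees_alpha(coin, alpha_day):
        # Heads: woken on Monday only. Tails: woken on both days.
        waking_days = ("Monday",) if coin == "heads" else ("Monday", "Tuesday")
        return alpha_day in waking_days

    consistent = [w for w in worlds if sees_alpha(*w)]
    heads = [w for w in consistent if w[0] == "heads"]
    return Fraction(len(heads), len(consistent))

print(p_heads_given_sees_alpha())  # 1/3
```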
Now, over the course of these debates I’ve gradually become more convinced that those arguing that the standard, intuitive notion of probability becomes ambiguous in cases like this are correct, so that the problem has no definitive solution. This makes me a little suspicious of the argument above—surely the 1/2-er should be able to write something equally “rigorous”.
Sorry, I meant to say I’d urge you to press on with the formalization and calculation in your interpretation of the Delta case.
I’ll punt on the wall-clock idea. I’m not planning to spend any time working out the formalization for anything that involves large numbers of values for any given variable—my skills aren’t up to doing that confidently, and we seem to have enough to go on with formulations of the problem that only involve smaller sets.
OK but intuitively it can’t make any difference whether SB is woken at a fixed or a random time of day, and it can’t make any difference whether there is a clock on the wall.
So the solution to the ‘random-waking, clock on wall variation’ must be the same as the solution of the original SB problem.
See this for a crisp, simple formalization which appears to show where the ambiguity between 1⁄2 and 1⁄3 comes from.
If you are the person that was selected beforehand to be revived in the event of heads, then I agree with 1⁄11. Unfortunately, in variation beta we lose the ability to label someone ahead of time. This changes things.
No it doesn’t. Your clones are subjectively indistinguishable from you, but they’re all in different places at least. Perhaps they’re in rooms labelled 1-10, but not allowed to go outside and look at the number. So the experimenters can toss a D10 and randomly choose a subject without breaking the ‘clone condition’.
Variation Alpha is unclear, as worded. Let’s say one of the 10 people is Sleeping Beauty, and the other people have different names. Sleeping Beauty was identified ahead of time, and she knows it. If she is not selected, then no one is interviewed. Then, if she is revived, she should think it was heads with probability 10⁄11.
But… if we will interview everyone who is revived, and no one was labeled as special ahead of time, then all each person that was interviewed knows is that at least one person was revived, which was a probability 1 event under heads and tails.
This is just the self-indication assumption situation.
Consider an example. Suppose we want to know if it’s common for people to get struck by lightning. We could choose one person ahead of time to monitor. If they get struck by lightning in the next, say, year, then it’s likely that getting struck by lightning is common. But… if instead everyone is monitored, but we are only told about one person who was struck by lightning (there could be others, we don’t know), then we have no information about whether getting struck by lightning is common or not.
Variation Alpha is intended in such a way that, from the perspective of the experimenters, none of the ten subjects is ‘special’.
See here for why 1⁄11 is the correct posterior probability for heads.
I have a question for those more familiar with the discussions surrounding this problem: is there anything really relevant about the sleeping/waking/amnesia story here? What if instead the experimenter just went out and asked the next random passerby on the street each time?
It seems to me that the problem could be formulated less confusingly that way. Am I missing something?
Yes, I think that there is something very important about the memory loss/waking.
Suppose we perform the really extreme sleeping beauty problem but each interview is with a different person, chosen at random from a very large pool.
I don’t think many people here would take the “1/2” approach; they would reason “Since if the die came up 1, there would be 400 interviews, and I am being interviewed, it almost certainly came up 1”.
I’m not sure I understand your “really extreme” formulation fully. Is the amnesia supposed to make the wins in chocolate bars non-cumulative?
I’m confused about how that’s supposed to have the same relevant features, so the answer to your question is probably “Yes”.
Are you suggesting the following?: Flip a coin. Go out and ask a random passerby what the probability is that the coin came up heads.
If so, you’ve entirely eliminated Beauty’s subjective uncertainty about whether she’s been woken up once or more than once, which is putatively relevant to subjective probability.
The exact equivalent of the original problem would be as follows. You announce that:
(1) You’re about to flip a coin at some secret time during the next few days, and the result will be posted publicly in (say) a week.
(2) Before the flip, you’ll approach a random person in the street and ask about their expectation about the result that’s about to be posted. After the flip, if and only if it lands tails, you’ll do the same with one additional person before the result is announced publicly. The persons are unaware of each other, and have no way to determine if they’re being asked before or after the actual toss.
So, does anyone see relevant differences between this problem and the original one?
I’m guessing you already understood this, but as a person accosted and informed of this procedure, I know it’s more likely that I heard about it because the result was tails (than I was to hear about it before the toss). Those experiments that resulted in heads, I (most likely) never got to hear about.
So in asking if there’s any relevant thing that’s different, you expect a halfer to come forth and explain himself. Unfortunately, I’m not one. But it does seem to me that the only possible important difference is that Beauty knows about the experiment before the coin is tossed; but perhaps the amnesia compensates exactly for that.
As far as your “an instrument sold by a (so far completely ignorant) third party that pays off $100 if the announced result is tails”, then of course Beauty would value it exactly as your interviewees, provided she knew that the offer was to be made at every interview.
Well you also have to note in the problem description that a particular person is asked, and ask what should their guess be (so far you just got as far as the announcement).
But I think that’s equivalent.
Well, yes, I should also specify that you’ll actually act on the announcement.
But in any case, would anyone find anything strange or counterintuitive about this less exotic formulation, which could be readily tried in the real world? As soon as the somewhat vague “expectation about the result” is stated clearly, the answer should be clear. In particular, if we ignore risk aversion and discount rate, each interviewee should be willing to pay, on the spot, up to $66.66 for an instrument sold by a (so far completely ignorant) third party that pays off $100 if the announced result is tails.
If the coin is tails, you would ask two random passersby.
Aha. In that case, I’d say it’s analogous, but I might just be granting that since the correct answer there is 1⁄3 as well. Or are there folks that would answer 1⁄2 to this scenario?
Yes, the answer is 1⁄3, because I am more likely to be asked if it was tails. But in the original problem, I am not more likely to be asked, I am just asked more often, so there is no analogy.
I agree with the others about worrying about the decision theory before talking about probability theory that includes indexical uncertainty, but separately I think there’s an issue with your calculation.
“P(Beauty woken up at least once| heads)=P(Beauty woken up at least once | tails)=1”
Consider the case where a biased quantum coin is flipped and the people in ‘heads’ branches are awoken in green rooms while the ‘tails’ branches are awoken in red rooms.
Upon awakening, you should figure that the coin was probably biased to put you there. However, P(at least one version of you seeing this color room | heads) = P(at least one version of you seeing this color room | tails) = 1. The problem is that “at least 1” throws away information. P(I see this color | heads) != P(I see this color | tails). The fact that you’re there can be evidence that the ‘measure’ is bigger. The problem lies with this ‘measure’ thing, and seeing what counts for what kinds of decision problems.
The blue eyes problem is similar. Everyone knows that someone has blue eyes, and everyone knows that everyone knows that someone has blue eyes, yet “they gained no new information because he only told them that at least one person has blue eyes!” doesn’t hold.
That “you are there” is evidence that the set of possible worlds consistent with your observations doesn’t include the worlds that don’t contain you, under the standard possible worlds sample space. Probabilistic measure is fixed in a model from the start and doesn’t depend on which events you’ve observed, only used to determine the measure of events. Also, you might care about what happens in the possible worlds that don’t contain you at all.
But the amount of quantum measure in each color room depends on which biased coin was flipped, and your knowledge of the quantum measure can change based on the outcome.
Just as a better intuition pump, we can imagine the “really extreme” sleeping beauty problem.
Omega rolls a d20
If it comes up a “1”, then beauty is woken 400 times, otherwise she is woken once only. Questions:
Should Beauty bet her $100K house in a treble-or-nothing bet against Omega that the d20 came up “1”? (She will have to go and live in the house again after the experiment is over.) (Treble or nothing means that if the die did come up “1”, then she gets 3 times her stake; otherwise she loses it.)
but what if:
Omega gives beauty one small chocolate bar, which she can either eat now, or risk in a treble-or-nothing bet against omega. Should she take the bet? Chocolate bars have to be consumed immediately, before she is sent back to sleep, and obviously each time she is woken, another set of chocolate bars is used.
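To make the contrast between the two bets concrete, here is a rough sketch of the expected values. It assumes (my reading, not stated above) that the house bet is settled once per experiment no matter how many times Beauty agrees to it, while the chocolate bet pays out separately at every awakening. All variable names are illustrative.

```python
from fractions import Fraction

P_ONE = Fraction(1, 20)      # chance the d20 shows a 1
WAKINGS_IF_ONE = 400
WAKINGS_OTHERWISE = 1

# Share of all awakenings (over many runs) that follow a roll of 1.
wakings_after_one = P_ONE * WAKINGS_IF_ONE
wakings_after_other = (1 - P_ONE) * WAKINGS_OTHERWISE
share_after_one = wakings_after_one / (wakings_after_one + wakings_after_other)

# House (assumption: one stake, settled once per experiment).
# Treble-or-nothing: net +2 stakes on a win, -1 stake on a loss.
house_ev_in_stakes = P_ONE * 2 + (1 - P_ONE) * (-1)

# Chocolate: one bar is at stake at every awakening, and results add up.
bars_if_always_bet = P_ONE * WAKINGS_IF_ONE * 3
bars_if_always_eat = P_ONE * WAKINGS_IF_ONE + (1 - P_ONE) * WAKINGS_OTHERWISE

print(share_after_one)      # 400/419
print(house_ev_in_stakes)   # -17/20
print(bars_if_always_bet)   # 60
print(bars_if_always_eat)   # 419/20
```

Under these assumptions the same policy ("bet that it was a 1") loses on the house and wins on the chocolate, even though Beauty's information on waking is identical in both cases.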
The OP is correct. There are actually all the same issues here as with the Self Indication Assumption; it is wrong for the same reasons as the 1⁄3 probability. I predict that a great majority of those who accept SIA will also favor the probability of 1⁄3.
Sleeping Beauty does not sleep well. She has three dreams before awakening. The Ghost of Mathematicians Past warns her that there are two models of probability, and that adherents to each have little that is good to say about adherents to the other. The Ghost of Mathematicians Present shows her volumes of papers and articles where both 1⁄2 and 1⁄3 are “proven” to be the correct answer based on intuitive arguments. The Ghost of Mathematicians Future doesn’t speak, but shows her how reliance on intuition alone leads to misery. Only strict adherence to theory can provide an answer.
Illuminated by these spirits, once she is fully awake she reasons: “I have no idea whether today is Monday or Tuesday; but it seems that if I did know, I would have no problem answering the question. For example, if I knew it was Monday, my credence that the coin landed heads could only be 1⁄2. On the other hand, if I knew it was Tuesday, my credence would have to be 0. But on the gripping hand, these two incontrovertible truths can help me answer as my night visitors suggested. There is a theorem in probability, called the Theorem of Total Probability, that says the probability for event A is equal to the sum of the probabilities of the events (A intersect B(i)), where the B(i) partition the entire event space.
“Today has to be either Monday or Tuesday, and it can’t be both, so these two days represent such a partition. Since I want to avoid making any assumptions as long as I can, let me say that the probability that today is Monday is X, and the probability that it is Tuesday is (1-X). Now I can use this Theorem to state, unequivocally, that my credence that the coin landed heads is P(heads)=(1/2)X+0(1-X)=X/2.
“But I know that it is possible that today is Tuesday; even a Bayesian has to admit that X<1. So I know that 1⁄2 cannot be correct; the answer has to be less than that. A Frequentist would say that X=2/3 because, if this experiment were repeated many times, two out of every three interviews would take place on Monday. And while a Bayesian could, in theory, choose any value that is less than 1, it is a violation of Occam’s Razor to assume there is a factor present that would make X different than 2⁄3. So, it seems my answer must be 1⁄3.
You can have a credence of 1⁄2 for heads in the absence of which-day knowledge, but for consistency you will also need P(Heads | Monday) = 2⁄3 and P(Monday) = 3⁄4. Neither of these match frequentist notions unless you count each awakening after a Tails result as half a result (in which case they both match frequentist notions).
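A small sketch that tabulates both sets of credences may help here. The weights for the "halfer" table are the ones implied by the comment above (1/2 on heads-Monday, the tails half split evenly between the two days); the "thirder" table gives each awakening equal weight. The function `summarize` and the dictionary layout are just illustrative choices.

```python
from fractions import Fraction


def summarize(weights):
    """weights maps (coin, day) to a credence; returns P(H), P(Mon), P(H | Mon)."""
    total = sum(weights.values())
    p_heads = sum(w for (coin, _), w in weights.items() if coin == "H") / total
    p_monday = sum(w for (_, day), w in weights.items() if day == "Mon") / total
    p_heads_and_monday = weights.get(("H", "Mon"), Fraction(0)) / total
    return p_heads, p_monday, p_heads_and_monday / p_monday


thirder = {
    ("H", "Mon"): Fraction(1, 3),
    ("T", "Mon"): Fraction(1, 3),
    ("T", "Tue"): Fraction(1, 3),
}
halfer = {
    ("H", "Mon"): Fraction(1, 2),
    ("T", "Mon"): Fraction(1, 4),
    ("T", "Tue"): Fraction(1, 4),
}

for name, table in (("thirder", thirder), ("halfer", halfer)):
    p_h, p_mon, p_h_given_mon = summarize(table)
    print(name, p_h, p_mon, p_h_given_mon)
```

The thirder table gives P(H) = 1/3, P(Monday) = 2/3, P(H | Monday) = 1/2; the halfer table gives 1/2, 3/4 and 2/3, matching the consistency requirement stated above.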
Proof that neq1 is wrong:
Let H be the event that heads was flipped in this experiment instance. We’re going to let Beauty experience a waking now. Let M be the event that the waking is on Monday. Let B be the information that Beauty (knowing the experiment design) has upon waking. Let h=P(H|B), and let m=P(M|B).
We wish to discover the true values of h and m. Clearly in the context of someone being asked about the expected outcome of the experiment, P(H)=1/2, but h may (or may not) differ from 1⁄2.
Fact 1: P(H|M,B)=P(H)=1/2
Fact 2: P(H|~M,B)=0 (by ~M I mean the complement of M, i.e. that it’s not Monday)
Given the above two facts, we know enough to solve for h and m.
lemma 1:
P(~H|B)=P(M|B)P(~H|M,B)+P(~M|B)P(~H|~M,B) ; probability axiom
(1-h)=m(1/2)+(1-m)(1) ; by facts 1-2 and above axiom
1-(h)=1-(m/2) ; above simplified
h=m/2
lemma 2:
P(H|B)=P(M|B)P(H|M,B)+P(~M|B)P(H|~M,B) ; probability axiom
h=m(1/2)+(1-m)(0) ; facts 1-2 and above
h=m/2 ; simplified
m=2h
(oops, that turned out to be redundant; not surprising since I’m using in lemma 2 the variants p(~X)=1-P(X) from the same facts 1+2).
P(H|B) is a weighted average of the probability for heads given Monday (1/2) and given Tuesday (0). It turns out that, according to thirders, it’s more likely that it’s Monday (m=2h=2/3).
The thirder argument is that m=2/3 (that is, 2 out of 3 wakings on average are on Monday). The halfer argument that h=1/2 implies that m=1; that is, that Beauty is certain that it’s Monday (but this is obviously stupid of her).
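As a quick sanity check of the relation h=m/2, here is a sketch that plugs in both positions. It takes Facts 1 and 2 as given, and the helper names are hypothetical.

```python
from fractions import Fraction

P_H_GIVEN_MONDAY = Fraction(1, 2)   # Fact 1
P_H_GIVEN_TUESDAY = Fraction(0)     # Fact 2


def heads_credence(m):
    """h = P(H|B) by total probability, where m = P(M|B)."""
    return m * P_H_GIVEN_MONDAY + (1 - m) * P_H_GIVEN_TUESDAY


def monday_credence(h):
    """Inverse of the relation above: m = 2h."""
    return 2 * h


print(heads_credence(Fraction(2, 3)))   # thirder's m=2/3 gives h=1/3
print(monday_credence(Fraction(1, 2)))  # halfer's h=1/2 forces m=1
```

The relation by itself does not pick out m; it only shows that, once Facts 1 and 2 are granted, h=1/2 is available only to someone who is certain it is Monday.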
I was originally sympathetic to neq1′s argument that B is merely “1 or more wakings occur” and that P(H|1 or more wakings occur)=P(H)=1/2, since 1 or more wakings always occur, no matter whether H or ~H. But B is better characterized as “Beauty has just been woken, not knowing whether it’s the first or second waking, but knowing the experiment design”.
I would like to strengthen this argument to prove that m=2/3.
Lemma 1 is wrong. -h=(-1/2)m, m=2h. So your two lemmas are just saying the same thing.
I agree. I should have used a computer algebra program ;) I’ve revised my post so that it’s correct. It’s funny to me that I let slip a computation error that happened to accidentally give me the result I expected.
Just an observation: I’ve mostly ignored this discussion, but it appears to have generated a lot of meaningful debate about the very fundamental epistemic issues at play (though a lot of unproductive debate as well). No consensus on which position is idiotic has apparently arisen.
With that in mind, surely this article should be rated above 1? Are the upvotes being canceled by downvotes, or are people just not voting it either way? Why isn’t this rated higher?
I, for one, rate articles by the article text alone, not by the discussions generated in their comment threads.
Okay. That’s good—I agree with that standard. So is the consensus that, however productive the debate might be that is going on in the comments, the article that prompted them wasn’t very good? If so, the rating seems reasonable. (I felt the same way about the top-level article that was basically just the question, “What are you doing, and why are you doing it?”)
I suspect that people don’t like the tone/conclusions/analysis, and much of the debate was instigated by the article’s author. If someone wrote a post that successfully managed to explain what people actually mean when they say the answer goes one way or the other, then I’d expect that one to be rated higher.
Frankly, I think the Wikipedia article on the sleeping beauty problem tells you everything you’d get out of this article and more, without the implication that 1⁄2 is the right answer and people who answer 1⁄3 are doing something basically stupid.
And if an article doesn’t add anything over Wikipedia, it probably doesn’t deserve to be upvoted. Just add a link to the Wikipedia page on the open thread.
Alright, sounds good to me. Rating seems reasonable then.
ETA: And let me add that such restraint in voting gives me renewed confidence in LW’s karma system.
The thing is, the argument in favor of the 1⁄3 solution on the Wikipedia page is flawed. I tried to explain the flaw, but perhaps I failed. It makes me cringe when I think that people are going to that page for the solution.
Also, not only did I critique the Wikipedia page, but I critiqued parts of papers by Radford Neal and Nick Bostrom.
That’s not to say my post deserves more up votes. Others can judge the quality of my work. But I’m pretty sure I covered some new ground here.
After tinkering with a solution, and debating with myself how or whether to try it again here, I decided to post a definitive counter-argument to neq1′s article as a comment. It starts with the correct probability tree, which has (at least) five outcomes, not three. But I’ll use the unknown Q for one probability in it:
Heads (1/2)
    Monday (Q)
        Waken (1): Pr(observe Heads and Monday) = Q/2
    Tuesday (1-Q)
        Sleep (1): Pr(sleep thru Heads and Tuesday) = (1-Q)/2
        Waken (0): Pr(observe Heads and Tuesday) = 0
Tails (1/2)
    Monday (1/2)
        Waken (1): Pr(observe Tails and Monday) = 1/4
    Tuesday (1/2)
        Waken (1): Pr(observe Tails and Tuesday) = 1/4
What halfers refuse to recognize is that whether Beauty is awakened in any specific circumstance is a decision that is part of the process. It is based on the other two random variables, after both – repeat, both – have been determined. The event “Heads and Tuesday” is an event that exists in the sample space, and the decision to not awaken her is made only after that event has occurred. Halfers think they have to force that event into non-existence by making Q=1, when all the experiment requires is that the probability Beauty will observe it is zero. This is the point one thirder argument utilizes, that of Radford Neal’s companion Prince who is always awakened but only asked if Beauty is awakened.
In fact, there is no reason why the probability that it is Monday, given Heads, should be any different than the probability it is Monday, given Tails. So, with Q=1/2, we get that Pr(observe heads)=1/4, Pr(observe anything)=3/4, so Pr(Heads|observe anything)=1/3. QED.
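To see how the answer depends on Q, here is a minimal sketch that evaluates the tree for any Q. The function name is illustrative; the leaf probabilities are the ones given in the tree (Q/2 for observing Heads and Monday, 1/4 for each Tails awakening).

```python
from fractions import Fraction


def heads_given_observation(q):
    """Pr(Heads | Beauty observes anything), with q = Pr(Monday | Heads)."""
    observe_heads = Fraction(1, 2) * q               # Heads, Monday, awake
    observe_tails = Fraction(1, 4) + Fraction(1, 4)  # Tails on Monday or Tuesday
    return observe_heads / (observe_heads + observe_tails)


print(heads_given_observation(Fraction(1, 2)))  # 1/3
print(heads_given_observation(Fraction(1)))     # 1/2
```

So Q=1/2 reproduces the thirder answer and Q=1 reproduces the halfer answer; the disagreement is entirely about which value of Q is justified.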
Neq1’s arguments that the thirder positions are wrong are all examples of circular reasoning. He makes some assumption equivalent to saying the answer is 1⁄2, and from that proves the answer is 1⁄2. For example, when he uses “Beauty woken up at least once” as a condition, all his terms are also conditioned on the fact that the rules of the experiment were followed. So when he inserts the completely unconditional “Pr(Heads)=1/2” on the right-hand side of the equation, he really should use Pr(heads|rules followed), which is the unknown we are trying to find. It is then unsurprising that he gets the number he inserted, especially if you consider what using a probability-one event as a condition in Bayes’ Rule means.
Where neq1 claims that Nick Bostrom’s argument is wrong in “Disclosure Process 1,” I suggest he go back and use the values from his probability tree. Her credence of heads is (1/2)/(1/2+1/2/1,000,000). In the second process, it is either (1/2)/(1/2+1/2/7,000,000) or (1/2)/(1/2+1/2/1,000,000,000,000), depending on what “specific day” means.
The whole anthropics debate is over things that you have taken as assumptions, e.g. whether waking up is identical evidence to merely knowing that you wake at least once, or whether the three days are equally likely.
Your update doesn’t solve the problem. It’s a semantic issue about what credence we are being asked. If we are being asked about the probability of our coin flip associated with this iteration of the experiment, then the answer is 1⁄2. If we are being asked about the probability of the coin flip associated with this particular awakening, then it must be 1⁄3.
You say that you must use cell counts of 500,250,250, but the fact is that if you repeat the experiment 1000 times, sleeping beauty will be awoken 1500 times, not 1000. So what are you doing with the other 500 awakenings? I would say you are implicitly ignoring them, as you do when you say “we only accept her last decision” in the bet scenario. The reformulations of this using different people, rather than the same person being awoken multiple times don’t seem to cause as much trouble.
The semantic issue here is reminiscent of arguments I’ve seen over the Monty Hall problem when it is misstated so that Monty’s algorithm is not clear. People who assume what he usually does on the show come up with 2⁄3, and people who don’t make any assumptions come up with 1⁄2 (as do most of the people who simply don’t understand restricted choice).
Re: “If we are being asked about the probability of our coin flip associated with this iteration of the experiment, then the answer is 1⁄2. If we are being asked about the probability of the coin flip associated with this particular awakening, then it must be 1⁄3.”
What is actually asked at each awakening is:
“What is your credence now for the proposition that our coin landed heads?”
I figure that makes the answer 1⁄3 - and not 1⁄2.
If the question had been: “What is your credence that this is the last time you awaken and our coin landed heads-up?”
...then the answer would have been 1⁄2.
...but that wasn’t the question that was asked.
Cool story bro
It doesn’t make sense to assert that probability of Tuesday is 1⁄4 (in the sense that it’d take a really bad model to give this answer). Monday and Tuesday of the “tails” case shouldn’t be distinct elements of the sample space. What happens when you’ve observed that “it’s not Tuesday”, and the next day it’s Tuesday? Have you encountered an event of zero probability? This is exactly the same reason why the solution of 1⁄3 can’t be backed up by a reasonable model.
In the classical possible worlds model, you’ve got two worlds for each outcome of the coin flip, with probabilities 1⁄2 apiece, and so (Tuesday, tails) is the same event as (Monday, tails), weighing probability of 1⁄2. Thus, for example, probability that we are in the possible world where Monday can be observed, given that Tuesday can be observed, is 1, but it doesn’t make sense to ask “What is probability of it being Tuesday?”, unless this question is interpreted as “What is probability of us being in the possible world where it’s possible to observe Tuesday?”, in which case the question “What is the probability of it being Monday, given that it’s Tuesday?”, interpreted the same way, has “100%” as the answer.
“It doesn’t make sense to assert that probability of Tuesday is 1⁄4 (in the sense that it’d take a really bad model to give this answer).”
Suppose if heads we wake Beauty up on Monday, and if tails we wake her up either on Monday or Tuesday (each with probability 1⁄2). In that case, when Beauty is awakened, she should say it’s Monday with probability .75 and Tuesday with probability .25.
“In the classical possible worlds model, you’ve got two worlds for each outcome of the coin flip, with probabilities 1⁄2 apiece, and so (Tuesday, tails) is the same event as (Monday, tails), weighing probability of 1⁄2.”
I agree with this. I just thought it would be more intuitive if people thought of “(Tuesday, tails) is the same event as (Monday, tails), weighing probability of 1/2” from the perspective of the experiment that I describe above (where we imagine Beauty is awakened on a random day within the space of possible days for each coin result).
I have no problem imagining a probability distribution for Tuesday, just like I can imagine a probability distribution for the mean of some random variable.
Surely, 1⁄3 is the correct answer, and it is backed up by a perfectly reasonable model.
By the way, you may have noticed that the wiki has an article on the Sleeping Beauty problem. Also, it’s been referenced before in a top-level post, Sleeping beauty gets counterfactually mugged, and it was mentioned in the context of a general solution in The I-Less Eye. And the comments on How many LHC failures is too many are relevant to the problem too.
Did you even search to see if someone had done a post on this topic before?
Yes, and the posts you referenced don’t cover the topic in the same way.
Update: I did link to the wiki article in the original post (and quoted from it extensively). I’m surprised you didn’t notice that.
There is a difference between P(“Heads came up”) and P(“Heads came up” given that “I was just woken up”). Since you will be woken up (memory-less) multiple times if tails came up, the fact that you are just getting woken up gives you information and increases the probability that tails came up.
Let’s consider P(H | JustWoken) = P(H and Monday | JustWoken) + P(H and Tuesday | JustWoken). Because I have no information about the scientist’s behavior (when he chooses to ask the question), I have to assign equal probabilities (one third) to P(H and Monday | JustWoken), P(T and Monday | JustWoken) and P(T and Tuesday | JustWoken). And it’s impossible to be woken up on Tuesday if Heads came up, so P(H and Tuesday | JustWoken) = 0. As a result, P(H | JustWoken) = 1⁄3.
If anyone doubts that, we could set up a computer simulation (you write the scientist and coin code and I write code for the beauty answering the question) and we bet. But I would require an experimental condition, stating that the scientist will ask the beauty the question every time she wakes up. Under those conditions, a beauty which always bets that “tails came up” any time she gets woken up will win 2⁄3 of the time. If we could not agree to those conditions (getting interviewed by the scientist on every occasion), the bet would be broken because you know what answer I will give and you have information that I don’t have (strategy for when to interview).
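A rough sketch of such a simulation, assuming the agreed condition that Beauty is interviewed at every awakening and always answers "tails". The function name, the seed and the run count are arbitrary choices for illustration.

```python
import random


def simulate(n_experiments, seed=0):
    """Return (share of awakenings after tails, share of experiments with tails)."""
    rng = random.Random(seed)
    awakenings = 0
    tails_awakenings = 0
    tails_experiments = 0
    for _ in range(n_experiments):
        tails = rng.random() < 0.5
        wakes = 2 if tails else 1      # tails: Monday and Tuesday; heads: Monday only
        awakenings += wakes
        if tails:
            tails_awakenings += wakes  # every "tails" answer given here is correct
            tails_experiments += 1
    return tails_awakenings / awakenings, tails_experiments / n_experiments


per_awakening, per_experiment = simulate(100_000)
print(f"correct per awakening:  {per_awakening:.3f}")
print(f"correct per experiment: {per_experiment:.3f}")
```

The per-awakening figure comes out near 2/3 and the per-experiment figure near 1/2, so the simulation settles the bet only once both sides agree on which of the two is being scored.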
I think the solution to the problem depends on what you want to measure. The probability of being tails per wakening is not the same as the probability of being tails per flip or per day.
Robert Wiblin—Thoughts on the sleeping beauty problem
http://robertwiblin.wordpress.com/2010/03/26/news-flash-multiverse-theory-proven-right/
Huh? If tails, then Beauty is (always) woken on Monday. Why do you have probability=1/2 there?
(likewise for Tuesday)
The probability represents how she should see things when she wakes up.
She knows she’s awake. She knows heads had probability 0.5. She knows that, if it landed heads, it’s Monday with probability 1. She knows that, if it landed tails, it’s either Monday or Tuesday. Since there is no way for her to distinguish between the two, she views them as equally likely. Thus, if tails, it’s Monday with probability 0.5 and Tuesday with probability 0.5.
Okay, I now understand what you mean by that tree.
Beauty ends up with 1500 observations on average (maybe as few as 1000 or as many as 2000). Imagine a sequence of Beauty-observations in (H|TT)^1000 , where by r^1000 I mean 1000 repetitions of r. This string is from 1000-2000 letters long.
If you consider the scenario from a non-amnesiac perspective, then you can consider the TT—the two forgetful-Beauty observations in the tails case, as a single event, which is indeed equally likely to the alternative, H. In fact, the shortest possible coding to describe one of the beauty-observation strings is just a 1000-bit string where the nth bit indicates the result of the nth coin flip.
But what are you thinking when you say there are two “cells” each with p=1/4 (count 250 out of 1000)? What, exactly, would happen 250 times on average? Certainly we expect Beauty waking on Mon. with it being tails 500 times (and also 500 times on Tue.).
PhilGoetz writes:
I would like to do this. However, it’s time consuming to sort through people’s posts and see what they think. (You have to read carefully, because they may be critiquing a particular argument rather than the value 1⁄2 or 1⁄3 presented in the parent.) Would people mind stating their position on the Sleeping Beauty problem with a single sentence explaining the core detail of the argument that persuades them?
I think the answer is not well defined. I would be inclined to say 1⁄2, however, because if you want to calculate the distribution of outcomes after the experiment, the 1⁄3 calculation will give the wrong answer. If Beauty bets on heads, instead of winning 1⁄3 of the time and losing 2⁄3 of the time, she wins 1⁄2 the time and loses twice 1⁄2 the time. Her decision theory needs to take that into account.
I understand the argument for 1⁄3, but it seems to throw away important information.
Edit: What convinced me? Oddly, it was the arguments for 1⁄3 - when I examined them, I noticed the problem.
Edit 2: Upon further consideration (thanks, Jonathan_Graehl!), I have decided that 1⁄3 is the better answer, but not obviously so.
I find the problem statement to be completely unambiguous:
Robin wrote:
...and then revised that to “1/3 is the better answer, but not obviously so”. I wish to agree with both statements, with additional explanation. “Credence” is ambiguous in the question, “What is your credence now for … heads?” Depending on additional context, that could be a request for Beauty’s P(heads | Beauty woken up at least once). Or, it could be a request for her P(heads | Beauty just now woken).
The latter case first: Beauty is offered a bet with some payoff odds, for a few dollars, and she is neither risk-averse nor risk-seeking with regard to such amounts of money. She is to be offered this bet upon each awakening. Nothing else of consequence hinges on the coin flip. In this scenario, Beauty likely interprets “credence” in a manner directly corresponding to betting odds, and her correct answer is 1⁄3.
Now for the 1⁄2 case: Beauty knows that her President has decided to launch all-out nuclear war if and only if the coin lands heads. Upon being woken up, her first thought, naturally, is a deep dread at this possibility. How much dread does she feel, in comparison to how she would feel if nuclear war were certain, and in comparison to how she’d feel if it were out of the question? About halfway between.
Since the “offered a bet” scenario is more natural than scenarios where something momentous hangs on the coin flip, 1⁄3 is probably a “better” answer to the unadorned Sleeping Beauty problem. But even then, your mileage may vary. If you are the kind of person more interested in objective events (the coin flip itself) than the track record of your guesses (“I’ve just been woken, so I’ll say probably tails”), well then, be a halfer. If the opposite, be a thirder.
It would make sense to respond “1/3” to that question, but it would not make sense to use 1⁄3 to make decisions with. The payoff grid is different.
I’m confused. Are you saying there’s room for debate over what “credence” means?
Maybe in discussing what credence someone ought to have, there’s some default analogy to optimizing odds under some betting/payoff/utility scheme, but I think there’s a single correct answer to the Beauty problem under that default, and it should be possible to justify it without recourse to the analogy.
I like to simplify: suppose Beauty wakes and guesses that the coin was tails. How often is she expected to be right? For 2⁄3 of her guesses (but 1⁄2 of the experiments). So clearly in a wager to be played each time she’s woken in the experiment as described, she would need to lose twice as much utility when she’s wrong as she gains when she’s right, in order to be indifferent about making the wager. I believe that’s the default analogy between credence and lotteries.
That’s all perfectly true, but compare her strategy in this experiment to, say, an ordinary bet at 2:1 odds. If Beauty bets $10 on heads, she will either win $20 or lose $20 with equal likelihood over the course of the experiment—but if she bets $10 on an ordinary one-in-three chance, she will either win $20 or lose $10, with losing $10 being twice as likely. Mere risk aversion would make these two options different.
I’ll concede that, of the two options, 1⁄3 probably makes more sense to describe her credence, but it’s not sufficient to describe the variables she must account for.
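The difference between the two bets can be spelled out with a short sketch. It assumes the $10 bet on heads is placed at every awakening and pays $20 on a win, as in the comparison above; the dictionaries map net dollars per experiment to probability, and the names are illustrative.

```python
from fractions import Fraction


def mean_and_variance(distribution):
    """distribution maps net payoff in dollars to its probability."""
    mean = sum(p * x for x, p in distribution.items())
    variance = sum(p * (x - mean) ** 2 for x, p in distribution.items())
    return mean, variance


# Beauty bets on heads at each awakening: one win (+20) or two losses (-10 twice).
beauty_bet = {20: Fraction(1, 2), -20: Fraction(1, 2)}
# Ordinary one-in-three bet: win +20 once, or lose -10 once.
ordinary_bet = {20: Fraction(1, 3), -10: Fraction(2, 3)}

print(mean_and_variance(beauty_bet))    # mean 0, variance 400
print(mean_and_variance(ordinary_bet))  # mean 0, variance 200
```

Both bets are fair in expectation, but Beauty's version has twice the variance, which is why a risk-averse agent would not treat them as the same bet.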
I agree but don’t think it’s necessary to talk about risk at all (except to say that we wish to ignore it) for the purpose of the hypothetical bets an agent should make given a certain credence. I also think you confused the direction of the odds; if I believe something is 2⁄3 likely, I should take the positive side if I can gain anything more than half of what I stand to lose if the negative occurs (with p=1/3). But of course that doesn’t change the interesting difference you point out (that the bet involves a $40 swing rather than a $30 one).
Agreed. I have indicated a change of opinion at my original comment.
I don’t follow your latest argument against thirders. You claim that the denominator
#(heads & monday) + #(tails & monday) + #(tails & tuesday)
counts events that are not mutually exclusive. I don’t see this. They look mutually exclusive to me: heads is exclusive of tails, and Monday is exclusive of Tuesday. Could you elaborate this argument? Where does exclusivity fail? Are you saying tails&monday is not distinct from tails&tuesday, or all three overlap, or something else?
You also assert that the denominator is not determined by n. (I assume by n you mean replications of the SB experiment, where each replication has a randomly varying number of awakenings.) That’s true in a way: particular values that you will see in particular replications will vary, because the denominator is a random variable with a definite distribution (Bernoulli, in fact). But that’s not a problem when computing expected values for random processes in general; they often have perfectly definite and easily computed expected values. Are you arguing that this makes that ratio undefined, or problematic in some way? I can tell easily what this ratio converges to, but you won’t like it.
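For anyone who wants to check the convergence claim numerically, here is a rough simulation sketch. The function name `heads_awakening_fraction` and the fixed seed are my own choices, not anything from the original post. It counts the three kinds of awakenings over n replications and returns the share that follow heads.

```python
import random

def heads_awakening_fraction(n, seed=0):
    """Share of all awakenings that follow heads, over n replications."""
    rng = random.Random(seed)
    heads_monday = tails_monday = tails_tuesday = 0
    for _ in range(n):
        if rng.random() < 0.5:
            heads_monday += 1      # heads: woken once, on Monday
        else:
            tails_monday += 1      # tails: woken on Monday...
            tails_tuesday += 1     # ...and again on Tuesday
    total = heads_monday + tails_monday + tails_tuesday
    return heads_monday / total

print(heads_awakening_fraction(100_000))
```

The denominator varies from run to run, but for large n the ratio settles near 1/3. Whether that ratio is the right thing to call Beauty's credence is, of course, the point under dispute.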
SB would start out with P(tails) = 1,000,001⁄1,000,002 and on being informed that it is monday would update:
The initial strong belief in tails is cancelled by the strong evidence of being told that it is monday, which only happens in one of many wakings if the coin landed tails.
The initial strong belief in tails is canceled by the strong evidence of being told what day it is, and then updated further to strong belief in heads by the strong evidence of it being Monday.
The next program works well:
R=Random(0,1)
If R=0
  SAY “P(R=0)=1/2”
Else
  SAY “P(R=0)=1/2”
  SAY “P(R=0)=1/2”
Endif
The next doesn’t:
R=Random(0,1)
If R=0
  SAY “P(R=0)=1/3”
Else
  SAY “P(R=0)=1/3”
  SAY “P(R=0)=1/3”
Endif
Run it many times and you will clearly see that the first program is right, since R will be 0 in about as many runs as it is 1, which is just what the first program keeps saying.
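A runnable sketch of the program above, in Python rather than the original pseudocode. The function name `run_program` and the seed are assumptions for illustration. It tallies the outcome two ways: per run of the program, and per SAY statement issued.

```python
import random

def run_program(n_runs, seed=0):
    """Return (share of runs with R=0, share of statements made when R=0)."""
    rng = random.Random(seed)
    runs_with_zero = 0
    statements = 0
    statements_with_zero = 0
    for _ in range(n_runs):
        r = rng.randint(0, 1)
        says = 1 if r == 0 else 2   # one SAY when R=0, two SAYs otherwise
        statements += says
        if r == 0:
            runs_with_zero += 1
            statements_with_zero += says
    return runs_with_zero / n_runs, statements_with_zero / statements

per_run, per_statement = run_program(100_000)
print(per_run, per_statement)
```

Counted per run, R=0 comes up about half the time, as the comment says. Counted per statement, about a third of the statements are made when R=0. Which count should be used to score the program is the same disagreement as in the rest of the thread.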
I’m not convinced that 1⁄2 is the right answer. I actually started out thinking it was obviously 1⁄2, and then switched to 1⁄3 after thinking about it for a while (I had thought of Bostrom’s variant (without the disclosure bit) before I got to that part).
Let’s say we’re doing the Extreme version, no disclosure. You’re Sleeping Beauty, you just woke up, that’s all the new information you have. You know that there are 1,000,001 different ways this could have happened. It seems clear that you should assign tails a probability of 1,000,000⁄1,000,001.
Now I’ll go think about this some more and probably change my mind a few more times.
We can tweak the experiment a bit to clarify this. Suppose the coin is flipped before she goes to sleep, but the result is hidden. If she’s interviewed immediately, she has no reason to answer other than 1⁄2 - at this point it’s just “flip a fair coin and estimate P(heads)”. What information does she get the next time she’s asked that would cause her to update her estimate? She’s woken up, yes, but she already knew that would happen before going under and still answered 1⁄2. With no new information she should still guess 1⁄2 when woken up.
She knows in advance how many times she will be woken up (on each coin result). It says so in the problem description. So, she never answers 1⁄2 in the first place. She doesn’t update on awakening. She updates when she is told the experimental procedure in the first place.
So, in the extreme sleeping beauty problem, when she is told the experimental procedure, she decides that it will be tails with near certainty?
Let’s look at the ultimate extreme version. Assume she’s woken up once (or arbitrarily many non-zero times) for tails, and not at all for heads. Now the fact that she’s been woken up implies tails with certainty. So if the answer remains 1⁄2 in the extreme versions, then there must be a discontinuous jump, rather than convergence, when the ratio of the number of awakenings for heads vs. tails tends towards zero.
Intuitively, this is very convincing. But it definitely doesn’t prove anything by itself...
This was similar to my intuition also. She will wake up in the end no matter what regardless of how many times she has been woken up before, so how does her wakefulness add any new information? If there was a scenario where she would never wake up, then her being awake would actually mean something, but that isn’t the question.
I can’t see how this is a problem of conditional probability. Isn’t it just “what is P(heads)”? Am I missing something?
Note jimmy’s comparison to Blue Eyes. It’s not necessarily the case that she’s getting no new information here.
...yeah, I think you’re right.
In the few minutes before I read your comment, I was thinking about reformulating this as an Omega-style problem. (I know, I know… I do try not to be too gratuitous with my use of Omega, but what can I say — omnipotence and omniscience are surprisingly useful for clarifying and simplifying reasoning/decision problems.) So Omega tells you she’s going to flip a fair coin, and if it lands on tails, she’s going to make a million copies of you and put all of them in identical rooms, and if it lands on heads, she’ll just put the one of you in such a room. She flips the coin, you blank out for a moment, and as expected, you’re in an unfamiliar room. In this case, it doesn’t appear that adding or subtracting copies of you should have anything to do with what you believe about the coin flip. You saw her flip the coin yourself, and you knew that you’d be seeing the same thing no matter what side came up. She could come back a few minutes later and say “Hey, if and only if it was tails, I just made another million copies of you and put them in rooms identical to this one, kbye” which clearly shouldn’t change your belief about the coin, but seems to be a situation identical to if she had just said “two million” in the first place.
Okay, I think I’m more confidently on the 1⁄2 side now.
OK, I think I have a definite reductio ad absurdum of your point. Suppose you wake up in a room, and the last thing you remember is Omega telling you: “I’m going to toss a coin now. Whatever comes up, I’ll put you in the room. However, if it’s tails, I’ll also put a million other people each in an identical room and manipulate their neural tissue so as to implant in them a false memory of having been told all this before the toss. So, when you find yourself in the room, you won’t know if we’ve actually had this conversation, or you’ve been implanted with the memory of it after the toss.”
After you find yourself in the room under this scenario, you have the memory of these exact words spoken to you by Omega a few seconds ago. Then he shows up and asks you about the expected value of the coin toss. I’m curious if your 1⁄2 intuition still holds in this situation? (I’m definitely unable to summon any such intuition at all—your brain states representing this memory are obviously more likely to have originated from their mass production in case of tails, just like finding a rare widget on the floor would be evidence for tails if Omega pledged to mass-manufacture them if tails come up.)
But if you wouldn’t say 1⁄2, then you’ve just reached an awful paradox. Instead of just implanting the memories, Omega can also choose to change these other million people in some other small way to make them slightly more similar to you. Or a bit more, or even more—and in the limit, he’d just use these people as the raw material for manufacturing the copies of you, getting us back to your copying scenario. At which step does the 1⁄2 intuition emerge?
(Of course, as I wrote in my other comment, all of this is just philosophizing that goes past the domain of validity of human intuitions, and these questions make sense only if tackled using rigorous math with more precisely defined assumptions and questions. But I do find it an interesting exploration of where our intuitions (mis)lead us.)
I’d still say 1⁄2 is the right answer, yes.
But I’m trying to avoid using intuition here; when I do, it tends to find the arguments on both sides equally persuasive (obvious, even). If there is a right answer at all, then this is truly a case where we have no choice but to shut up and do the math.
Hm.. let’s try pushing it a bit further.
Suppose you’re a member of a large exploratory team on an alien planet colonized by humans. As a part of the standard equipment, each team member has an intelligent reconnaissance drone that can be released to roam around and explore. You get separated from the rest of your team and find yourself alone in the wilderness. You send out your drone to explore the area, and after a few hours it comes back. When you examine its records, you find the following.
Apparently, a local super-smart creature with a weird sense of humor—let’s call it Omega—has captured several drones and released (some of?) them back after playing with them a bit. Examining your drone’s records, you find that Omega has done something similar to the above described false memory game with them. You play the drone’s audio record, and you hear Omega saying: “I’ll toss a coin now. Afterwards, I’ll release your drone back in any case. If heads come up, I’ll destroy the other ten drones I have captured. If it’s tails, I’ll release them all back to their respective owners, but I’ll also insert this message into their audio records.” Assume that you’ve already heard a lot about Omega, since he’s already done many such strange experiments on the local folks—and from what’s known about his behavior, it’s overwhelmingly likely that the message can be taken at face value.
What would you say about the expected coin toss result now? Would you take the fact that you got your drone back as evidence in favor of tails, or does your 1⁄2 intuition still hold? If not, what’s the difference relative to the false memory case above? (Unless I’m missing something, the combined memories of yourself and the drone should be exactly equivalent to the false memory scenario.)
How about the following scenario? Say instead of Omega, it’s just a company doing a weird promotional scheme. They announce that they’ll secretly flip a coin in their headquarters, and if it’s tails, they’ll hand out prizes to a million random people from the phone directory tomorrow, whereas if it’s heads, they’ll award the same prize to only one lucky winner. The next day, you receive a phone call from them. Would you apply analogous reasoning in this case (and how, or why not)?
I think that’s very different… in the original scenario, heads and tails both result in you experiencing the same thing. In this case, if it comes up tails, it is a million times more likely that you will receive the prize, so getting a phone call from them is very significant Bayesian evidence.
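To put a number on that evidence, here is a small Bayes' rule sketch for the promotion as described (a million winners on tails, one on heads). The directory size of ten million is a made-up figure, and the function name is illustrative. The size cancels out of the result as long as it is at least a million.

```python
from fractions import Fraction

def posterior_tails_given_call(directory_size,
                               winners_if_tails=1_000_000,
                               winners_if_heads=1):
    """P(tails | you got the call), starting from a fair-coin prior."""
    prior = Fraction(1, 2)
    like_tails = Fraction(winners_if_tails, directory_size)  # P(call | tails)
    like_heads = Fraction(winners_if_heads, directory_size)  # P(call | heads)
    return prior * like_tails / (prior * like_tails + prior * like_heads)

print(posterior_tails_given_call(10_000_000))  # 1000000/1000001
```

The update is large here precisely because the call is not guaranteed. If you were certain to get the call either way, both likelihoods would be 1 and the posterior would stay at 1/2.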
Yes, you’re right (as are the other replies making similar points). I tried hard once more to come up with an accurate analogy of the above problem that would be realizable in the real world, but it seems like it’s impossible to come up with anything that doesn’t involve implanting false memories.
After giving this some more thought, it seems to me that the problem with the copying scenario is that once we eliminate the assumption that each agent has a unique continuous existence, all human intuitions completely break down, and we can compute only mathematically precise problems formulated within strictly defined probability spaces. Trouble is, since we’re breaking one of the fundamental human common sense assumptions, the results may or may not make any intuitive sense, and as soon as we step outside formal, rigorous math, we can only latch onto subjectively preferable intuitions, which may differ between people.
In the situation you state, then yes, of course I place high probability on the coin having come up tails. However, in order for your situation to be truly analogous to the Sleeping Beauty problem, you would have to be guaranteed to get the phone call either way, which destroys any information you gain in your version.
The prior probability of heads is still the same.
Given the additional information that you got the call, it becomes more likely that it was tails this time.
As I said in the other comment, this argument is just like arguing that if I exist, then there are likely more people, since there would be more ways that it could happen that I existed, i.e. I could be any of the people who exist.
But in fact “I” am just one of the people who exist, no matter how many or few there are, so the inevitable fact of my existence cannot increase the probability of many people existing. Likewise, when Sleeping Beauty wakes up, that is just the one case or one of the million cases; either way it would still have happened. It would not have happened with greater likelihood in the million cases.
I agree with the author of this article. After having done a lot of research on the Sleeping Beauty Problem as it was the topic of my bachelor’s thesis (philosophy), I came to the conclusion that anthropic reasoning is wrong in the Sleeping Beauty Problem. I will explain my argument (briefly) below:
The principle that Elga uses in his first paper to validate his argument for 1⁄3 is an anthropic principle he calls the Principle of Indifference:
“Equal probabilities should be assigned to any collection of indistinguishable, mutually exclusive and exhaustive events.”
The Principle of Indifference is in fact a more restricted version of the Self-Indication Assumption:
“All other things equal, an observer should reason as if they are randomly selected from the set of all possible observers.”
Both principles are to be accepted a priori as they cannot be attributed to empirical considerations. They are therefore vulnerable to counterarguments...
The counterargument:
Suppose that the original experiment is modified a little:
If the outcome of the coin flip is Heads, they wake Beauty up at exactly 8:00. If the outcome of the first coin flip is Tails, the researchers flip another coin. If it lands Heads they wake Beauty at 7:00, if Tails at 9:00. That means that when Beauty wakes up she can be in one of 5 situations:
Heads and Monday 8:00
Tails and Monday 7:00
Tails and Monday 9:00
Tails and Tuesday 7:00
Tails and Tuesday 9:00
Again, these situations are mutually exclusive, indistinguishable and exhaustive. Hence thirders are forced to conclude that P(Heads) = 1⁄5.
Thirders might object that Beauty’s total credence in the Tails-world would still have to equal 2⁄3, as Beauty is awakened twice as many times in the Tails-world as in the Heads-world. They are then forced to explain why temporal uncertainty regarding an awakening (Monday or Tuesday) is different from temporal uncertainty regarding the time (7:00 or 9:00). Both classify as temporal uncertainties within the same possible world; what could possibly set them apart?
An explanation could be that Beauty is only asked for her credence in Heads during an awakening event, regardless of the time, and that such an event occurs twice in the Tails-world. That is, out of the 4 possible observer-moments in the Tails-world there are only two in which she is interviewed. That means that simply the fact that she is asked the same question twice is reason enough for thirders to distribute their credence, and it is no longer about the number of observer moments. So if she were asked the same question a million times then her credence in Heads would drop to 1/1000001!
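The two numbers in play here, 1⁄5 and 1⁄3, can be reproduced with a short sketch. It assumes both coins are fair and weights each of the five situations by its expected number of awakenings per run of the experiment; the dictionary name `expected_awakenings` is illustrative only.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Expected number of awakenings of each kind per run of the modified experiment.
expected_awakenings = {
    ("Heads", "Monday", "8:00"): half,
    ("Tails", "Monday", "7:00"): half * half,
    ("Tails", "Monday", "9:00"): half * half,
    ("Tails", "Tuesday", "7:00"): half * half,
    ("Tails", "Tuesday", "9:00"): half * half,
}
heads_key = ("Heads", "Monday", "8:00")

# Treat the five situations as equally likely.
equal_weight = Fraction(1, len(expected_awakenings))

# Weight each situation by how often it occurs among awakenings.
frequency_weight = expected_awakenings[heads_key] / sum(expected_awakenings.values())

print(equal_weight, frequency_weight)  # 1/5 1/3
```

Equal weighting over the five situations gives 1⁄5, while weighting by awakening frequency gives 1⁄3. The sketch only shows where each number comes from; it does not settle which weighting, if either, Beauty ought to use.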
We can magnify the absurdity of this reasoning by imagining a modified version of the Sleeping Beauty Problem in which a coin is tossed that always lands on Tails. Again, she is awakened one million times and given an amnesia-inducing potion after each awakening. Thirder logic would lead to Beauty’s credence in Tails being 1/1000000, as there are one million observer-moments where she is asked for her credence within the only possible world; the Tails-world. To recapitulate: Beauty is certain that she lives in a world where a coin lands Tails, but due to the fact that she knows that she will answer the same question a million times her answer is 1/1000000. This would be tantamount to saying that Mt. Everest is only 1m high when knowing it will be asked 8848 times! It is very hard to see how amnesia could have such an effect on rationality.
Conclusion:
The thirder argument is false. The fact that there are multiple possible observer-moments within a possible world does not justify dividing your credences equally among these observer-moments, as this leads to absurd consequences. The anthropic reasoning exhibited by the Principle of Indifference and the Self-Indication Assumption cannot be applied to the Sleeping Beauty Problem and I seriously doubt if it can be applied to other cases...
If Sleeping Beauty doesn’t know what day it is, what could possibly motivate her to say that the probability of heads is something other than 50%? I mean, she knows nothing about the coin except that it’s round and shiny, and the metal costs more than the coin does.
Unless I misunderstood, this problem is smoke and mirrors.
She might, if she thinks she will be asked what the coin shows more times if it is tails.
I updated the post one more time. I think this time I more effectively explain where the thirder logic fails. Correct me if I’m wrong...
I updated the post. Thanks to the many interesting comments, I think I am now better able to describe why the 1⁄3 solution is wrong.
And to be clear, the main point of the post isn’t to show that 1⁄2 is right, but to make the observation about how easy it is to be confident in the wrong answer when it comes to probability problems.