Yes, exactly. That would be a valid probability if these were expected frequencies from independent draws of a multinomial distribution (it would have 2 degrees of freedom). Your ratio of expected values does not result in P(H|W).
It might become clear if you think about it this way. Your expected number of occurrences of W is greater than the largest possible number of occurrences of H&W. You don’t have a ratio of a number of events to a number of independent trials.
Picture a 3 by 1 contingency table with counts in 3 cells: Monday&H, Monday&T, Tuesday&T. Typically, a 3 by 1 contingency table has 2 degrees of freedom (the count in the third cell is determined by the number of trials and the counts in the other two). Standard statistical theory then says you can estimate the probability for cell one by dividing the cell-one count by the total. That’s not the situation in the Sleeping Beauty problem. There is just one degree of freedom: if we know the number of coin flips and the count in one of the cells, we know the counts in the other two. Standard statistical theory does not apply, and the ratio of the cell-one count to the total is not the probability for cell one.
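A quick simulation sketch of the one-degree-of-freedom point, assuming the standard setup (heads: one Monday waking; tails: wakings on Monday and Tuesday):

```python
import random

random.seed(0)
n_flips = 100_000

heads = sum(random.random() < 0.5 for _ in range(n_flips))
tails = n_flips - heads

# The three cell counts are fully determined by the coin flips:
monday_h = heads    # heads -> woken on Monday only
monday_t = tails    # tails -> woken on Monday...
tuesday_t = tails   # ...and again on Tuesday

# One degree of freedom: n_flips plus any one cell count fixes the other two.
assert monday_t == tuesday_t == n_flips - monday_h

total_wakings = monday_h + monday_t + tuesday_t
print(monday_h / total_wakings)  # close to 1/3, even though P(H) = 1/2
```

The ratio of the cell-one count to the total comes out near 1/3, while the coin is fair by construction — which is the point being argued about.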
Occurrences of H&W are a strict subset of occurrences of W, so, to use the terminology of events and trials: each waking is a trial, and each waking where the coin was heads is a positive result. That’s 1/3 of all trials, so a probability of 1/3.
If each waking is a trial, then you have a situation where the number of trials is outcome dependent. Your estimator would be valid if the number of trials were not outcome dependent. This is the heart of the matter. The ratio of cell counts here is just not a probability.
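A sketch of what outcome-dependent trial counts do to the estimator (my own illustration, assuming the standard setup): pooling all wakings gives E[X]/E[N], which differs from the per-flip average E[X/N] when N depends on the outcome.

```python
import random

random.seed(2)
n_experiments = 100_000

pooled_heads_wakings = 0
pooled_wakings = 0
sum_fractions = 0.0

for _ in range(n_experiments):
    heads = random.random() < 0.5
    wakings = 1 if heads else 2          # number of "trials" depends on the outcome
    heads_wakings = 1 if heads else 0
    pooled_heads_wakings += heads_wakings
    pooled_wakings += wakings
    sum_fractions += heads_wakings / wakings  # fraction within this one flip

print(pooled_heads_wakings / pooled_wakings)  # close to 1/3: ratio of pooled counts
print(sum_fractions / n_experiments)          # close to 1/2: per-flip average
```

The pooled ratio and the per-flip average disagree precisely because the number of wakings is tied to the coin’s outcome.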
The number of trials being outcome dependent only matters if you are using the frequentist definition of probability, or if it causes you to collect fewer trials than you need to overcome noise. We’re computing with probabilities straight from the problem statement, so there’s no noise, and as a Bayesian, I don’t care about the frequentists’ broken definition.
This has nothing to do with Bayesian vs. Frequentist. We’re just calculating probabilities from the problem statement, like you said. From the problem, we know P(H)=1/2, P(Monday|H)=1, etc., which leads to P(H|Monday or Tuesday)=1/2. The 1/3 calculation is not from the problem statement, but rather from a misapplication of large-sample theory. The outcome-dependent sampling biases your estimator.
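A minimal sketch of that calculation, writing out the conditional probabilities as I read them from the problem statement (both branches lead to at least one waking):

```python
# Probabilities taken from the problem statement (halfer reading):
p_h = 0.5                  # P(H), fair coin
p_t = 0.5                  # P(T)
p_woken_given_h = 1.0      # P(Monday or Tuesday | H): Beauty is woken either way
p_woken_given_t = 1.0      # P(Monday or Tuesday | T)

# Bayes' rule for P(H | Monday or Tuesday):
p_woken = p_h * p_woken_given_h + p_t * p_woken_given_t
p_h_given_woken = p_h * p_woken_given_h / p_woken
print(p_h_given_woken)  # 0.5
```

Since both conditionals are 1, conditioning on "Monday or Tuesday" leaves the prior unchanged, which is why this route gives 1/2.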
And it’s strange that you don’t call your approach Frequentist, when you derived it from expected cell counts in repeated samples.
Don’t forget—around here ‘Bayesian’ is used normatively, and as part of some sort of group identification. “Bayesians” here will often use frequentist approaches in particular problems.
But that can be legitimate, as Bayesian calculations are a superset of frequentist calculations. Nothing bars a Bayesian from postulating that a limiting frequency exists in an unbounded number of trials in some hypothetical situation; but you won’t see one, e.g., accept R.A. Fisher’s argument for his use of p-values for statistical inference.
I adopted some frequentist terminology for purposes of this discussion because none of the other explanations I or others had posted seemed to be getting through, and I thought that might be the problem.
The reason I said that there’s a frequentist vs. Bayesian issue here is that the frequentist probability definition I’m most familiar with is P(f) = lim n->inf sum(f(i), i=1..n)/n, where f(i) is the i’th repetition of an independent repeatable experiment, and that definition is hard to reconcile with SB sometimes being asked twice. I assumed that issue, or a rule justified by that issue, was behind your objection.
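To make that definition concrete at a large finite n (a sketch, using a fair coin as the repeatable experiment):

```python
import random

random.seed(1)

def f(i):
    """The i'th repetition of an independent repeatable experiment:
    returns 1 if a fair coin lands heads, else 0."""
    return 1 if random.random() < 0.5 else 0

# P(f) = lim n->inf sum(f(i), i=1..n)/n, approximated at large finite n:
n = 200_000
estimate = sum(f(i) for i in range(1, n + 1)) / n
print(estimate)  # close to 0.5
```

The definition leans on a fixed sequence of independent, identically-set-up repetitions — exactly what SB’s twice-asked wakings fail to provide.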