Wait, I didn’t catch this the first time:

“using the 1⁄3 answer and working back to try to find P(W) yields P(W) = 3⁄2, which is a strong indication that it is not the probability that matters”
No. It’s proof that your solution is wrong.
And I know exactly why your solution is wrong. You came up with P(Monday|W) using a ratio of expected counts, but you relied on an assumption that trials are independent. Here, the coin flips are independent but the counts are not. Even though you are using three counts, there is just one degree of freedom. Vladimir Nesov got it right, I think, when he said “(Tuesday, tails) is the same event as (Monday, tails)”.
The last update in my sleeping beauty post explains the problem in more detail.
Of course P(W) isn’t bound within [0,1]; W is one of any number of events, in this case 2: P(You will be woken for the first time) = 1; P(You will be woken a second time) = 1⁄2. The fact that natural language and the phrasing of the problem attempt to hide this as “you wake up” is not important. That is why P(W) is apparently broken; it double-counts some futures, because it is the expected number of wakings, not a probability. This is why I split into conditioning on waking on Monday or Tuesday.
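(Spelled out: E[number of wakings] = 1·P(woken a first time) + 1·P(woken a second time) = 1 + 1⁄2 = 3⁄2, which is precisely the “P(W) = 3⁄2” recovered by working back from the 1⁄3 answer.)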
(Tuesday, tails) is not the same event as (Monday, tails). They are distinct queries to whatever decision algorithm you implement; there are any number of trivial means to distinguish them without altering the experiment (say, “we will keep you in a red room on one day and a blue one on the other, with the order to be determined by a random coin flip”).
They are strongly correlated events, granted. If either occurs, so will the other. That does not make them the same event. On your argumentation, you would assert confidently that the coin is fair beforehand, yet also assert that the conditional probability that you wake on Monday depends on the coin flip, when in either branch you are woken then with probability 1.
If P(H) and P(H|W) are probabilities, then it must be true that:

P(H) = P(H|W)P(W) + P(H|~W)P(~W), where ~W means not-W (any other event), by the law of total probability.

If P(H)=1/2 and P(H|W)=1/3, as you claim, then we have

1/2 = (1/3)P(W) + P(H|~W)(1 - P(W))

P(H|~W) should be 0, since we know she will be awakened if heads. But that leads to P(W)=3/2.

P(W) should be 1, but that leads to the equation 1/2 = 1/3.

So, this is a big mess.

The reason it is a big mess is that the 1⁄3 solution was derived by treating one random variable as two.
I already addressed this elsewhere. The problem is that W is not a boolean, it’s a probability distribution over observer moments, so P(W) and P(~W) are undefined (type errors).
At one point in your post you said “For convenience let us say that the event W is being woken” and then later on you suggest W is something else, but I don’t see where you really defined it.
You’re saying W itself is a probability distribution. What probability distribution? Can you be specific?
P(H) and P(H|W) are probabilities. It’s unclear to me how those can be well defined and yet the law of total probability not apply.
Suppose we write out SB as a world-program:

SleepingBeauty(S(I)) =
{
    coin = rnd({"H","T"})
    S("starting the experiment now")
    if (coin == "H"):
        S("you just woke up")      # heads: woken once
    else:
        S("you just woke up")      # tails: woken twice
        S("you just woke up")
    S("the experiment's over now")
    return 0
}
This notation is from decision theory; S is Sleeping Beauty’s chosen strategy, a function which takes as arguments all the observations, including memories, which Sleeping Beauty has access to at that point, and returns the value of any decision SB makes. (In this case, the scenario doesn’t actually do anything with SB’s answers, so the program ignores them.)

An observer-moment is a complete state of the program at a point where S is executed, including the arguments to S. Now, take all the possible observer-moments, weighted by the expected number of times that a given run of SleepingBeauty contains that observer-moment. To condition on some information, take the subset of those observer-moments which match that information. So, P(coin=heads|I="you just woke up") means: of all the calls to S where I="you just woke up", weighted by probability of occurrence, what fraction of them are on the heads branch? This is 1⁄3. On the other hand, P(coin=heads|I="the experiment's over now") = 1/2.
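A minimal sketch of that counting procedure, in Python (the branch contents below just transcribe the world-program above, with each coin outcome weighted 1/2; this is an illustration, not a unique formalization):

import fractions

# Observer-moments of each branch, weighted by the probability of the
# run that contains them. The tails branch contains the observation
# "you just woke up" twice, so it is counted twice within that run.
branches = {
    "H": ["starting the experiment now",
          "you just woke up",
          "the experiment's over now"],
    "T": ["starting the experiment now",
          "you just woke up",
          "you just woke up",
          "the experiment's over now"],
}

def p_heads_given(observation):
    # Of all calls to S with this argument, weighted by run probability,
    # what fraction lie on the heads branch?
    total = heads = fractions.Fraction(0)
    for coin, calls in branches.items():
        for obs in calls:
            if obs == observation:
                total += fractions.Fraction(1, 2)
                if coin == "H":
                    heads += fractions.Fraction(1, 2)
    return heads / total

print(p_heads_given("you just woke up"))           # 1/3
print(p_heads_given("the experiment's over now"))  # 1/2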
“Of course P(W) isn’t bound within [0,1]”

Of course! (?) You derived P(W) using probability laws, i.e., solving for it in this equation: P(H)=P(H|W)P(W), where P(H)=1/2 and P(H|W)=1/3. These are probabilities. And your 1⁄3 solution proves there is an error.
If two variables have correlation of 1, I think you could argue that they are the same (they contain the same quantitative information, at least).
“On your argumentation, you would assert confidently that the coin is fair beforehand, yet also assert that the conditional probability that you wake on Monday depends on the coin flip, when in either branch you are woken then with probability 1.”
No. You will wake on Monday with probability one. But, on a randomly selected awakening, it is more likely that it’s Monday&Heads than Monday&Tails, because you are on the Heads path in 50% of experiments.
What is this random selection procedure you use in the last para?
(“I select an awakening, but I can’t tell which” is the same statement as “Each awakening has probability 1/3” and describes SB’s epistemic situation.)
Random doesn’t necessarily mean uniform. When Beauty wakes up, she knows she is somewhere on the heads path with probability .5, and somewhere on the tails path with probability .5. If tails, she also knows it’s either Monday or Tuesday, and from her perspective, she should treat those days as equally likely (since she has no way of distinguishing). Thus, the distribution from which we would select an awakening at random has probabilities 0.5 (Monday&Heads), 0.25 (Monday&Tails) and 0.25 (Tuesday&Tails).
This appears to be where you are getting confused. Your probability tree in your post was incorrect. It should look like this: flip the coin; on Heads (probability 1/2) there is a single awakening, on Monday; on Tails (probability 1/2) there are two awakenings, Monday and then Tuesday. If you think about writing a program to simulate the experiment this should be obvious.
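(A minimal sketch of such a simulation, in Python, assuming the standard setup of one awakening on heads and two on tails:)

import random

# Count awakenings of each type over many simulated runs.
counts = {"Monday&Heads": 0, "Monday&Tails": 0, "Tuesday&Tails": 0}
runs = 100000
for _ in range(runs):
    if random.random() < 0.5:          # Heads: one awakening
        counts["Monday&Heads"] += 1
    else:                              # Tails: two awakenings
        counts["Monday&Tails"] += 1
        counts["Tuesday&Tails"] += 1

total = sum(counts.values())           # roughly 1.5 awakenings per run
for kind, n in counts.items():
    print(kind, n / total)             # each roughly 1/3
print("fraction of runs that were heads:", counts["Monday&Heads"] / runs)  # roughly 1/2

Each awakening type comes out to roughly a third of all awakenings, even though heads runs are half of all runs.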
No, because my probability tree was meant to reflect how Beauty should view the probabilities at the time of an awakening. From that perspective, your tree would be incorrect (as two awakenings cannot happen at one time).
After the 1000 experiments, you divided 500 by 2 - getting 250. You should have multiplied 500 by 2 - getting 1000 tails observations in total. It seems like a simple-enough math mistake.
No, that’s not what I did. I’ll assume that you are smart enough to understand what I did, and I just did a poor job of explaining it. So I don’t know if it’s worth trying again. But basically, my probability tree was meant to reflect how Beauty should view the state of the world on an awakening. It was not meant to reflect how data would be generated if we saw the experiment through to the end. I thought it would be useful. But you can scrap that whole thing and my other arguments hold.
Well you did divide 500 by 2 - getting 250. And you should have multiplied the 500 tails events by 2 (the number of interviews that were conducted after each “tails” event) - getting 1000 “tails” interviews in total. 250 has nothing to do with this problem.
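(Spelling out those counts: in 1000 experiments, about 500 heads runs produce 500 interviews, and about 500 tails runs produce 500 × 2 = 1000 interviews; so 500 of the 1500 total interviews, i.e. one third, follow heads.)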
No, P(H)=P(H|W)P(W) is incorrect because the W in P(H|W) is different than the W in P(W): the former is a probability distribution over a set of three events, while the latter is a boolean. Using the former definition, as a probability distribution, P(W) isn’t meaningful at all, it’s just a type error.
It isn’t a probability; the only use of it was to note the method leading to a 1⁄2 solution and where I consider it to fail, specifically because the number of times you are woken is not bound in [0,1] and thus “P(W)” as used in the 1⁄2 conditioning is malformed, as it doesn’t keep track of when you’re actually woken up. Inasmuch as it is anything, using the 1⁄2 argumentation, “P(W)” is the expected number of wakings.
“No. You will wake on Monday with probability one. But, on a randomly selected awakening, it is more likely that it’s Monday&Heads than Monday&Tails, because you are on the Heads path in 50% of experiments.”
Sorry, but if we’re randomly selecting a waking then it is not true that you’re on the heads path 50% of the time. In a pair of runs, one head, one tail, you are woken 3 times, twice on the tails path.
On a randomly selected run of the experiment, there is a 1⁄2 chance of being in either branch, but:
Choose a uniformly random waking in a uniformly chosen random run
is not the same as
Choose a uniformly random waking.
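(Concretely, with the pair of runs above, one heads and one tails, three wakings in all: choosing a run uniformly and then a waking within it gives Monday&Heads probability 1/2 and Monday&Tails and Tuesday&Tails probability 1/4 each, exactly the 0.5/0.25/0.25 split described earlier; choosing one of the three wakings uniformly gives each probability 1/3.)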
Why are you using the notation P(W) when you mean E(W)? And if you can get an expectation for it, you must know the probability of it.
“Sorry, but if we’re randomly selecting a waking then it is not true that you’re on the heads path 50% of the time. In a pair of runs, one head, one tail, you are woken 3 times, twice on the tails path.”
Randomly selecting a waking does not imply a uniform distribution. On the contrary, we know the distribution is not uniform.