neq1 comments on Conditioning on Observers

neq1 11 May 2010 11:35 UTC
−2 points
“Under these numbers, the 1000 observations made have required 500 heads and 250 tails, as each tail produces both an observation on Monday and Tuesday.”

I must have been unclear in explaining my probability tree. The tree represents how Beauty should view things on an awakening. I thought it would be helpful. Apparently it just created more confusion (although some people got it).

“P(Monday|W) = 2/3”

Why? I believe it is 0.75. How did you come up with 2/3?
- Joanna Morningstar 11 May 2010 12:22 UTC
  2 points
  Parent
  P(Monday ∩ H | W) = P(Monday ∩ T | W). Regardless of whether the coin came up heads or tails you will be woken on Monday precisely once.
  
  P(Monday ∩ T | W) = P(Tuesday ∩ T | W), because if tails comes up you are surely woken on both Monday and Tuesday.
  
  You still seem to be holding on to the claim that there are as many observations after a head as after a tail; this is clearly false. There isn’t a half measure of observation to spread across the tails branch of the experiment; this is made clearer in Sleeping Twins and the Probabilistic Sleeping Beauty problems.
  
  Once Sleeping Beauty is normalised so that there is at most one observation per “individual” in the experiment, it seems far harder to justify the ¹⁄₂ answer. The fact of the matter is that your use of P(W) = 1 is causing grief, as on these problems you should consider E(#W) instead, because P(W) is not linear.
  
  What is your credence in the Probabilistic Sleeping Beauty problem?
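A minimal sketch of the E(#W) bookkeeping described in this comment, assuming standard Sleeping Beauty (heads: woken on Monday only; tails: woken on Monday and Tuesday). It tallies the expected number of awakenings per experiment in each (day, coin) cell and normalises by the total. The names `expected_wakings` and `share` are made up for illustration, and the sketch only shows where the 2/3 comes from under this counting convention.

```python
from fractions import Fraction

half = Fraction(1, 2)  # fair coin

# Expected number of awakenings per experiment in each (day, coin) cell.
expected_wakings = {
    ("Monday", "H"): half * 1,
    ("Monday", "T"): half * 1,
    ("Tuesday", "T"): half * 1,
    ("Tuesday", "H"): half * 0,  # never woken on Tuesday after heads
}

total = sum(expected_wakings.values())  # E(#W)

# Share of all awakenings that fall in each cell.
share = {cell: count / total for cell, count in expected_wakings.items()}

p_monday = share[("Monday", "H")] + share[("Monday", "T")]
p_heads = share[("Monday", "H")] + share[("Tuesday", "H")]

print(total, p_monday, p_heads)  # 3/2 2/3 1/3
```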
  - neq1 11 May 2010 13:04 UTC
    0 points
    Parent
    Probabilistic sleeping beauty
    
    P(H|W)=1/21
    
    Now, let’s change the problem slightly.
    
    The experimenters fix m unique constants, k1,...,km, each in {1,2,...,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1,...,km. If the coin comes up heads and the D20 result is in {k1,...,km}, they will wake you on day 1.
    
    Here, P(H|W)=m/(20+m)
    
    If m is 1 we get ¹⁄₂₁.
    
    If m is 20 we get ¹⁄₂, which is the solution to the sleeping beauty problem.
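The formula above can be sketched as Bayes' rule applied once per experiment, assuming W is read as "woken at least once during the experiment", so that P(W|H) = m/20 and P(W|T) = 1. The function name `p_heads_given_woken` is hypothetical.

```python
from fractions import Fraction


def p_heads_given_woken(m, sides=20):
    """Bayes' rule with W = 'woken at least once in the experiment'.

    Assumes a fair coin, P(W|H) = m/sides and P(W|T) = 1.
    """
    p_h = p_t = Fraction(1, 2)
    p_w_given_h = Fraction(m, sides)
    p_w_given_t = Fraction(1)
    return (p_w_given_h * p_h) / (p_w_given_h * p_h + p_w_given_t * p_t)


for m in (1, 2, 20):
    print(m, p_heads_given_woken(m))
```

With m = 1 this gives 1/21 and with m = 20 it gives 1/2, matching the two cases quoted above.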
    - Joanna Morningstar 11 May 2010 15:22 UTC
      0 points
      Parent
      The point of the PSB problem is that the approach you’ve just outlined is indefensible.
      
      You agree that for each single constant k_i P(H|W) = ¹⁄₂₁. Uncertainty over which constant k_i is used does not alter this.
      
      So if I run PSB 20 times, you would assert in each run that P(H|W) = ¹⁄₂₁. So now I simply keep you sedated between experiments. Statistically, 20 runs yields you SB, and each time you answered with ¹⁄₂₁ as your credence. Does this not faze you at all?
      
      You have a scenario A where you assert foo with credence P, and scenario B where you also assert foo with credence P, yet if I put you in scenario A and then scenario B, keeping you sedated in the meantime, you do not assert foo with credence P...
      - neq1 12 May 2010 12:47 UTC
        0 points
        Parent
        Jonathan,
        
        In this problem:
        
        The experimenters fix m unique constants, k1,...,km, each in {1,2,...,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1,...,km. If the coin comes up heads and the D20 result is in {k1,...,km}, they will wake you on day 1.
        
        Do you agree that P(H|W)=m/(20+m) ? If not, why not?
        
        Do you also agree that when m=20 we have the sleeping beauty problem (with 20 wake ups instead of 2 for tails)? If not, why not?
        Joanna Morningstar 12 May 2010 19:23 UTC
        0 points
        Parent
        No. I assert P(H|W) = ¹⁄₂₁ in this case.
        
        Two ways of seeing this: Either calculate the expected number of wakings conditional on the coin flip (m/20 and m for H and T). [As in SB]
        
        Alternatively consider this as m copies of the single constant game, with uncertainty on each waking as to which one you’re playing. All m single constant games are equally likely, and all have P(H|W) = ¹⁄₂₁. [The hoped for PSB intuition-pump]
        neq1 12 May 2010 22:15 UTC
        0 points
        Parent
        I need more clarification. Sorry. I do think we’re getting somewhere...
        
        The experimenters fix 2 unique constants, k1,k2, each in {1,2,..,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1 and k2. If the coin comes up heads and the D20 that comes up is in {k1,k2}, they will wake you on day 1.
        
        Do you agree that P(H|W)=2/22 in this case?
        Morendil 13 May 2010 9:02 UTC
        0 points
        Parent
        I do.
        Joanna Morningstar 13 May 2010 8:19 UTC
        0 points
        Parent
        No; P(H|W) = ¹⁄₂₁
        
        Multiple ways to see this: 1) Under heads, I expect to be woken ¹⁄₁₀ of the time. Under tails, I expect to be woken twice. Hence on average, for every waking after a head I am woken 20 times after a tail. Ergo ¹⁄₂₁.
        
        2) Internally split the game into 2 single constant games, one for k1 and one for k2. We can simply play them sequentially (with the same die roll). When I am woken I do not know which of the two games I am playing. We both agree that in the single constant game P(H|W) = ¹⁄₂₁.
        
        It’s reasonably clear that playing two single constant games in series (with the same die roll and coin flip) reproduces the 2 constant game. The correlation between the roll and flip in the two games doesn’t affect the expectations, and since you have complete uncertainty over which game you’re in (c/o amnesia), the correlation of your current state with a state you have no information on is irrelevant.
        
        P(H|W ∩ game i) = ¹⁄₂₁, so P(H|W) = ¹⁄₂₁, as the union over all i of (W ∩ game i) is W. At some level this is why I introduced PSB, it seems clearer that this should be the case when the number of wakings is bounded to 1.
        
        3) Being woken implies either W1 or W2 (currently being woken for the first time or the second time) has occurred. In general note that the expected count of something is a probability (and vice versa) if the number of times the event occurs is in {0,1} (trivial using the frequentist def of probability; under the credence view it’s true for betting reasons).
        
        P(W1 | H) = ¹⁄₁₀, P(W2 | H) = 0; P(W1 | T) = 1, P(W2 | T) = 1, from the experimental setup.
        
        Hence P(H|W1) = ¹⁄₁₁, P(H|W2) = 0. You’re woken in ¹¹⁄₂₀ of experiments for the first time and in ¹⁄₂ of experiments for the second, so P(W1 | I am woken) = ¹¹⁄₂₁.
        
        P(H | I am woken ) = P(H ∩ W1 | I am woken ) + P(H ∩ W2 | I am woken ) = P(H | W1 ∩ I am woken).P(W1 | I am woken) + 0 = ¹⁄₁₁ . ¹¹⁄₂₁ = ¹⁄₂₁.
        
        The issue you’ve raised with this seems to be that you would either set P(W1 | I am woken) = 1, or set P(W1 | T) = P(W2 | T) = ¹⁄₂ [ so P(H|W1) = ¹⁄₆ ] and set P(W1 | I am woken) = ⁶⁄₁₁.
        
        My problem with this is that if P(W1 | I am woken) ≠ ¹¹⁄₂₁, you’re poorly calibrated. Your position appears to be that this is because you’re being “forced to make the bet twice in some circumstances but not others”. Hence what you’re doing is clipping the number of times a bet is made to {0,1}, at which point expected counts of outcomes are probabilities of outcomes. I think such an approach is wrong, because the underlying problem is that the counts of event occurrences conditional on H or T aren’t constrained to be in {0,1} anymore. This is why I’m not concerned about the “probabilities” being over-unity. Indeed you’d expect them to be over-unity, because the long run number of wakings exceeds the long run number of experiments. In the limit you get a well-defined over-unity probability, under the frequentist view. Betting odds aren’t constrained in [0,1] either, so again you wouldn’t expect credence to stay in [0,1]. It is bounded in [0,2] in SB or your experiment, because the maximum number of winning events in a branch is 2.
        
        As I see it, the ¹⁄₂₁ answer (or ¹⁄₃ in SB) is the only plausible answer because it holds when we stack up multiple runs of the experiment in series or equivalently have uncertainty over which constant is being used in PSB. The ¹⁄₁₁ (equiv. ¹⁄₂) answer doesn’t have this property, as is seen from ¹⁄₂₁ going to ¹⁄₁₁ from nothing but running two experiments of identical expected behaviour in series...
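The two figures in dispute here can be read as two different tallies over repeated runs of the 2 constant game. The following is a rough simulation sketch, not either commenter's own method: it counts the fraction of awakenings that follow heads, and separately the fraction of experiments with at least one awakening in which the coin was heads. The constants `k=(3, 7)`, the seed, and the function name `simulate` are arbitrary choices for illustration.

```python
import random


def simulate(n_experiments, k=(3, 7), sides=20, seed=0):
    """Simulate the 2 constant game and tally heads in two ways."""
    rng = random.Random(seed)
    heads_wakings = total_wakings = 0
    heads_woken_runs = woken_runs = 0
    for _ in range(n_experiments):
        heads = rng.random() < 0.5
        roll = rng.randint(1, sides)
        if heads:
            wakings = 1 if roll in k else 0  # woken once, on day 1
        else:
            wakings = len(k)                 # woken on days k1 and k2
        total_wakings += wakings
        if heads:
            heads_wakings += wakings
        if wakings > 0:
            woken_runs += 1
            if heads:
                heads_woken_runs += 1
    per_waking = heads_wakings / total_wakings
    per_run = heads_woken_runs / woken_runs
    return per_waking, per_run


per_waking, per_run = simulate(200_000)
print(per_waking, per_run)
```

The per-awakening tally should land near 1/21 and the per-experiment tally near 1/11. The simulation does not settle which of the two is the right notion of credence; it only shows what each number counts.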
        neq1 13 May 2010 10:03 UTC
        0 points
        Parent
        Credence isn’t constrained to be in [0,1]???
        
        It seems to me that you are working very hard to justify your solution. It’s a solution by argument/intuition. Why don’t you just do the math?
        
        The experimenters fix 2 unique constants, k1,k2, each in {1,2,..,20}, sedate you, roll a D20 and flip a coin. If the coin comes up tails, they will wake you on days k1 and k2. If the coin comes up heads and the D20 that comes up is in {k1,k2}, they will wake you on day 1.
        
        I just used Bayes rule. W is an awakening. We want to know P(H|W), because the question is about her subjective probability when (if) she is woken up.
        
        To get P(H|W), we need the following:
        
        P(W|H)=2/20 (if heads, wake up if D20 landed on k1 or k2)
        
        P(H)=1/2 (fair coin)
        
        P(W|T)=1 (if tails, woken up regardless of the result of the D20 roll)
        
        P(T)=1/2 (fair coin)
        
        Using Bayes rule, we get:
        
        P(H|W)=(2/20)(1/2) / [(2/20)(1/2)+(1)*(1/2)] = ¹⁄₁₁
        
        With your approach, you avoid directly applying Bayes’ theorem, and you argue that it’s ok for credence to be outside of [0,1]. This suggests to me that you are trying to derive a solution that matches your intuition. My suggestion is to let the math speak, and then to figure out why your intuition is wrong.
        Joanna Morningstar 13 May 2010 12:22 UTC
        0 points
        Parent
        You and I both agree on Bayes implying ¹⁄₂₁ in the single constant case. Consider the 2 constant game as 2 single constant games in series, with uncertainty over which one you are in (k1 and k2 being the mutually exclusive events “this is the k1/k2 game”):
        
        P(H | W) = P(H ∩ k1|W) + P(H ∩ k2|W) = P(H | k1 ∩ W)P(k1|W) + P(H|k2 ∩ W)P(k2|W) = ¹⁄₂₁ . ¹⁄₂ + ¹⁄₂₁ . ¹⁄₂ = ¹⁄₂₁
        
        This is the logic that to me drives PSB to SB and the ¹⁄₃ solution. I worked it through in SB by conditioning on the day (slightly different but not substantially).
        
        I have had a realisation. You work directly with W, I work with subsets of W that can only occur at most once in each branch and apply total probability.
        
        Formally, I think what is going on is this: (Working with simple SB) We have a sample space S = {H,T}
        
        “You have been woken” is not an event, in the sense of being a set of experimental outcomes. “You will be woken at least once” is, but these are not the same thing.
        
        “You will be woken at least once” is a nice straightforward event, in the sense of being a set of experimental outcomes {H,T}. “You have been woken” should be considered formally as the multiset {H,T,T}. Formally just working through with multisets wherever sets are used as events in probability theory, we recover all of the standard theorems (including Bayes) without issue.
        
        What changes is that since P(S) = 1, and there are multisets X such that X contains S, P(X) > 1.
        
        Hence P({H,T,T}) = ³⁄₂; P({H}|{H,T,T}) = ¹⁄₃.
        
        In the 2 constant PSB setup you suggest, we have S = {H,T} x {1,...,20} and W = {(H,k1),(H,k2), (T,1),(T,1),(T,2),(T,2),...,(T,20),(T,20)}
        
        And P(H|W) = ¹⁄₂₁ without issue.
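A small sketch of the multiset bookkeeping, using `collections.Counter` as the multiset. One assumption is mine rather than the comment's: conditioning is implemented by restricting the multiset W to the outcomes that satisfy a predicate while keeping their multiplicities. The helper names `measure` and `conditional` and the placeholder values for k1 and k2 are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def measure(multiset, weight):
    """Total weight of a multiset of outcomes, counted with multiplicity."""
    return sum(weight[o] * n for o, n in multiset.items())


def conditional(predicate, multiset, weight):
    """Weight of the members satisfying `predicate`, as a share of the whole."""
    kept = Counter({o: n for o, n in multiset.items() if predicate(o)})
    return measure(kept, weight) / measure(multiset, weight)


# Simple SB: S = {H, T}, W = {H, T, T}.
sb_weight = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
sb_w = Counter({"H": 1, "T": 2})
p_w = measure(sb_w, sb_weight)
p_h_sb = conditional(lambda o: o == "H", sb_w, sb_weight)

# 2 constant PSB: S = {H, T} x {1..20}; k1 and k2 are arbitrary placeholders.
k1, k2 = 3, 7
psb_weight = {(c, d): Fraction(1, 40) for c in "HT" for d in range(1, 21)}
psb_w = Counter({("H", k1): 1, ("H", k2): 1})
psb_w.update({("T", d): 2 for d in range(1, 21)})
p_h_psb = conditional(lambda o: o[0] == "H", psb_w, psb_weight)

print(p_w, p_h_sb, p_h_psb)  # 3/2 1/3 1/21
```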
        
        My statement is that this more accurately represents the experimental setup; when you wake, conditioned on all background information, you don’t know how many times you’ve been woken before, but this changes the conditional probabilities of H and T. If you merely use background knowledge of “You have been woken at least once”, and squash all of the events “You are woken for the nth time” into a single event by using union on the events, then you discard information.
        
        This is closely related to my earlier (intuition) that the problem was something to do with linearity.
        
        In sets, union and intersection are only linear when working on some collection of atomic sets, but are generally linear in multisets. [eg. (A ∪ B) \ B ≠ A in general in sets]
        
        Observe that the approach I take of splitting “events” down to disjoint things that occur at most once is precisely taking a multiset event apart into well behaved events and then applying probability theory.
        
        What was concerning me is that the true claim that P({H,T}|T) = 1 seemed to discard pertinent information (ie the potential for waking on the second day). With W as the multiset {H,T,T}, P(W|T) = 2. You can regard this as the expected number of times you see Tails, or the extension of probability to multisets.
        
        The difference in approach is that you have to put the double counting of waking given tails in as a boost to payoffs given Tails, which seems odd as from the point of view of you having just been woken you are being offered immediate take-it-or-leave-it odds. This is made clearer by looking at the twins scenario; each person is offered at most one bet.
      - neq1 11 May 2010 16:06 UTC
        0 points
        Parent
        
        So if I run PSB 20 times, you would assert in each run that P(H|W) = ¹⁄₂₁. So now I simply keep you sedated between experiments.
        
        You just changed the problem. If you wake me up between runs of PSB, then P(H|W)=1/21 each time. If not, I have different information to condition on.
        Joanna Morningstar 11 May 2010 16:57 UTC
        0 points
        Parent
        No; between sedation and amnesia you know nothing but the fact that you’ve been woken up, and that 20 runs of this experiment are to be performed.
        
        Why would an earlier independent trial have any impact on you or your credences, when you can neither remember it nor be influenced by it?
        neq1 11 May 2010 17:05 UTC
        0 points
        Parent
        I don’t know. It’s a much more complicated problem, because you have 20 coin flips (if I understand the problem correctly). I haven’t taken the time to work through the math yet. It’s not obvious to me, though, why this corresponds to the sleeping beauty problem. In fact, it seems pretty clear that it doesn’t.
        Joanna Morningstar 11 May 2010 17:35 UTC
        0 points
        Parent
        The reason it corresponds to Sleeping Beauty is that in the limit of a large number of trials, we can consider blocks of 20 trials where heads was the flip and all values of the die roll occurred, and similar blocks for tails, and have some epsilon proportion left over. (WLLN)
        
        Each of those blocks corresponds to Sleeping Beauty under heads/tails.
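The block argument can be checked by direct counting. This is a sketch assuming the single constant game with a fixed constant `k` (the value 7 is arbitrary): a heads block is 20 heads runs in which each die value occurs exactly once, and a tails block is 20 tails runs.

```python
k = 7       # the single fixed constant; any value in 1..20 works
sides = 20

# Heads block: you are woken only in the one run where the roll equals k.
heads_block_wakings = sum(1 for roll in range(1, sides + 1) if roll == k)

# Tails block: every run wakes you once (on day k), whatever the roll.
tails_block_wakings = sum(1 for roll in range(1, sides + 1))

print(heads_block_wakings, tails_block_wakings)  # 1 20
```

One awakening per heads block against twenty per tails block is the same ratio as the variant of Sleeping Beauty with 20 wake-ups under tails that was mentioned earlier in the thread.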
  - neq1 11 May 2010 12:39 UTC
    0 points
    Parent
    
    You still seem to be holding on to the claim that there are as many observations after a head as after a tail; this is clearly false.
    
    No. I never made that claim, so I cannot “hold on to it”. The number of observations after tails doesn’t matter here.
    
    P(Monday ∩ H | W) = P(Monday ∩ T | W). Regardless of whether the coin came up heads or tails you will be woken on Monday precisely once.
    
    Imagine we repeat the sleeping beauty experiment many times. On half of the experiments, she’d be on the heads path. On half of the experiments, she’d be on the tails path. If she is on the tails path, it could be either Monday or Tuesday. Thus, on an awakening Monday ∩ T is less likely than Monday ∩ H.
    - Joanna Morningstar 11 May 2010 15:12 UTC
      0 points
      Parent
      The claim is implied by your logic; the fact that you don’t engage with it does not prevent it from being a consequence that you need to deal with. Furthermore it appears to be the intuition by which you are constructing your models of Sleeping Beauty.
      
      Imagine we repeat the sleeping beauty experiment many times. On half of the experiments, she’d be on the heads path. On half of the experiments, she’d be on the tails path.
      
      Granted; no contest
      
      If she is on the tails path, it could be either Monday or Tuesday.
      
      And assuredly she will be woken on both days in any given experimental run. She will be woken twice. Both events occur whenever tails comes up. P(You will be woken on Monday | Tails) = P(You will be woken on Tuesday | Tails) = 1
      
      The arrangement that you are putting forward as a model is that Sleeping Beauty is to be woken once and only once regardless of the coin flip, and thus if she could wake on Tuesday given Tails occurred then that must reduce the chance of her waking on Monday given that Tails occurred. However in the Sleeping Beauty problem the number of wakings is not constant. This is the fundamental problem in your approach.
      - neq1 11 May 2010 15:56 UTC
        −1 points
        Parent
        I think we could make faster progress if you started with the assumption that I have read and understood the problem. Yes, I know that she is woken up twice when tails.
        
        You agree that
        
        On half of the experiments, she’d be on the heads path. On half of the experiments, she’d be on the tails path.
        
        Given that she is awake right now, what should be her state of mind? Well, she knows if heads it’s Monday. She knows if tails it’s either Monday or Tuesday. The fact that she will be (or has been) woken up on both days doesn’t matter to her right now. It’s either Monday or Tuesday. Given that she cannot distinguish between the two, it would make sense for her to think of those as equally likely at this awakening (under tails). Thus, P(Monday|T,W)=1/2, P(T|W)=1/2, P(Monday ∩ T | W)=1/4.
        
        The problem with your ¹⁄₃ solution is you treat the data as if they are counts in a 3 by 1 contingency table (the 3 cells being Monday&H, Monday&T, Tuesday&T). If the counts were the result of independent draws from a multinomial distribution, you would get p(H|W)=1/3. You have dependent draws though. You have 1 degree of freedom instead of the usual 2 degrees of freedom. That’s why your ratio is not a probability. That’s why your solution results in nonsense like p(W)=3/2.
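The probability tree behind these numbers can be written out explicitly. This is a sketch of the tree as described in this comment, assuming the first branch is the coin (weighted per experiment) and the second branch is the day at the current awakening, split evenly under tails. The names `p_coin`, `p_day_given_coin` and `leaves` are illustrative only.

```python
from fractions import Fraction

half = Fraction(1, 2)

# First branch: the coin, weighted per experiment rather than per awakening.
p_coin = {"H": half, "T": half}

# Second branch: which day it is at this awakening, given the coin.
p_day_given_coin = {
    "H": {"Monday": Fraction(1)},
    "T": {"Monday": half, "Tuesday": half},
}

# Leaf probabilities: P(day and coin | W) under this model.
leaves = {
    (day, coin): p_coin[coin] * p_day
    for coin, days in p_day_given_coin.items()
    for day, p_day in days.items()
}

p_monday = sum(p for (day, _), p in leaves.items() if day == "Monday")
print(leaves[("Monday", "T")], p_monday)  # 1/4 3/4
```

Under this tree P(Monday|W) comes out at 3/4, which is the 0.75 figure given at the top of the thread.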
        Joanna Morningstar 11 May 2010 18:00 UTC
        1 point
        Parent
        As I see it, initially (as a prior, before considering that I’ve been woken up), both Heads and Tails are equally likely, and it is equally likely to be either day. Since I’ve been woken up, I know that it’s not (Tuesday ∩ Heads), but I gain no further information.
        
        Hence the 3 remaining probabilities are renormalised to ¹⁄₃.
        
        Alternatively: I wake up; I know from the setup that I will be in this subjective state once under Heads and twice under Tails, and they are a priori equally likely. I have no data that can distinguish between the three states of identical subjective state, so my posterior is uniform over them.
        
        If she knows it’s Tuesday then it’s Tails. If she knows it’s Monday then she learns nothing of the coin flip. If she knows the flip was Tails then she is indifferent to Monday and Tuesday. ¹⁄₃ drops out as the only consistent answer at that point.
        neq1 11 May 2010 19:25 UTC
        0 points
        Parent
        
        As I see it, initially (as a prior, before considering that I’ve been woken up), both Heads and Tails are equally likely, and it is equally likely to be either day.
        
        It’s not equally likely to be either day. If I am awake, it’s more likely that it’s Monday, since that always occurs under heads, and will occur on half of tails awakenings.
        
        I know from the setup that I will be in this subjective state once under Heads and twice under Tails, and they are a priori equally likely. I have no data that can distinguish between the three states of identical subjective state, so my posterior is uniform over them.
        
        Heads and tails are equally likely, a priori, yes. It is equally likely that you will be woken up twice as it is that you will be woken up once. Yes. That’s true. But we are talking about your state of mind on an awakening. It can’t be both Monday and Tuesday. So, what should your subjective probability be? Well, I know it’s tails and (Monday or Tuesday) with probability 0.5. I know it’s heads and Monday with probability 0.5.
        Joanna Morningstar 12 May 2010 7:35 UTC
        0 points
        Parent
        Before I am woken up, my prior belief is that I spend 24 hours on Monday and 24 on Tuesday regardless of the coin flip. Hence before I condition on waking, my probabilities are ¹⁄₄ in each cell.
        
        When I wake, one cell is driven to 0, and there is no information to distinguish the remaining 3. This is the point that the sleeping twins problem was intended to illuminate.
        
        Given awakenings that I know to be on Monday, there are two histories with the same measure. They are equally likely. If I run the experiment and count the number of events Monday ∩ H and Monday ∩ T, I will get the same numbers (mod. epsilon errors). Your assertion that it’s H/T with probability 0.5 is false given that you have woken. Hence sleeping twins.
        timtyler 11 May 2010 19:38 UTC
        0 points
        Parent
        That is Beauty’s probability of which day it is AFTER considering that she has been woken up.