It ultimately depends on how you define probabilities, and it is possible to define them such that the answer is 1⁄2.
I personally think that the only “good” definition (I’ll specify this more at the end) is that a probability of 1⁄4 should occur one in four times in the relevant reference class. I’ve previously called this view “generalized frequentism”, where we use the idea of repeated experiments to define probabilities, but generalize the notion of “experiment” to subsume all instances of an agent with incomplete information acting in the real world (hence subsuming the definition as subjective confidence). So when you flip a coin, the experiment is not the mathematical coin with two equally likely outcomes, but the situation where you as an agent are flipping a physical coin, which may include a 0.01% probability of landing on its side, or a 10^−15 probability of breaking in two halves mid-air, or whatever. But the probability of it coming up heads should be about 1⁄2, because in about 1⁄2 of the cases where you as an agent are about to flip a physical coin, you subsequently observe it coming up heads.
There are difficulties here with defining the reference class, but I think they can be adequately addressed, and anyway, they don’t matter for the Sleeping Beauty experiment, because there the reference class is actually really straightforward. Among the times that you as an agent are participating in the experiment and are woken up and interviewed (and are called Sleeping Beauty, if you want to include this in the reference class), one third will have the coin heads, so the probability is 1⁄3. This is true regardless of whether the experiment is run repeatedly throughout history, or repeatedly because of Many Worlds, or an infinite universe, etc. (And I think the very few cases in which there is genuinely not a repeated experiment are in fact qualitatively different, since then we’re talking about logical uncertainty rather than probability, and this distinction is how you can answer 1⁄3 in Sleeping Beauty without being forced to answer 1⁄1,000,000 on the Presumptuous Philosopher problem.)
So, re: this being the only “good” definition: one thing is that it fits betting odds, but I also suspect that most smart people would eventually converge on an interpretation with these properties if they thought long enough about the nature of probability and the implications of having a different definition, though obviously I can’t prove this. I’m not aware of any case where I want to define probability differently, anyway.
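For concreteness, here is a quick sketch of this counting (hypothetical code, not from the original discussion; it assumes the standard protocol of a fair coin, one awakening on heads and two on tails):

```python
import random

random.seed(0)

heads_awakenings = 0
total_awakenings = 0

for _ in range(100_000):  # one run of the experiment per iteration
    heads = random.random() < 0.5  # fair coin
    wakes = 1 if heads else 2      # heads: Monday only; tails: Monday and Tuesday
    total_awakenings += wakes
    if heads:
        heads_awakenings += wakes

# Fraction of awakenings at which the coin is heads
print(heads_awakenings / total_awakenings)  # ≈ 1/3
```

The unit being tallied is the awakening, not the coin flip; that choice of unit is the whole disagreement below.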
So in this case, I agree that if this experiment were repeated many times and every Sleeping Beauty created answered tails, the reference class of Sleeping Beauty agents would have many more correct answers than if every Sleeping Beauty created answered heads.
I think there’s something tangible here and I should reflect on it.
I separately think though that if the actual outcome of each coin flip was recorded, there would be a roughly equal distribution between heads and tails.
And when I was thinking through the question before, it was always about trying to answer a question regarding the actual outcome of the coin flip, not about what strategy maximises monetary payoffs under even bets.
While I do think that betting odds aren’t convincing re: actual probabilities, because you can just have asymmetric payoffs on equally probable, mutually exclusive, and jointly exhaustive events, the “reference class of agents being asked this question” seems like a more robust rebuttal.
I want to take some time to think on this.
Strong-upvoted because this argument genuinely makes me think I might be wrong here.
Much less confident now, and mostly confused.
I separately think though that if the actual outcome of each coin flip was recorded, there would be a roughly equal distribution between heads and tails.
Importantly, this is counting each coinflip as the “experiment”, whereas the above counts each awakening as the “experiment”. It’s okay that different experiments would see different outcome frequencies.
Yes.
If you record the moments when the outside observer sees the coin landing, you will get 1⁄2.
If you record the moments when the Sleeping Beauty, right after making her bet, is told the actual outcome, you will get 1⁄3.
So we get 1⁄2 by identifying with the outside observer, but he is not the one who was asked in this experiment.
Unless you change the rules so that the Sleeping Beauty is only rewarded for the correct bet at the end of the week, and will only get one reward even if she made two (presumably identical) bets. In that case, recording the moment when the Sleeping Beauty gets the reward or not, you will again get 1⁄2.
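These three bookkeeping schemes can be sketched side by side (hypothetical code, assuming a fair coin and the standard protocol; the variable names are mine):

```python
import random

random.seed(1)
N = 100_000

observer_heads = 0     # recorded once per coin flip, by the outside observer
awakening_heads = 0    # recorded once per awakening, when Beauty is told the outcome
awakening_total = 0
end_of_week_heads = 0  # recorded once per experiment, when a single reward is settled

for _ in range(N):
    heads = random.random() < 0.5
    wakes = 1 if heads else 2

    observer_heads += heads
    awakening_total += wakes
    if heads:
        awakening_heads += wakes
    end_of_week_heads += heads  # one settlement, however many bets were made

print(observer_heads / N)                 # ≈ 1/2
print(awakening_heads / awakening_total)  # ≈ 1/3
print(end_of_week_heads / N)              # ≈ 1/2
```

The same coin flips produce different frequencies depending purely on which moments get recorded.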
I separately think though that if the actual outcome of each coin flip was recorded, there would be a roughly equal distribution between heads and tails.
What I’d say is that this corresponds to the question, “someone tells you they’re running the Sleeping Beauty experiment and just flipped a coin; what’s the probability that it’s heads?”. Different reference class, different distribution; the probability is now 0.5. But this is different from the original question, where we are Sleeping Beauty.
My current position now is basically:
Harth’s framing was presented as an argument re: the canonical Sleeping Beauty problem.
And the question I need to answer is: “should I accept Harth’s frame?”
I am at least convinced that it is genuinely a question about how we define probability.
There is still a disconnect though.
While I agree with the frequentist answer, it’s not clear to me how to backpropagate this in a Bayesian framework.
Suppose I treat myself as identical to all other agents in the reference class.
I know that my reference class will do better if we answer “tails” when asked about the outcome of the coin toss.
But it’s not obvious to me that there is anything to update from when trying to do a Bayesian probability calculation.
To me, there being many more observers in the tails world doesn’t seem to alter these probabilities at all:
P(waking up)
P(being asked questions)
P(...)
By stipulation my observational evidence is the same in both cases.
And I am not compelled by assuming I should be randomly sampled from all observers.
That there are many more versions of me in this other world does not by itself seem to raise the probability of me witnessing the observational evidence, since by stipulation all versions of me witness the same evidence.
I’m curious how your conception of probability accounts for logical uncertainty?
I count references within each logical possibility and then multiply by their “probability”.
Here’s a super contrived example to explain this. Suppose that if the last digit of pi is between 0 and 3, Sleeping Beauty experiments work as we know them, whereas if it’s between 4 and 9, everyone in the universe is miraculously compelled to interview Sleeping Beauty 100 times if the coin is tails. In this case, I think P(coin heads|interviewed) is 0.4⋅1⁄3 + 0.6⋅1⁄101. So it doesn’t matter how many more instances of the reference class there are in one logical possibility; they don’t get “outside” their branch of the calculation. So in particular, the presumptuous philosopher problem doesn’t care about number of classes at all.
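Spelled out, that is just a weighted average over the two logical possibilities (a sketch; the 0.4/0.6 weights and the per-branch answers come from the contrived example above):

```python
# P(last digit of pi in 0..3) vs. in 4..9 -- four vs. six equally likely digits
p_normal, p_compelled = 0.4, 0.6

# P(heads | interviewed) within each logical possibility
p_heads_normal = 1 / 3       # ordinary Sleeping Beauty counting
p_heads_compelled = 1 / 101  # 1 heads-interview vs. 100 tails-interviews

# Instances stay inside their branch; only the branch weights cross over
p_heads = p_normal * p_heads_normal + p_compelled * p_heads_compelled
print(round(p_heads, 4))  # 0.1393
```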
In practice, it seems super hard to find genuine examples of logical uncertainty and almost everything is repeated anyway. I think the presumptuous philosopher problem is so unintuitive precisely because it’s a rare case of actual logical uncertainty where you genuinely cannot count classes.
I personally think that the only “good” definition (I’ll specify this more at the end) is that a probability of 1⁄4 should occur one in four times in the relevant reference class. I’ve previously called this view “generalized frequentism”, where we use the idea of repeated experiments to define probabilities, but generalize the notion of “experiment” to subsume all instances of an agent with incomplete information acting in the real world (hence subsuming the definition as subjective confidence).
Why do you suddenly substitute the notion of “probability experiment” with the notion of “reference class”? What do you achieve by this?
From my perspective, this is where the source of confusion lingers. A probability experiment can be precisely specified: the description of any probability theory problem is supposed to be exactly that. But a “reference class” is misleading and up for interpretation.
There are difficulties here with defining the reference class, but I think they can be adequately addressed, and anyway, they don’t matter for the Sleeping Beauty experiment, because there the reference class is actually really straightforward. Among the times that you as an agent are participating in the experiment and are woken up and interviewed (and are called Sleeping Beauty, if you want to include this in the reference class), one third will have the coin heads, so the probability is 1⁄3.
And indeed, because of this “reference class” business you suddenly started treating individual awakenings of Sleeping Beauty as mutually exclusive outcomes, even though that’s absolutely not the case in the experiment as stated. I don’t see how you would make such a mistake if you kept using the term “probability experiment” instead of switching to speculation about “reference classes”.
Among the iterations of the Sleeping Beauty probability experiment in which the participant awakes, half the time the coin is Heads, so the probability is 1⁄2.
Here there are no difficulties to address—everything is crystal clear. You just need to calm the instinctive urge to weight the probability by the number of awakenings, which would be talking about a different mathematical concept.
EDIT: @ProgramCrafter the description of the experiment clearly states that when the coin is Tails, the Beauty is to be awakened twice in the same iteration of the experiment. Therefore, individual awakenings are not mutually exclusive with each other: more than one can happen in the same iteration of the experiment.
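To make the contrast with per-awakening counting explicit, here is the same tally with the iteration as the unit (hypothetical code, assuming the standard protocol; note that every iteration contains at least one awakening, so conditioning on the participant awakening excludes nothing):

```python
import random

random.seed(2)
N = 100_000

iterations_with_awakening = 0
heads_iterations = 0

for _ in range(N):  # one iteration of the experiment
    heads = random.random() < 0.5
    # Heads: awakened once; Tails: awakened twice. Either way, awakened.
    iterations_with_awakening += 1
    heads_iterations += heads

print(heads_iterations / iterations_with_awakening)  # ≈ 1/2
```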
Why do you suddenly substitute the notion of “probability experiment” with the notion of “reference class”? What do you achieve by this?
Just to be clear, the reference class here is the set of all instances across all of space and time where an agent is in the same “situation” as you (where the thing you can argue about is how precisely one has to specify the situation). So in the case of the coinflip, it’s all instances across space and time where you flip a physical coin (plus, if you want to specify further, any number of other details about the current situation).
So with that said, to answer your question: why define probabilities in terms of this concept? Because I don’t think I want a definition of probability that doesn’t align with this view, when it’s applicable. If we can discretely count the number of instances across the history of the universe that fit the current situation, and we know some event happens in one third of those instances, then I think the probability has to be one third. This seems very self-evident to me; it seems exactly what the concept of probability is supposed to do.
I guess one analogy—suppose two thirds of all houses are painted blue from the outside and one third red, and you’re in one house but have no idea which one. What’s the probability that it’s blue? I think it’s 2⁄3, and I think this situation is precisely analogous to the reference class construction. Like, I actually think there is no relevant difference; you’re in one of the situations that fit the current situation (trivially so), and you can’t tell which one (by construction; if you could, that would be included in the definition of the reference class, which would make it different from the others). Again, this just seems to get at precisely the core of what a probability should do.
So I think that answers it? Like I said, I think you can define “probability” differently, but if the probability doesn’t align with reference class counting, then it seems to me that the point of the concept has been lost. (And if you do agree with that, the question is just whether or not reference class counting is applicable, which I haven’t really justified in my reply, but for Sleeping Beauty it seems straightforward.)
So with that said, to answer your question: why define probabilities in terms of this concept? Because I don’t think I want a definition of probability that doesn’t align with this view, when it’s applicable.
Suppose I want matrix multiplication to be commutative. Surely it would be so convenient if it were! I can define some operator * over matrices so that A*B = B*A. I can even call this operator “matrix multiplication”.
But did I just make matrix multiplication, as it’s conventionally defined, commutative? Of course not. I logically pinpointed a new function and gave it the same name as the previous function, but that didn’t change anything about how the previous function is logically pinpointed.
My new function may have some interesting applications and therefore deserve to be talked about in its own right. But calling it “matrix multiplication” is very misleading. And if I were to participate in a conversation about matrix multiplication while talking about my function, I’d be confusing everyone.
This is basically the situation that we have here.
Initially, the probability function is defined over iterations of a probability experiment. You define a different function over all of space and time, which you still call “probability”. It surely has properties that you like, but it’s a different function! Please use another name; this one is already taken. Or add a disclaimer. Preferably do both. You know how easy it is to confuse people with such things! Definitely do not start participating in conversations about probability while talking about your function.
If we can discretely count the number of instances across the history of the universe that fit the current situation, and we know some event happens in one third of those instances, then I think the probability has to be one third. This seems very self-evident to me; it seems exactly what the concept of probability is supposed to do.
I guess one analogy—suppose two thirds of all houses are painted blue from the outside and one third red, and you’re in one house but have no idea which one. What’s the probability that it’s blue?
As long as these instances are independent of each other—sure. Like with your houses analogy. When we are dealing with simple, central cases, there is no disagreement between probability and weighted probability, and so nothing to argue about.
But as soon as we are dealing with a more complicated scenario where there is no independence and it’s possible to be inside multiple houses in the same instance… Surely you see how the demand to have a coherent P(Red xor Blue) becomes unfeasible?
The problem is, our intuitions are too eager to assume that everything is independent. We are used to thinking in terms of physical time, using our memory as something that allows us to orient in it. This is why amnesia scenarios are so mind-boggling to us!
And that’s why the notion of a probability experiment, where every single trial is independent and the outcomes in any single trial are mutually exclusive, is so important. We strictly define what the “situation” means and therefore do not allow ourselves to be tricked. We can clearly see that individual awakenings can’t be treated as outcomes of the Sleeping Beauty experiment.
But when you are thinking in terms of “reference classes”, your definition of “situation” is too vague. And so you allow yourself to count the same house multiple times, and to treat yourself not as a person participating in the experiment but as an “awakening state of the person”, even though one awakening state necessarily follows the other.
if the probability doesn’t align with reference class counting, then it seems to me that the point of the concept has been lost.
The “point of probability” is lost when it doesn’t align with reasoning about instances of probability experiments. Namely, we start talking about something else, instead of what was logically pinpointed as probability in the first place. Most of the time, reasoning about reference classes does align with it, so you do not notice the difference. But once in a while it doesn’t, and so you end up with a “probability” that contradicts conservation of expected evidence, and with “utility” shifting back and forth.
So what’s the point of these reference classes? What’s so valuable in them? As far as I can see they do not bring anything to the table except extra confusion.
As long as these instances are independent of each other—sure. Like with your houses analogy. When we are dealing with simple, central cases, there is no disagreement between probability and weighted probability, and so nothing to argue about.
But as soon as we are dealing with a more complicated scenario where there is no independence and it’s possible to be inside multiple houses in the same instance
If you can demonstrate how, in the reference class setting, there is a relevant criterion by which several instances should be grouped together, then I think you could have an argument.
If you look at space-time from above, there’s two blue houses for every red house. Sorry, I meant: there’s two SB(=Sleeping Beauty)-tails instances for every SB-heads instance. The two instances you want to group together (tails-Monday & tails-Tuesday) aren’t actually at the same time (not that I think it matters). If the universe is very large or Many Worlds is true, then there are in fact many instances of Monday-heads, Monday-tails, and Tuesday-tails occurring at the same time, and I don’t think you want to group those together.
In any case, from the PoV of SB, all instances look identical to you. So by what criterion should we group some of them together? That’s the thing I think your position requires (since you accept that reference classes are a priori valid and then become invalid in some cases), and I don’t see the criterion.
Upon rereading your posts, I retract disagreement on “mutually exclusive outcomes”. Instead...
Initially, the probability function is defined over iterations of a probability experiment. You define a different function over all of space and time, which you still call “probability”. It surely has properties that you like, but it’s a different function! Please use another name; this one is already taken. Or add a disclaimer. Preferably do both. You know how easy it is to confuse people with such things! Definitely do not start participating in conversations about probability while talking about your function.
An obvious way to do so is to put a hazard sign on “probability” and just not use it, rather than putting resources into figuring out what “probability” SB should name, isn’t it? For instance, suppose Sleeping Beauty claims “my credence for Tails is 1⁄π”; any specific objection would be based on what you expected to hear.
(And now I realize a possible reason why you’re arguing to keep the “probability” term well-defined for such scenarios: so that people in ~anthropic settings can tell you their probability estimates and you, being an observer, could update on that information.)
As for why I believe probability theory to be useful in life despite the fact that sometimes different tools need to be used: I believe disappearing as a Boltzmann brain or simulated person is balanced out by appearing the same way, dissolving into different quantum branches is balanced out by branches reassembling, and likewise for most processes.