Ape in the coat comments on Anthropical Motte and Bailey in two versions of Sleeping Beauty

Ape in the coat 6 Aug 2023 8:16 UTC
3 points
0
And since this happens absolutely every time she wakes, Beauty should always assess the probability of Heads as ¹⁄₃.
Nope. Doesn’t work this way. There is an important difference between a probability of a specific low probable event happening and a probability of any low probable event from a huge class of events happening. Unsurprisingly, the latter is much more probable than the former, and the trick works only with low probable events. I’ve said this explicitly, and you could’ve checked it yourself, since I provided you the code for it.
It’s easy to see why something that happens absolutely every time she wakes doesn’t help at all. You see, 50% of the coin tosses are Heads. If Beauty correctly guessed Tails in ²⁄₃ of all experiments, that would be a contradiction. But it’s possible for Beauty to correctly guess Tails in ²⁄₃ of some subset of experiments. To get the ²⁄₃ score she needs some kind of evidence that occurs more often when the coin is Tails than when it is Heads, rather than all the time, and she must guess only when she gets this evidence.
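A minimal simulation sketch of this point, under assumptions that are not in the original comment: a rare event (say, the fly sitting on a precommitted spot) occurs independently at each awakening with probability `p_event`, and Beauty guesses Tails only in experiments where she sees it at least once. The names `simulate` and `exact_tails_share_given_event` are hypothetical.

```python
import random


def simulate(n_experiments, p_event, seed=0):
    """Return (share of Tails among all experiments,
    share of Tails among experiments where the rare event was seen)."""
    rng = random.Random(seed)
    tails_all = 0
    guessed = 0
    tails_guessed = 0
    for _ in range(n_experiments):
        tails = rng.random() < 0.5
        awakenings = 2 if tails else 1  # Tails: Monday and Tuesday
        # Assumption: the event is independent across awakenings.
        saw_event = any(rng.random() < p_event for _ in range(awakenings))
        tails_all += tails
        if saw_event:
            guessed += 1
            tails_guessed += tails
    return tails_all / n_experiments, tails_guessed / guessed


def exact_tails_share_given_event(p_event):
    """Exact P(Tails | event seen at least once) under the same assumptions."""
    p_seen_heads = p_event
    p_seen_tails = 1 - (1 - p_event) ** 2
    return p_seen_tails / (p_seen_heads + p_seen_tails)


if __name__ == "__main__":
    share_all, share_guessed = simulate(200_000, 0.05, seed=1)
    print("Tails share over all experiments:", share_all)
    print("Tails share when the event was seen:", share_guessed)
    print("Exact value for a certain event:", exact_tails_share_given_event(1.0))
```

Under these assumptions the conditional share approaches ²⁄₃ only as `p_event` becomes small, and it falls back to ¹⁄₂ for an event that happens at every awakening.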
There’s always a fly crawling on the wall in a random direction, unlikely to be the same on Monday and Tuesday, or a stray thought about aardvarks, or a dimming of the light from the window as a cloud passes overhead, or any of millions of other things entering her consciousness in ways that won’t be the same Monday and Tuesday.
This is irrelevant unless Beauty somehow knows where the fly is supposed to be on Monday and where on Tuesday. She can try to guess Tails when the fly is in a specific place that she precommitted to, hoping that the causal process that defines the fly’s position comes close enough to placing the fly there with the same low probability every day, but this is not guaranteed to work.
If you’ve read the posts you link to, you must realize that this is central to my argument for why Heads has probability ¹⁄₃.
I’ve linked two posts. You need to read the second one as well, to understand the mistake in the reasoning of the first.
But this is not how probability theory works. A rare event is a rare event. You’ve just decided to define another event, “random number generator produces number I’d guessed beforehand”, and noted that that event didn’t occur. This doesn’t change the fact that the random number generator produced a number that is unlikely to be the same as that produced on another day.
This is exactly how probability theory works. The event “random number generator produced any number” has very high probability. The event “random number generator produced this specific number” has low probability. Which event we are talking about depends on whether the number was specified or not. It can be confusing if you forget that probabilities are in the mind, that this is about Beauty’s decision-making process and not the metaphysical essence of randomness.
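The two events can be compared exactly. The sketch below assumes a uniform generator over 0 to 99 and a hypothetical precommitted number `GUESS`; neither is fixed by the comment above, and the independence of draws across awakenings is also an assumption.

```python
from fractions import Fraction

OUTCOMES = range(100)  # assumed: uniform outputs 0..99
GUESS = 42             # hypothetical number specified beforehand


def probability(event):
    """Probability of an event given as a predicate over one RNG output."""
    favourable = sum(1 for x in OUTCOMES if event(x))
    return Fraction(favourable, len(OUTCOMES))


def p_at_least_once(p, awakenings):
    """Chance that an event with per-awakening probability p is seen at least once."""
    return 1 - (1 - p) ** awakenings


p_any = probability(lambda x: True)             # "produced any number"
p_specific = probability(lambda x: x == GUESS)  # "produced this specific number"

# Likelihoods under Heads (one awakening) and Tails (two awakenings).
p_any_heads, p_any_tails = p_at_least_once(p_any, 1), p_at_least_once(p_any, 2)
p_seen_given_heads = p_at_least_once(p_specific, 1)
p_seen_given_tails = p_at_least_once(p_specific, 2)

if __name__ == "__main__":
    print("any number:      ", p_any_heads, "vs", p_any_tails)
    print("specified number:", p_seen_given_heads, "vs", p_seen_given_tails)
```

The unspecified event has the same likelihood under Heads and Tails, so it cannot shift the odds; only the specified one differs between the two.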
You want to find the answer to a more fantastical problem in which Beauty can be magically duplicated exactly and kept in a room totally isolated from the external world, and hence guaranteed to have no random experiences
The fact that Beauty is unable to tell which day it is or whether she has been awakened before is an important condition of the experiment.
But this doesn’t have to mean that her experiences on Monday and Tuesday, or on Heads and Tails, are necessarily exactly the same; she just must not be able to tell which is which. Beauty can be placed in differently colored rooms on Monday&Tails, Monday&Heads and Tuesday&Tails. All the furniture can be completely different as well. There can even be an enciphered message describing the result of the coin toss. Unless she knows how to break the cipher, or knows the pattern in the colors or furniture, this doesn’t help her. The mathematical model is still the same.
In a repeated experiment Beauty can try executing a strategy, but this requires precommitment to that strategy. Without this precommitment she will not be able to get useful information from all the differences in the outcomes.
But it’s going to be hard to answer it when we haven’t yet answered questions such as whether a computer program can be conscious, and if so, whether consciousness requires actually running the program, or whether it’s enough to set up the (deterministic) program on the computer, so that it could be run, even though we don’t actually push the Start button.
How is the consciousness angle relevant here? Are you under the impression that probability theory works differently depending on whether we are reasoning about conscious or unconscious objects?
- Radford Neal 6 Aug 2023 17:05 UTC
  1 point
  −6
  Nope. Doesn’t work this way. There is an important difference between a probability of a specific low probable event happening and a probability of any low probable event from a huge class of events happening.
  In Bayesian probability theory, it certainly does work this way. To find the posterior probability of Heads, given what you have observed, you combine the prior probability with the likelihood for Heads vs. Tails based on everything that you have observed. You don’t say, “but this observation is one of a large class of observations that I’ve decided to group together, so I’ll only update based on the probability that any observation in the group would occur (which is one for both Heads and Tails in this situation)”.
  You’re arguing in a frequentist fashion. A similar sort of issue for a frequentist would arise if you flipped a coin 9 times and found that 2 of the flips were Heads. If you then ask the frequentist what the p-value is for testing the hypothesis that the coin was fair, they’ll be unable to answer until you tell them whether you pre-committed to flipping the coin 9 times, or to flipping it until 2 Heads occurred (they’ll be completely lost if you tell them you just flipped until your finger got tired). Bayesians think this is ridiculous.
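The stopping-rule dependence can be worked through numerically. This is a sketch that assumes one-sided p-values against "too few Heads" for a fair coin; the choice of tail and the function names are illustrative, not something the comment specifies.

```python
from math import comb


def p_value_fixed_flips(n_flips=9, heads=2):
    """Design A: the number of flips was fixed in advance.
    P(at most `heads` Heads in `n_flips` fair flips)."""
    return sum(comb(n_flips, k) for k in range(heads + 1)) / 2 ** n_flips


def p_value_fixed_heads(n_flips=9, heads=2):
    """Design B: flip until `heads` Heads have occurred.
    P(needing at least `n_flips` flips), i.e. at most heads - 1 Heads
    in the first n_flips - 1 flips."""
    return sum(comb(n_flips - 1, k) for k in range(heads)) / 2 ** (n_flips - 1)


if __name__ == "__main__":
    print("fixed number of flips:", p_value_fixed_flips())
    print("fixed number of Heads:", p_value_fixed_heads())
```

The data are identical in both designs (9 flips, 2 Heads), yet the two p-values are 46/512 and 9/256, while the likelihood for the coin's bias is proportional to the same function in both cases.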
  Of course, there are plenty of frequentists in the world, but I presume they are uninterested in the Sleeping Beauty problem, since to a frequentist, Beauty’s probability for Heads is a meaningless concept, since they don’t think probability can be used to represent degrees of belief.
  How is the consciousness angle relevant here? Are you under the impression that probability theory works differently depending on whether we are reasoning about conscious or unconscious objects?
  I think if Beauty isn’t a conscious being, it doesn’t make much sense to talk about how she should reason regarding philosophical arguments about probability.
  I suspect we’re at a bit of an impasse with this line of discussion. I’ll just mention that probability is supposed to be useful. And if you extend the problem to allow Beauty to make bets, in various scenarios, the bets that make Beauty the most money are the ones she will make by assessing the probability of Heads to be ¹⁄₃ and then applying standard decision theory. Halfers are losers.
  - Ape in the coat 7 Aug 2023 8:55 UTC
    1 point
    0
    You are making a fascinating mistake, and I may make a separate post about it, even though it’s not particularly related to anthropics and is just a curious detail about probability theory, one which, in retrospect, I realize I was confused about myself. I’d recommend that you meditate on it for a while. You already have all the information required to figure it out. You just need to switch from “argument mode” to “investigation mode”.
    Here are a couple more hints that you may find useful.
    1) Suppose you observed number 71 on a random number generator that produces numbers from 0 to 99.
    Is it
    a 1 in 100 occurrence because the number is exactly 71?
    a 1 in 50 occurrence because the number consists of these two digits: 7 and 1?
    a 1 in 10 occurrence because the first digit is 7?
    a 1 in 2 occurrence because the number is greater than or equal to 50?
    a 1 in n occurrence because it’s possible to come up with some other arbitrary rule?
    What determines which case is actually true?
    2) Suppose you observed a list of numbers with length n, produced by this random number generator. The probability that exactly this series is produced is $1 / 100^{n}$.
    At what n are you completely shocked and in total disbelief about your reality? After all, you’ve just observed an event that your model of reality claims to be extremely improbable.
    Would you be more shocked if all the numbers in this list were the same? If so, why?
    Can you now produce arbitrarily improbable events just by having a random number generator? In what sense do these events have probability $1 / 100^{n}$ if you can witness as many of them as you want at any time?
    You do not need to tell me the answers. It’s just something I believe will be helpful for you to honestly think about.
    To find the posterior probability of Heads, given what you have observed, you combine the prior probability with the likelihood for Heads vs. Tails based on everything that you have observed.
    Here is the last hint. I have a feeling that it spoils the solution outright, so it’s in rot13:
    Gur bofreingvbaf “Enaqbz ahzore trarengbe cebqhprq n ahzore” naq “Enaqbz ahzore trarengbe cebqhprq gur rknpg ahzore V’ir thrffrq” ner qvssrerag bofreingvbaf. Lbh pna bofreir gur ynggre bayl vs lbh’ir thrffrq n ahzore orsberunaq. Lbh znl guvax nobhg nf nal bgure novyvgl gb rkgenpg vasbezngvba sebz lbhe raivebazrag.
    Fhccbfr jura gur pbva vf Gnvyf gur ebbz unf terra jnyyf naq jura vg’f Urnqf gur ebbz unf oyhr jnyyf. N crefba jub xabjf nobhg guvf naq vfa’g pbybe oyvaq pna thrff gur erfhyg bs n pbva gbff cresrpgyl. N pbybe oyvaq crefba jub xabjf nobhg guvf ehyr—pna’g. Rira vs gurl xabj gung gur ebbz unf fbzr pbybe, gurl ner hanoyr gb rkrphgr gur fgengrtl “thrff Gnvyf rirel gvzr gur ebbz vf terra”.
    N crefba jub qvqa’g thrff n ahzore orsberunaq qbrfa’g cbffrff gur novyvgl gb bofreir rirag “Enaqbz ahzore trarengbe cebqhprq gur rknpg ahzore V’ir thrffrq” whfg nf n pbybe oyvaq crefba qbrfa’g unir na novyvgl gb bofreir na rirag “Gur ebbz vf terra”.
    I think if Beauty isn’t a conscious being, it doesn’t make much sense to talk about how she should reason regarding philosophical arguments about probability.
    Beauty doesn’t need to experience qualia or be self-aware to have a meaningful probability estimate.
    I’ll just mention that probability is supposed to be useful. And if you extend the problem to allow Beauty to make bets, in various scenarios, the bets that make Beauty the most money are the ones she will make by assessing the probability of Heads to be ¹⁄₃ and then applying standard decision theory.
    Betting arguments are not particularly helpful. They describe the motte, a specific scoring rule, and not the actual ability to guess the outcome of the coin toss in the experiment. As I’ve written in the post itself:
    As long as we do not claim that this fact gives an ability to predict the result of the coin toss better than chance, then we are just using different definitions, while agreeing on everything. We can translate from Thirder language to mine and back without any problem. Whatever betting schema is proposed, all other things being equal, we will agree to the same bets.
    That is, if betting happens every day, Halfers and Double Halfers need to weight the odds by the number of bets, while Thirders already include this weighting in their definition of “probability”. On the other hand, if only one bet per experiment counts, suddenly it’s Thirders who need to discount this weighting from their “probability”, and Halfers and Double Halfers who are fine by default.
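The two betting schemes can be compared with exact arithmetic. The sketch assumes a simple bet on Tails that pays `win` when the coin is Tails and costs `loss` when it is Heads, settled once per counted awakening; this payoff structure and the function names are illustrative assumptions.

```python
from fractions import Fraction


def expected_profit(win, loss, bets_if_tails):
    """Expected profit per experiment from betting on Tails.
    Heads always involves exactly one bet; Tails involves `bets_if_tails`."""
    half = Fraction(1, 2)
    return half * (-Fraction(loss)) + half * Fraction(win) * bets_if_tails


def implied_tails_probability(win, loss):
    """The 'probability' of Tails at which a single such bet is fair."""
    return Fraction(loss) / (Fraction(win) + Fraction(loss))


# Bets settled at every awakening: break-even when win = loss / 2.
per_awakening = expected_profit(Fraction(1, 2), 1, bets_if_tails=2)
# Only one bet per experiment counts: break-even when win = loss.
per_experiment = expected_profit(1, 1, bets_if_tails=1)

if __name__ == "__main__":
    print(per_awakening, implied_tails_probability(Fraction(1, 2), 1))
    print(per_experiment, implied_tails_probability(1, 1))
```

Both schemes break even at the stated payoffs; reading those payoffs as single-bet odds gives ²⁄₃ for Tails in the per-awakening scheme and ¹⁄₂ in the per-experiment scheme, which is the weighting difference described above.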
    - Radford Neal 7 Aug 2023 23:57 UTC
      1 point
      0
      There are rules for how to do arithmetic. If you want to get the right answer, you have to follow them. So, when adding 18 and 17, you can’t just decide that you don’t like to carry 1s today, and hence compute that 18+17=25.
      Similarly, there are rules for how to do Bayesian probability calculations. If you want to get the right answer, you have to follow them. One of the rules is that the posterior probability of something is found by conditioning on all the data you have. If you do a clinical trial with 1000 subjects, you can’t just decide that you’d like to compute the posterior probability that the treatment works by conditioning on the data for just the first 700.
      If you’ve seen the output of a random number generator, and are using this to compute a posterior probability, you condition on the actual number observed, say 71. You do not condition on any of the other events you mention, because they are less informative than the actual number—conditioning on them would amount to ignoring part of the data. (In some circumstances, the result of conditioning on all the data may be the same as the result of conditioning on some function of the data—when that function is a “sufficient statistic”, but it’s always correct to condition on all the data.)
      This is absolutely standard Bayesian procedure. There is nothing in the least bit controversial about it. (That is, it is definitely how Bayesian inference works—there are of course some people who don’t accept that Bayesian inference is the right thing to do.)
      Similarly, there are certain rules for how to apply decision theory to choose an action to maximize your expected utility, based on probability judgements that you’ve made.
      If you compute probabilities incorrectly, and then incorrectly apply decision theory to choose an action based on these incorrect probabilities, it is possible that your two errors will cancel out. That is actually rather likely if you have other ways of telling what the right answer is, and hence have the opportunity to make ad hoc (incorrect) alterations to how you apply decision theory in order to get the right decision with the wrong probabilities.
      If you’d like to outline some specific betting scenario for Sleeping Beauty, I’ll show you how applying decision theory correctly produces the right action only if Beauty judges the probability of Heads to be ¹⁄₃.
  - Martin Randall 7 Aug 2023 2:52 UTC
    1 point
    0
    
    Of course, there are plenty of frequentists in the world, but I presume they are uninterested in the Sleeping Beauty problem, since to a frequentist, Beauty’s probability for Heads is a meaningless concept, since they don’t think probability can be used to represent degrees of belief.
    
    Tangent: I ran across an apparently Frequentist analysis of Sleeping Beauty here: Sleeping Beauty: Exploring a Neglected Solution, Luna
    
    To make the concept meaningful under Frequentism, Luna has Beauty perform an experiment of asking the higher level experimenters which awakening she is in (H1, T1, or T2). If she undergoes both sets of experiments many times, the frequency of the experimenters responding H1 will tend to ¹⁄₃, and so the Frequentist probability is similarly ¹⁄₃.
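A small simulation sketch of the frequency described here, assuming a fair coin and that every awakening is tallied with its label; the function name and the tallying scheme are assumptions, not Luna's own procedure.

```python
import random
from collections import Counter


def awakening_frequencies(n_experiments, seed=0):
    """Relative frequency of each awakening label over repeated experiments."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_experiments):
        if rng.random() < 0.5:
            counts["H1"] += 1   # Heads: a single awakening
        else:
            counts["T1"] += 1   # Tails: Monday awakening
            counts["T2"] += 1   # Tails: Tuesday awakening
    total = sum(counts.values())
    return {label: counts[label] / total for label in ("H1", "T1", "T2")}


if __name__ == "__main__":
    print(awakening_frequencies(100_000, seed=1))
```

The tally is taken over awakenings rather than over experiments, which is why the H1 share tends to ¹⁄₃ even though half of the simulated coins land Heads.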
    
    I say “apparently Frequentist” because Luna doesn’t use the term and I’m not sure of the exact terminology when Luna reasons about the frequency of hypothetical experiments that Beauty has not actually performed.