Said Achmiz comments on Probability is a model, frequency is an observation: Why both halfers and thirders are correct in the Sleeping Beauty problem.

Said Achmiz 12 Jul 2018 23:19 UTC
2 points

We can then try to work backwards from what the right action is to what probability Beauty should assign to the coin landing Heads after being wakened, in order that this probability will lead (by standard decision theory) to her taking the action we’ve decided is the correct one.

Is there some reason why you’re committed to standard (by which you presumably mean, causal—or what?) decision theory, when approaching this question? After all:

For your second scenario, Beauty really has to commit to what to do before the experiment

As I understand it, UDT (or some similar decision theory) is the now-standard solution for such dilemmas.

I think it ought to be the case that, regardless of the reward structure, if you work backwards from correct action to probabilities, you get that Beauty after wakening should give probability 1⁄3 to Heads.

Why, though? More importantly, why does it matter?

It seems to me that all that Beauty needs to know, given that the scenario (one or two awakenings) is chosen by the flip of a fair coin, is that fair coins land heads half the time, and tails the other half. I really don’t see any reason why we should insist on there being some single, “objectively correct”, subjective probability assignment over the outcomes, that has to hold true for all formulations of this thought experiment, and/or all other Sleeping-Beauty-esque scenarios, etc.

In other words:

We agree about what the right actions are for the various reward structures.

I am struggling to see why there should be anything more to the matter than this. We all agree what the right actions are and we are all equally quite capable of determining what those right actions are. It seems to me that we’re done.
- Radford Neal 12 Jul 2018 23:43 UTC
  1 point
  Parent
  A big reason why probability (and belief in general) is useful is that it separates our observations of the world from our decisions. Rather than somehow relating every observation to every decision we might sometime need to make, we instead relate observations to our beliefs, and then use our beliefs when deciding on actions. That’s the cognitive architecture that evolution has selected for (excepting some more ancient reflexes), and it seems like a good one.
  - Said Achmiz 12 Jul 2018 23:50 UTC
    3 points
    Parent
    I don’t really disagree, per se, with this general point, but it seems strange to insist on rejecting an answer we already have, and already know is right, in the service of this broad point. If you want to undertake the project of generalizing and formalizing the cognitive algorithms that led us to the right answer, fine and well, but in no event should that get in the way of clarity w.r.t. the original question.
    
    Again: we know the correct answer (i.e. the correct action for Beauty to take); and we know it differs depending on what reward structure is on offer. The question of whether there is, in some sense, a “right answer” even if there are no rewards at all, seems to me to be even potentially useful or interesting only in the case that said “right answer” does in fact generate all the practical correct answers that we already have. (And then we can ask whether it’s an improvement on whatever algorithm we had used to generate said right answers, etc.)
    - Radford Neal 13 Jul 2018 0:16 UTC
      1 point
      Parent
      Well of course. If we know the right action from other reasoning, then the correct probabilities had better lead us to the same action. That was my point about working backwards from actions to see what the correct probabilities are. One of the nice features of probabilities in “normal” situations is that the probabilities do not depend on the reward structure. Instead we have a decision theory that takes the reward structure and probabilities as input and produces actions. It would be nice if the same nice property held in SB-type problems, and so far it seems to me that it does.
      I don’t think there has ever been much dispute about the right actions for Beauty to take in the SB problem (i.e., everyone agrees about the right bets for Beauty to make, for whatever payoff structure is defined). So if just getting the right answer for the actions was the goal, SB would never have been considered of much interest.
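
To make the "working backwards from actions to probabilities" step in the thread concrete, here is a minimal sketch. It is not taken from either commenter. It assumes a hypothetical bet that pays `win` if the coin landed Tails and costs `lose` if it landed Heads, and compares two reward structures: the bet is settled at every awakening, or only once per experiment. The function names and the bet itself are illustrative assumptions. The "backed-out" probability is the value of P(Heads) at which a naive single-wager expected-value calculation would be exactly indifferent at the break-even stakes; whether that naive calculation is the right decision theory to use is the very point under dispute above.

```python
from fractions import Fraction


def expected_payoff(win, lose, per_awakening):
    """Expected payoff per experiment if Beauty always accepts the bet.

    Heads (probability 1/2): one awakening, the bet loses `lose` once.
    Tails (probability 1/2): two awakenings; the bet wins `win` at each
    awakening if settled per awakening, otherwise only once.
    """
    half = Fraction(1, 2)
    tails_settlements = 2 if per_awakening else 1
    return half * (-lose) + half * (tails_settlements * win)


def backed_out_heads_probability(per_awakening):
    """P(Heads) implied by the break-even stakes under a naive
    single-wager calculation: p * (-lose) + (1 - p) * win = 0.
    """
    win = Fraction(1)
    # Stake at which always accepting has zero expected payoff.
    lose = win * (2 if per_awakening else 1)
    assert expected_payoff(win, lose, per_awakening) == 0
    return win / (win + lose)


if __name__ == "__main__":
    print("per awakening: ", backed_out_heads_probability(True))
    print("per experiment:", backed_out_heads_probability(False))
```

Under these assumptions the break-even stakes differ between the two reward structures, which is the uncontroversial part: everyone agrees on which bets to accept. The disagreement in the thread is over whether that difference should be attributed to the probability or to how the decision theory counts repeated settlements.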