We agree about what the right actions are for the various reward structures. We can then try to work backwards from what the right action is to what probability Beauty should assign to the coin landing Heads after being wakened, in order that this probability will lead (by standard decision theory) to her taking the action we’ve decided is the correct one.
For your second scenario, Beauty really has to commit to what to do before the experiment, which means this scheme of working backwards from correct decision to probability of Heads after wakening doesn’t seem to work. Guessing either Heads or Tails is equally good, but only if done consistently. Deciding after each wakening without having thought about it beforehand doesn’t work well, since with the two possibilities being equally good, Beauty might choose differently on Monday and Tuesday, with bad results. Now, if the problem is tweaked with slightly different rewards for guessing Heads correctly than Tails correctly, we can avoid the situation of both guesses being equally good. But the coordination problem still seems to confuse the issue of how to work backwards to the appropriate probabilities (for me at least).
I think it ought to be the case that, regardless of the reward structure, if you work backwards from correct action to probabilities, you get that Beauty after wakening should give probability 1⁄3 to Heads. That seems to be what happens for all the reward structures where Beauty can decide what to do each day without having to know what she might do or have done the other day.
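To make the working-backwards step concrete, here is a minimal simulation sketch for the simplest per-awakening reward structure (a bet scored at every awakening); the Python framing and the choice of payoff structure are illustrative assumptions on my part, not part of the original problem statement:

```python
# Minimal Sleeping Beauty sketch: how often is the coin Heads at an awakening?
import random

def simulate(runs=100_000):
    heads_awakenings = 0
    tails_awakenings = 0
    for _ in range(runs):
        if random.random() < 0.5:   # Heads: Beauty is woken once (Monday only)
            heads_awakenings += 1
        else:                       # Tails: Beauty is woken twice (Monday and Tuesday)
            tails_awakenings += 2
    # Fraction of all awakenings at which the coin actually landed Heads
    return heads_awakenings / (heads_awakenings + tails_awakenings)

print(f"P(Heads) among awakenings: {simulate():.3f}")  # roughly 0.333
```

For any such per-awakening reward structure, it is 1⁄3 (not 1⁄2) that, fed into the usual expected-value calculation, reproduces the bets we already agreed are correct.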
We can then try to work backwards from what the right action is to what probability Beauty should assign to the coin landing Heads after being wakened, in order that this probability will lead (by standard decision theory) to her taking the action we’ve decided is the correct one.
Is there some reason why you’re committed to standard (by which you presumably mean, causal—or what?) decision theory, when approaching this question? After all:
For your second scenario, Beauty really has to commit to what to do before the experiment
As I understand it, UDT (or some similar decision theory) is the now-standard solution for such dilemmas.
I think it ought to be the case that, regardless of the reward structure, if you work backwards from correct action to probabilities, you get that Beauty after wakening should give probability 1⁄3 to Heads.
Why, though? More importantly, why does it matter?
It seems to me that all that Beauty needs to know, given that the scenario (one or two awakenings) is chosen by the flip of a fair coin, is that fair coins land Heads half the time, and Tails the other half. I really don’t see any reason why we should insist on there being some single, “objectively correct”, subjective probability assignment over the outcomes, that has to hold true for all formulations of this thought experiment, and/or all other Sleeping-Beauty-esque scenarios, etc.
In other words:
We agree about what the right actions are for the various reward structures.
I am struggling to see why there should be anything more to the matter than this. We all agree what the right actions are, and we are all equally capable of determining what those right actions are. It seems to me that we’re done.
A big reason why probability (and belief in general) is useful is that it separates our observations of the world from our decisions. Rather than somehow relating every observation to every decision we might sometime need to make, we instead relate observations to our beliefs, and then use our beliefs when deciding on actions. That’s the cognitive architecture that evolution has selected for (excepting some more ancient reflexes), and it seems like a good one.
I don’t really disagree, per se, with this general point, but it seems strange to insist on rejecting an answer we already have, and already know is right, in the service of this broad point. If you want to undertake the project of generalizing and formalizing the cognitive algorithms that led us to the right answer, well and good, but in no event should that get in the way of clarity w.r.t. the original question.
Again: we know the correct answer (i.e. the correct action for Beauty to take); and we know it differs depending on what reward structure is on offer. The question of whether there is, in some sense, a “right answer” even if there are no rewards at all, seems to me to be even potentially useful or interesting only in the case that said “right answer” does in fact generate all the practical correct answers that we already have. (And then we can ask whether it’s an improvement on whatever algorithm we had used to generate said right answers, etc.)
Well of course. If we know the right action from other reasoning, then the correct probabilities had better lead us to the same action. That was my point about working backwards from actions to see what the correct probabilities are. One of the nice features of probabilities in “normal” situations is that the probabilities do not depend on the reward structure. Instead we have a decision theory that takes the reward structure and probabilities as input and produces actions. It would be nice if the same property held in SB-type problems, and so far it seems to me that it does.
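As a sketch of that separation (again only an illustration, with made-up payoff numbers rather than anything from the original problem): fix the per-awakening credence at 1⁄3 once and for all, and let ordinary expected-utility maximization turn each reward structure into an action.

```python
# Fixed credence, varying per-awakening reward structures (illustrative numbers only)
P_HEADS = 1 / 3  # per-awakening credence, chosen independently of the rewards

def best_guess(payoff_heads_correct, payoff_tails_correct):
    """Pick the guess with the higher expected per-awakening payoff."""
    eu_heads = P_HEADS * payoff_heads_correct         # expected payoff of guessing Heads
    eu_tails = (1 - P_HEADS) * payoff_tails_correct   # expected payoff of guessing Tails
    return "Heads" if eu_heads > eu_tails else "Tails"

print(best_guess(1.0, 1.0))  # symmetric rewards -> Tails
print(best_guess(3.0, 1.0))  # Heads pays triple -> Heads
```

The probability never changes; only the reward structure fed into the decision theory does, and the recommended action shifts exactly when it should.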
I don’t think there has ever been much dispute about the right actions for Beauty to take in the SB problem (i.e., everyone agrees about the right bets for Beauty to make, for whatever payoff structure is defined). So if just getting the right answer for the actions was the goal, SB would never have been considered of much interest.