I think this post is fairly wrong-headed.

First, your math seems to be wrong. Your numerator is ½·p(y), which looks like Pr(H∣M)·Pr(X2∣H,M).
Your denominator is ½·p(y) + ½·p(y)·(2−q(y)), which looks like
Pr(H∣M)·Pr(X2∣H,M) + Pr(¬H∣M)·Pr(X2∣¬H,M), which is Pr(X2∣M).
By Bayes’ rule, Pr(H∣M)·Pr(X2∣H,M) / Pr(X2∣M) = Pr(H∣X2,M), which is not the same as the quantity you claimed to compute, Pr(H∣X2). Unless you have some other derivation, or a good reason why you omitted M from your calculations, this isn’t really “solving” anything.
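(For reference, here is the computation written out in one place; p(y) and q(y) are as defined in the post, and the identification of the numerator and denominator follows my reading above.)

```latex
\Pr(H \mid X_2, M)
  = \frac{\Pr(H \mid M)\,\Pr(X_2 \mid H, M)}
         {\Pr(H \mid M)\,\Pr(X_2 \mid H, M) + \Pr(\lnot H \mid M)\,\Pr(X_2 \mid \lnot H, M)}
  = \frac{\tfrac{1}{2}\,p(y)}
         {\tfrac{1}{2}\,p(y) + \tfrac{1}{2}\,p(y)\bigl(2 - q(y)\bigr)}
  = \frac{1}{3 - q(y)}
```

This equals ½ when q(y) = 1 and tends to 1/3 as q(y) → 0, which is relevant to my fourth point below.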
Second, the dismissal of betting arguments is strange. If decision theory is indeed downstream of probability theory, then probability acts as an input to decision theory. So if there is a particular probability p of Heads at a given moment, it is most rational to bet according to that probability. If your ideal decision theory must diverge from the probability estimates to arrive at the right answer on betting puzzles, then the question of probability is useless. If it takes the probability into account and still gets the wrong answer, then it is not truly rational.
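To make the standard betting argument concrete, here is a minimal simulation sketch. It assumes the usual per-awakening setup (Beauty stakes one unit on Heads at even odds at every awakening); none of the names or details below come from the post itself.

```python
import random

def sleeping_beauty_betting(trials=100_000, stake=1.0):
    """Simulate the standard per-awakening betting argument.

    Each awakening, Beauty stakes `stake` on Heads at even odds.
    Heads -> one awakening (Monday); Tails -> two awakenings
    (Monday and Tuesday), with the bet repeated at each one.
    """
    net, awakenings = 0.0, 0
    for _ in range(trials):
        heads = random.random() < 0.5
        n_awake = 1 if heads else 2
        awakenings += n_awake
        # She wins `stake` per awakening on Heads, loses it on Tails.
        net += n_awake * (stake if heads else -stake)
    return net / awakenings

# Averages about -stake/3 per awakening: even-odds bets on Heads lose,
# which is the usual argument that Beauty's per-awakening probability
# of Heads should be 1/3.
print(sleeping_beauty_betting())
```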
More generally, probability theory is supposed to completely capture the state of knowledge of an agent. If there is other knowledge that probability obscures, then that knowledge is important to capture in another system as well. Building a functional AI would then require a knowledge representation that is separate from, but interfaces with, the probability representation. The real question then becomes: what is that knowledge representation?
“Probability theory is logically prior to decision theory.” Yes, this is the common view, because probability theory was developed first and is easier, but it’s not actually obvious that this *has* to be the case. If there is a new mathematics that treats decisions as more fundamental than beliefs, it might be better for a real AI.
Third, the dismissal of “not H and it’s Tuesday” as not a proposition doesn’t make sense. Classical logic encodes arbitrary statements within AND- and OR-type constructions; there aren’t many restrictions on what those statements can be.
Fourth, the assumptions. I have generally read the problem as saying that whatever Beauty experiences on Monday is the same as what she experiences on Tuesday, i.e. q(y) = 1, at which point this argument reduces to the halfer position and the usual anti-1/2, pro-1/3 arguments apply. The paradox still stands for the moment when you wake up, i.e. when you have received no additional bits of input. The question of updating on actual input in the problem is an interesting one, but it hides the paradox of what your probability should be *at the moment of waking up*. You seem to simply declare it to be ½, by saying:
The prior for H is even odds: Pr(H∣M)=Pr(¬H∣M)=1/2.
This is generally indistinguishable from the ½ position you dismiss, which argues for that prior on the basis of “no new information.” You still don’t know how to handle the situation of being told that it’s Monday and needing to update your probability accordingly, versus conditioning on Monday and doing inference.
By Bayes’ rule, Pr(H∣M)·Pr(X2∣H,M) / Pr(X2∣M) = Pr(H∣X2,M), which is not the same as the quantity you claimed to compute, Pr(H∣X2).
That’s a typo. I meant to write Pr(H∣X2,M), not Pr(H∣X2).
Second, the dismissal of betting arguments is strange.
I’ll have more to say soon about what I think is the correct betting argument. Until then, see my comment in reply to Radford Neal about disagreement on how to apply betting arguments to this problem.
“Probability theory is logically prior to decision theory.” Yes, this is the common view, because probability theory was developed first and is easier, but it’s not actually obvious that this *has* to be the case.
I said logically prior, not chronologically prior. You cannot have decision theory without probability theory; the former is necessarily based on the latter. In contrast, probability theory requires no reference to decision theory for its justification and development. Have you read any of the literature on how probability theory is an extension of propositional logic to handle degrees of certainty, and indeed the uniquely determined one? If not, see my references. Neither Cox’s Theorem nor my own theorem relies on any form of decision theory.
Third, the dismissal of “not H and it’s Tuesday” as not a proposition doesn’t make sense. Classical logic encodes arbitrary statements within AND- and OR-type constructions; there aren’t many restrictions on what those statements can be.
I’ll repeat my response to Jeff Jo: The standard textbook definition of a proposition is a sentence that has a truth value of either true or false. The problem with a statement whose truth varies with time is that it does not have a simple true/false truth value; instead, its truth value is a function from time to the set {true,false}. In logical terms, such a statement is a predicate, not a proposition. For example, “Today is Monday” corresponds to the predicate P(t)≜(dayof(t)=Monday). It doesn’t become a proposition until you substitute in a specific value for t, e.g. “Unix timestamp 1527556491 is a Monday.”
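To make the distinction concrete, here is a small sketch in Python. The helper is_monday is hypothetical, and the timezone is an assumption on my part, since the weekday of a timestamp itself depends on location (US Pacific time makes the quoted timestamp come out to a Monday).

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def is_monday(t: int, tz: str = "America/Los_Angeles") -> bool:
    """The predicate P(t): does Unix timestamp t fall on a Monday in tz?"""
    return datetime.fromtimestamp(t, tz=ZoneInfo(tz)).weekday() == 0  # Monday == 0

# "Today is Monday" by itself is the function is_monday -- a predicate
# with no fixed truth value. Substituting a specific instant yields a
# proposition that is definitely true or definitely false:
print(is_monday(1527556491))  # True: Monday 2018-05-28, US Pacific time
```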
The paradox still stands for the moment when you wake up
You have not considered the possibility that the usual decision analysis applied to this problem is wrong. There is, in fact, disagreement as to what the correct decision analysis is. I will be writing more on this in a future post.
You seem to simply declare [Beauty’s probability at the moment of awakening] to be ½, by saying:
The prior for H is even odds: Pr(H∣M)=Pr(¬H∣M)=1/2.
This is generally indistinguishable from the ½ position you dismiss, which argues for that prior on the basis of “no new information.”
In fact, I explicitly said that at the instant of awakening, Beauty’s probability is the same as the prior, because at that point she does not yet have any new information. As she receives sensory input, her probability for Heads decreases asymptotically to 1/3. All of this is just standard probability theory, conditioning on the new information as it arrives. I dismissed the naive halfer position because it incorrectly assumes that Beauty’s sensory input is irrelevant to the determination of her probability for Heads.
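(In the notation of the post, as I read the model: with no sensory input yet received we have q(y) = 1 and the posterior equals the prior; as input accumulates, q(y) → 0 and the posterior falls toward 1/3.)

```latex
\Pr(H \mid X_2, M) \;=\; \frac{1}{3 - q(y)} :
\qquad q(y) = 1 \;\Rightarrow\; \tfrac{1}{2},
\qquad q(y) \to 0 \;\Rightarrow\; \tfrac{1}{3}
```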
You still don’t know how to handle the situation of being told that it’s Monday and needing to update your probability accordingly,
Uh, yes I do—it’s just standard probability theory again. Just do the math. If Beauty finds out that it is Monday, her new information since Sunday changes from (R(y,M) or R(y,T)) to just R(y,M), and since the problem definition assumes that
Pr(R(y,M)∣H,M)=Pr(R(y,M)∣not H,M)
we get equal posterior probabilities for H and not H, which is generally accepted to be the right answer.
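Written out as a sketch in the thread’s notation (note the overloading: the M inside R(y, M) denotes Monday, while the conditioning M denotes the model):

```latex
\Pr(H \mid R(y,M), M)
  = \frac{\Pr(H \mid M)\,\Pr(R(y,M) \mid H, M)}
         {\Pr(H \mid M)\,\Pr(R(y,M) \mid H, M)
          + \Pr(\lnot H \mid M)\,\Pr(R(y,M) \mid \lnot H, M)}
  = \frac{1}{2}
```

since the prior is even and the two likelihoods are equal by assumption, so everything cancels.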
You said: “The standard textbook definition of a proposition is a sentence that has a truth value of either true or false.”
This is correct. And when a well-defined truth value is not known to an observer, the standard textbook treatment is to assign the proposition a probability (or confidence): there is a probability P that it is “true” and a probability 1−P that it is “false.”

For example, if I flip a coin but keep it hidden from you, the statement “The coin shows Heads on the face-up side” fits your definition of a proposition. But since you do not know whether it is true or false, you can assign a 50% probability to the event that “it shows Heads” is true, and a 50% probability to the event that “it shows Heads” is false. This entire debate can be reduced to you confusing a truth value with the probability of that truth value.
On Monday Beauty is awakened. While awake she obtains no information that would help her infer the day of the week. Later in the day she is put to sleep again.
During this part of the experiment, the statement “today is Monday” has the truth value “true”, and does not have the truth value “false.” So by your definition, it is a valid proposition. But Beauty does not know that it is “true.”
On Tuesday the experimenters flip a fair coin. If it lands Tails, Beauty is administered a drug that erases her memory of the Monday awakening, and step 2 is repeated.
During this part of the experiment, the statement “today is Monday” has the truth value “false”, and does not have the truth value “true.” So by your definition, it is a valid proposition. But Beauty does not know that it is “false.”
In either case, the statement “today is Monday” is a valid proposition by the standard definition you use. What you refuse to acknowledge is that it is also a proposition that Beauty can treat as “true” or “false” with probabilities P and 1−P.
[Moderator Note:] I am reasonably confident that the current format of this discussion is not going to cause any participant to change their mind, and it seems quite stressful to the people participating in it, at least from the outside. While I haven’t been able to read the whole debate in detail, it appears you are repeating similar points over and over, in mostly the same language. I think it’s fine for you to continue to comment, but I want to make sure that people don’t feel an obligation to respond and get dragged into a debate that they don’t expect to get any value from.
If this is indeed a typo, please correct it in the top-level post and link to this comment. The broader point is that P(H∣X2,M) is the probability of Heads conditioned on Monday and X2, whereas P(H∣X2) is the probability of Heads conditioned on X2 alone. In the later paragraphs, you seem to use the second interpretation. In fact, it seems your whole post’s argument and “solution” rest on this typo.
Dismissing betting arguments is very reminiscent of dismissing one-boxing in Newcomb’s problem because one defines “CDT” as rational. The point of probability theory is to be helpful in constructing rational agents. If the agents your probability theory leads to are not winning bets with the information given to them by said theory, the theory is of questionable usefulness.
Just to clarify, I have read Probability Theory: The Logic of Science, and Bostrom’s and Armstrong’s papers on this. I have also read https://meaningness.com/probability-and-logic. The question of the relationship between probability and logic is not clear-cut. And as Armstrong has pointed out, decisions can be more easily determined than probabilities, which suggests that the ideal relationship between decision theory and probability theory is likewise not clear-cut; but that’s a broader philosophical point that needs a top-level post.

In the meantime, Fix Your Math!
No, P(H | X2, M) is Pr(H∣X2, M), not Pr(H∣X2, Monday); recall that M is the proposed model. If you thought it meant “today is Monday,” I question how closely you read the post you are criticizing.
I find it ironic that you write “Dismissing betting arguments is very reminiscent of dismissing one-boxing in Newcomb’s”: in an earlier version of this blog post I brought up Newcomb’s problem myself as an example of why I am skeptical of standard betting arguments (not sure why or how that got dropped). The point was that standard betting arguments can get the wrong answer in some problems involving unusual circumstances where a more comprehensive decision theory is required (perhaps FDT).
Re constructing rational agents: this is one use of probability theory; it is not “the point”. We can discuss logic from a purely analytical viewpoint without ever bringing decisions and agents into the discussion. Logic and epistemology are legitimate subjects of their own quite apart from decision theory. And probability theory is the unique extension of classical propositional logic to handle intermediate degrees of plausibility.
You say you have read PTLOS and others. Have you read Cox’s actual paper, or any of the detailed discussions of it, such as Paris’s treatment in The Uncertain Reasoner’s Companion or my own “Constructing a Logic of Plausible Inference: A Guide to Cox’s Theorem”? If you think that Cox’s Theorem has too many arguable technical requirements, then I invite you to read my paper, “From Propositional Logic to Plausible Reasoning: A Uniqueness Theorem” (preprint here). That proof assumes only that certain existing properties of classical propositional logic be retained when extending the logic to handle degrees of plausibility. It does not assume any particular functional decomposition of plausibilities, nor does it even assume that plausibilities must be real numbers. As with Cox, we end up with the result that the logic must be isomorphic to probability theory. In addition, the theorem gives the required numeric value for a probability Pr(A∣X) when X contains, in propositional form, all of the background information we are using to assess the probability of A. How much more “clear cut” do you demand the relationship between logic and probability be?
Regardless, for my argument about indexicals all that is necessary is that probability theory deals with classical propositions.
I responded to David Chapman’s essay (https://meaningness.com/probability-and-logic) a couple of years ago here.