Neal wants us to condition on all information, including the apparently random experiences that Sleeping Beauty will undergo before she answers the interview question. This information seems irrelevant, but Neal argues that if it were irrelevant, it wouldn’t affect the calculation. If, contrary to expectations, it actually does, then Neal would suggest that we were wrong about its irrelevance.
This isn’t just Neal’s position. Jaynes argues the same in Probability Theory: The Logic of Science. I have never once encountered an academic book or paper that argued otherwise. The technical term for conditioning on less than all the information is “cherry-picking the evidence” :-).
But within what context? You can’t just take a formula or rule and apply it without understanding the assumptions it is reliant upon.
The context is *all* applications of probability theory. Look, when I tell you that A or not A is a rule of classical propositional logic, we don’t argue about the context or what assumptions we are relying on. That’s just a universal rule of classical logic. Ditto with conditioning on all the information you have. That’s just one of the rules of epistemic probability theory that *always* applies. The only time you are allowed to NOT condition on some piece of known information is if you would get the same answer whether or not you conditioned on it. When we leave known information Y out and say it is “irrelevant”, what that means is that Pr(A | Y and X) = Pr(A | X), where X is the rest of the information we’re using. If I can show that these probabilities are NOT the same, then I have proven that Y is, in fact, relevant.
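To make the irrelevance test concrete, here is a minimal sketch on a toy example. It is an illustration only, not something from Neal or Jaynes: the background information X is assumed to be "a fair six-sided die was rolled", A is "the roll is even", and the helper names `pr` and `is_irrelevant` are hypothetical. It checks whether Pr(A | Y and X) = Pr(A | X) for two candidate pieces of information Y.

```python
from fractions import Fraction

# Background information X (assumed for illustration): a fair six-sided die.
OUTCOMES = range(1, 7)

def pr(event, given=lambda o: True):
    """Pr(event | given and X) under a uniform distribution on OUTCOMES."""
    conditioned = [o for o in OUTCOMES if given(o)]
    favourable = [o for o in conditioned if event(o)]
    return Fraction(len(favourable), len(conditioned))

def is_irrelevant(a, y):
    """Y is irrelevant to A exactly when Pr(A | Y and X) == Pr(A | X)."""
    return pr(a, given=y) == pr(a)

is_even = lambda o: o % 2 == 0     # A
at_most_4 = lambda o: o <= 4       # a Y that turns out to be irrelevant
at_most_3 = lambda o: o <= 3       # a Y that turns out to be relevant

print(pr(is_even), pr(is_even, given=at_most_4), is_irrelevant(is_even, at_most_4))
# prints: 1/2 1/2 True
print(pr(is_even), pr(is_even, given=at_most_3), is_irrelevant(is_even, at_most_3))
# prints: 1/2 1/3 False
```

In the first case, leaving Y out changes nothing, so dropping it is harmless. In the second, the two probabilities differ, which is exactly what it means for Y to be relevant.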
“Look, when I tell you that A or not A is a rule of classical propositional logic, we don’t argue about the context or what assumptions we are relying on”: actually, you get statements like “This sentence is false”, which fall outside classical propositional logic. This is why it is important to understand the limits that apply.