But as far as I know there’s nothing in Cox’s theorem or the axioms of probability theory or anything like those that says I had to use that particular prior
The way I interpret hypotheticals in which one person is said to be able to do something other than what they will do, such as “depending on how those techniques are applied,” is that all of the person’s priors are to be held constant in the hypothetical.
I could just as easily have used a different...likelihood model, and gotten a totally different posterior that’s nonetheless legitimate.
This is the most charitable interpretation of the OP because the claim is that, under Bayesian reasoning, results do not depend on how the same data is applied. This seems obviously wrong if the OP is interpreted as discussing results reached after decision processes with identical data but differing priors, so it’s more interesting to talk about agents with other things differing, such as perhaps likelihood-generating models, than it is to talk about agents with different priors.
But even if we assume the OP means that data and priors are held constant but not likelihoods, it still seems to me obviously wrong. Moreover, likelihoods are just as fundamental to an application of Bayes’s theorem as priors, so I’m not sure why I would have, or ought to have, read the OP as implicitly assuming priors were held constant but not likelihoods (or likelihood-generating models).
Can you give an example?
I didn’t have one, but here’s a quick & dirty ESP example I just made up. Suppose that out of the blue, I get a gut feeling that my friend Joe is about to phone me, and a few minutes later Joe does. After we finish talking and I hang up, I realize I can use what just happened as evidence to update my prior probability for my having ESP. I write down:
my evidence: “I correctly predicted Joe would call” (call this E for short)
the hypothesis H0 — that I don’t have ESP — and its prior probability, 95%
the opposing hypothesis H1 — that I have ESP — and its prior probability, 5%
Now let’s think about two hypothetical mes.
The first me guesses at some likelihoods, deciding that P(E | H0) and P(E | H1) were both 10%. Turning the crank, it gets a posterior for H1, P(H1 | E), that’s proportional to P(H1) P(E | H1) = 5% × 10% = 0.5%, and a posterior for H0, P(H0 | E), that’s proportional to P(H0) P(E | H0) = 95% × 10% = 9.5%. Of course its posteriors have to add to 100%, not 10%, so it multiplies both by 10 to normalize them. Unsurprisingly, as the likelihoods were equal, its posteriors come out at 95% for H0 and 5% for H1; the priors are unchanged.
When the second me is about to guess at some likelihoods, its brain is suddenly zapped by a stray gamma ray. The second me therefore decides that P(E | H0) was 2% but that P(E | H1) was 50%. Applying Bayes’s theorem in precisely the same way as the first me, it gets a P(H1 | E) proportional to 5% × 50% = 2.5%, and a P(H0 | E) proportional to 95% × 2% = 1.9%. Normalizing (but this time multiplying by 100/(2.5+1.9)) gives posteriors of P(H0 | E) = 43.2% and P(H1 | E) = 56.8%.
So the first me still strongly doubts it has ESP after updating on the evidence, but the second me ends up believing ESP is the more likely hypothesis. Yet both used the same method of inference, the same piece of evidence, and the same priors!
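(If it helps, here’s the same arithmetic as a few lines of Python; the `posterior` helper is just my shorthand for “multiply each prior by its likelihood, then rescale so the results sum to 1”.)

```python
def posterior(priors, likelihoods):
    """Multiply each prior by its likelihood, then normalize so the posteriors sum to 1."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

priors = [0.95, 0.05]  # P(H0) = no ESP, P(H1) = ESP

# First me: equal likelihoods, so the posteriors come out equal to the priors.
print(posterior(priors, [0.10, 0.10]))  # ≈ [0.95, 0.05]

# Second me: the gamma-ray likelihoods.
print(posterior(priors, [0.02, 0.50]))  # ≈ [0.432, 0.568], i.e. roughly 43.2% and 56.8%
```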