From Probability is in the Mind:
You have a coin.
The coin is biased.
You don’t know which way it’s biased or how much it’s biased. Someone just told you, “The coin is biased” and that’s all they said.
This is all the information you have, and the only information you have.
You draw the coin forth, flip it, and slap it down.
Now—before you remove your hand and look at the result—are you willing to say that you assign a 0.5 probability to the coin having come up heads?
The frequentist says, “No. Saying ‘probability 0.5’ means that the coin has an inherent propensity to come up heads as often as tails, so that if we flipped the coin infinitely many times, the ratio of heads to tails would approach 1:1. But we know that the coin is biased, so it can have any probability of coming up heads except 0.5.”
The frequentists get this exactly wrong, ruling out the only correct answer given their knowledge of the situation.
The article goes on to describe scenarios in which having different partial knowledge of the situation leads to different probabilities. The frequentist perspective doesn’t merely lead to the wrong answer for these scenarios, it fails to even produce a coherent analysis. The reason is that there is no single probability attached to the event itself. The probability really is a property of the mind analyzing that event, to the extent that it is sensitive to the partial knowledge of that mind.
I like the response of Constant2:
The competent frequentist would presumably not be befuddled by these supposed paradoxes. Since he would not be befuddled (or so I am fairly certain), the “paradoxes” fail to prove the superiority of the Bayesian approach.
Eliezer responded with:
Not the last two paradoxes, no. But the first case given, the biased coin whose bias is not known, is indeed a classic example of the difference between Bayesians and frequentists.
and in the post he wrote
The frequentist perspective doesn’t merely lead to the wrong answer for these scenarios, it fails to even produce a coherent analysis.
But the frequentist does have a coherent analysis for solving this problem, because we’re not actually interested in the long-term probability of flipping heads (of which all anyone can say is that it is not 0.5) but in the expected outcome of a single flip of a biased coin. This is an expected value calculation, and I’ll even apply your idea about events with symmetric alternatives. (So I do not have to make any assumptions about the shape of the distribution of possible biases.)
I will calculate my expected value assuming that the coin is biased towards heads or towards tails with equal probability. Let p be the probability that the coin lands on the side it is biased towards (i.e., p > 0.5).
If the coin is biased towards heads (probability 0.5), the probability of heads is p. This case contributes p*0.5 to heads and (1-p)*0.5 to tails.
If the coin is biased towards tails (probability 0.5), the probability of heads is (1-p). This case contributes (1-p)*0.5 to heads and p*0.5 to tails.
Thus, the expected value of heads is p*0.5 + (1-p)*0.5 = 0.5.
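As a quick check of this arithmetic, here is a minimal sketch using exact fractions. The function name `marginal_heads` and the sample values of p are my own choices for illustration, not part of the original discussion.

```python
from fractions import Fraction


def marginal_heads(p, prob_favors_heads=Fraction(1, 2)):
    """Probability of heads on a single flip.

    p is the probability that the coin lands on its favored side (p > 1/2).
    prob_favors_heads is the assumed probability that the favored side is heads.
    """
    return prob_favors_heads * p + (1 - prob_favors_heads) * (1 - p)


# The result is 1/2 no matter how strong the bias is,
# as long as both directions of bias are equally likely.
for p in (Fraction(3, 5), Fraction(3, 4), Fraction(99, 100)):
    print(p, marginal_heads(p))
```

If the direction of the bias were known (say `prob_favors_heads=Fraction(1)`), the same function would return p itself, which is the long-run frequency for that particular coin.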
So there’s no befuddlement, only a change of random variable: from the long-term expectation of the outcome of many flips of one coin, to the long-term expectation of first determining whether heads or tails is favored and then making a single flip. We should expect this, since the random variable we are really being asked about has changed with the different contexts.
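To make the two random variables concrete, here is a rough simulation sketch. The bias of 0.7, the number of trials, and the function names are assumptions of mine chosen only for illustration.

```python
import random


def fixed_coin_frequency(p_heads, n_flips, rng):
    """Flip one coin, with a fixed probability of heads, n_flips times."""
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


def fresh_coin_frequency(p, n_trials, rng):
    """Each trial: choose the favored side with probability 0.5, then flip once."""
    heads = 0
    for _ in range(n_trials):
        p_heads = p if rng.random() < 0.5 else 1 - p
        heads += rng.random() < p_heads
    return heads / n_trials


rng = random.Random(0)  # seeded so the run is repeatable
same_coin = fixed_coin_frequency(0.7, 100_000, rng)
new_coin_each_time = fresh_coin_frequency(0.7, 100_000, rng)
print(same_coin, new_coin_each_time)
```

The first frequency should settle near the coin's own bias, while the second should settle near 0.5. Both are ordinary long-run frequencies; they simply belong to different experiments.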
You just pushed aside your notion of an objective probability and calculated a subjective probability reflecting your partial information. Congratulations, you are a Bayesian.
I applied completely orthodox frequentist probability.
I had predicted your objection would be that expected value is an application of Bayes’ theorem, but I was prepared to argue that orthodox probability does include Bayes’ theorem. It is one of the pillars of any introductory probability textbook.
A problem isn’t “Bayesian” or “frequentist”. The approach is. Frequentists take the priors as given assumptions. The assumptions are incorporated at the beginning as part of the context of the problem, and we know the objective solution depends upon (and is defined within) a given context. A Bayesian, in contrast, has a different perspective and doesn’t require formalizing the priors as given assumptions. Apparently Bayesians are comfortable with asserting that the priors are “subjective”. As a frequentist, I would have to say that the problem is ill-posed (or under-determined) to the extent that the priors/assumptions are really subjective.
Suppose that I tell you I am going to pick up a card randomly and will ask you the probability that it is the ace of hearts. Your correct answer would be 1⁄52, even if I look at the card myself and know with probability 0 or 1 that the card is the ace of hearts. Frequentists have no problem with this “subjectivity”; they understand it as different probabilities for different contexts. This is mainly a response to this comment, but is relevant here.
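The card example can be written out as a small sketch. The deck representation and the function names `prob_guesser` and `prob_dealer` are hypothetical, introduced here only to label the two contexts.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs
TARGET = ("A", "hearts")


def prob_guesser():
    """Context 1: a card drawn uniformly at random, not yet seen."""
    return Fraction(sum(card == TARGET for card in DECK), len(DECK))


def prob_dealer(seen_card):
    """Context 2: the card has been looked at, so the answer is 0 or 1."""
    return Fraction(1) if seen_card == TARGET else Fraction(0)


print(prob_guesser())                 # the guesser's context
print(prob_dealer(("K", "spades")))   # the dealer's context, for one card
# Averaging the dealer's answers over every possible card recovers the
# guesser's answer, so the two contexts are consistent with each other.
print(sum(prob_dealer(card) for card in DECK) / len(DECK))
```

Neither number is wrong. Each is the correct answer to a differently specified problem.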
Yet again, the misunderstanding has arisen from not understanding what is meant by saying the probability is “in” the cards. In this way, Bayesians interpret the frequentist’s language too literally. But what does a frequentist actually mean? Just that the probability is objective? But the objectivity results from the preferred way of framing the problem … I’m willing to consider and have suggested the possibility that this “Platonic probability” is an artifact of a thought process that the frequentist experiences empirically (but mentally).