Being a frequentist who hangs out on a Bayesian forum, I’ve thought about the difference between the two perspectives. I think the dichotomy is analogous to bottom-up versus top-down thinking; neither one is superior to the other, but the usefulness of each waxes and wanes depending upon the current state of a scientific field. I think we need both to develop any field fully.
Possibly my understanding of the difference between a frequentist and Bayesian perspective is different than yours (I am a frequentist after all) so I will describe what I think the difference is here. I think the two POVs can definitely come to the same (true) conclusions, but the algorithm/thought-process feels different.
Consider tossing a fair coin. Everyone observes that on average, heads comes up 50% of the time. A frequentist sees the coin-tossing as a realization of the abstract Platonic truth that the coin has a 50% chance of coming up heads. A Bayesian, in contrast, believes that the realization is the primary thing … the flipping of the coin yields the property of having 50% probability of coming up heads as you flip it. So both perspectives require the observation of many flips to ascertain that the coin is indeed fair; the only difference between the two views is that the frequentist sees the “50% probability of being heads” as something that exists independently of the flips. It’s something you discover rather than something you create.
Seen this way, it sounds like frequentists are Platonists and Bayesians are non-Platonists. Abstract mathematicians tend to be Platonists (but not always) and they’ve lent their bias to the field. Smart Bayesians, on the other hand, tend to be more practical and become experimentalists.
There’s definitely a certain rankle between Platonists and non-Platonists. Non-Platonists think that Platonists are nuts, and Platonists think that the non-Platonists are too literal.
May we consider the hypothesis that this difference is just a difference in brain hard-wiring? When a Platonist thinks about a coin flipping and the probability of getting heads, they really do perceive this “probability” as existing independently. However, what do they mean by “existing independently”? We learn what words mean from experience. A Platonist has experience of this type of perception and knows what they mean. A non-Platonist doesn’t know what is meant and thinks the same thing is meant as what everyone means when they say “a table exists”. These types of existence are different, but how can a Bayesian understand the Platonic meaning without the Platonic experience?
A Bayesian should just observe what does exist, and what words the Platonist uses, and redefine the words to match the experience. This translation must be done similarly with all frequentist mathematics, if you are a Bayesian.
Seen this way, it sounds like frequentists are Platonists and Bayesians are non-Platonists.
Counterexample: I have a Platonic view of mathematical truths, but a Bayesian view of probability.
A frequentist sees the coin-tossing as a realization of the abstract Platonic truth that the coin has a 50% chance of coming up heads.
This does not make sense. For any given coin flip, either the fundamental truth is that the coin will come up heads, or the fundamental truth is that the coin will come up tails. The 50% probability represents my uncertainty about the fundamental truth, which is not a property of the coin.
Counterexample: I have a Platonic view of mathematical truths, but a Bayesian view of probability.
That’s interesting. I had imagined that people would be one way or the other about everything. Can anyone else provide datapoints on whether they are Platonic about only a subset of things?
… in order to triangulate closer to whether Platonism is “hard-wired”, do you find it possible to be non-Platonic about mathematical truths? Can someone who is non-Platonic think about them Platonically—is it a choice?
For any given coin flip, either the fundamental truth is that the coin will come up heads, or the fundamental truth is that the coin will come up tails. The 50% probability represents my uncertainty about the fundamental truth, which is not a property of the coin.
See, that’s just not the way a frequentist sees it. First, I notice that you are defining “fundamental truth” as what will actually happen in the next coin flip. In contrast, it is more natural to me to think of the “fundamental truth” as being what the probability of heads is, as a property of the coin and the flip, since the outcome isn’t determined yet. But that’s just asking different questions. So if the question is, what is the truth about the outcome of the next flip, we are talking about empirical reality (an experiment) and my perspective will be more Bayesian.
The outcome is determined timelessly, by the properties of the coin-tossing setup. It hasn’t happened yet. What came before the coin determines the coin, but in turn is determined by the stuff located further and further in the past from the actual coin-toss. It is a type error to speak of when the outcome is determined.
Whether or not the universe is deterministic is not determined yet. Even if you and I both think that a deterministic universe is more logical, we should accept that certain figures of speech will persist. When I said the toss wasn’t determined yet, I meant that the outcome of the toss was not known yet by me. I don’t see how your correction adds to the discussion except possibly to make me seem naive, like I’ve never considered the concept of determinism before.
what the probability of heads is, as a property of the coin and the flip
I meant that the outcome of the toss was not known yet by me
Map/territory distinction. As a property of the actual coin and flip, the probability of heads is 0 or 1 (modulo some nonzero but utterly negligible quantum uncertainty); as a property of your state of knowledge, it can be 0.5.
This comment helped things come into better focus for me.
A frequentist believes that there is a probability of flipping heads, as a property of the coin and (yes, certainly) the conditions of the flipping. To a frequentist, this probability is independent of whether the outcome is determined or not and is even independent of what the outcome is. Consider the following sequence of flips: H T T
A frequentist believes that the probability of flipping heads was .5 all along right? The first ‘H’ and the second ‘T’ and the third ‘T’ were just discrete realizations of this probability.
The reason I’ve been calling this a Platonic perspective is that I think the critical difference in philosophy is the frequentist idea of this non-empirical “probability” existing independent of realizations. The probability of flipping heads for a set of conditions is .5 whether you actually flip the coins or not. However, frequentists agree you must flip the coin to know that the probability was .5.
You might think this perspective is wrong-headed, and from a strict empirical view where you allow no Platonic entities/concepts, it kind of is. But the question I am really interested in is the following: to what extent is this point of view a choice we can be wrong or right about, or a perspective that some (or most?) people have hard-wired in their physical brain? Further, how can you argue that it isn’t useful when it demonstrably has been so useful? Perhaps it facilitates or is necessary for some categories of abstract thought.
But the question I am really interested in is the following: to what extent is this point of view a choice we can be wrong or right about, or a perspective that most people have hard-wired in their physical brain algorithms?
It could be hard-wired and still be right or wrong.
Correct, generally. But how could a perspective be wrong?
I can think of two ways a perspective can be wrong: either because it (a) asserts a fact about external reality that is not true or (b) yields false conclusions about the external world.
(a) Frequentists don’t assert anything extra about the empirical world; they assert the use of (and ostensibly, the “existence” of) something symbolic. From the empiricist perspective, it’s not really there. It is like a little icon floating above or around the actual thing that your cursor doesn’t interact with, so it can’t be false in the empirical sense.
(b) It would be fascinating if the frequentist perspective yielded false conclusions, and in such a case, is there any doubt that people would develop and embrace new mathematics that avoided such errors? In fact, we already see this happening where physics at extreme scales seems to defy intuition. If someone wanted to propose a new theory of everything I don’t think anyone would ever criticize it on the grounds of not being frequentist. I guess the point here is just that it’s useful or not.
Later edit: Ok, I finally get it. Maybe the reason we don’t understand physics at the extreme scales is because the frequentist approach was evolved (hard-wired) for understanding intermediate physical scales and it’s (apparently) beginning to fail. You guys are using empirical philosophy to try and develop a new brand of mathematics that won’t have these inborn errors of intuition. So while I argue that frequentism has definitely been productive so far, you argue that it is intrinsically limited based on philosophical principles.
A perspective can be wrong if it arbitrarily assigns a probability of 1 to an event that has a symmetrical alternative. Read the intro to My Bayesian Enlightenment for Eliezer’s description of a frequentist going wrong in this way with respect to the problem of the mathematician with two children, at least one of which is a boy.
No, Bayesian probability and orthodox statistics give exactly the same answers if the context of the problem is the same. The two schools may tend to have different ideas about what is a “natural” context, but any good textbook will always define exactly what the context is so that there is no guessing and no disagreement.
Nevertheless, which event with a symmetrical alternative were you referring to? (You are given that the woman said she has at least 1 boy, so it would be correct to assign that probability 1 in the context of a given assumption, obviously when applying the orthodox method.) Both approaches work differently, but they both work.
Nevertheless, which event with a symmetrical alternative were you referring to?
Given that the woman does have a boy and a girl, what is the probability that she would state that at least one of them is a boy? By symmetry, you would expect a priori, not knowing anything about this person’s preferences, that in the same conditions, she is equally likely to state that at least one of her children is a girl, so to assign the conditional probability higher than .5 does not make sense, so it is definitely not right for the frequentist Eliezer was talking with to act as though the conditional probability were 1. (The case could be made that the statement is also evidence that the woman has a tendency to say at least one child is a boy rather than that at least one child is a girl. But this is a small effect, and still does not justify assigning a conditional probability of 1.)
I think the frequentist approach could handle this problem if applied correctly, but it seems that frequentists in practice get it wrong because they do not even consider the conditional probability that they would observe a piece of evidence if a theory they are considering is true.
any good textbook will always define exactly what the context is so that there is no guessing and no disagreement.
If you read the article I cited, Eliezer did explain that this was a mangling of the original problem, in which the mathematician made the statement in response to a direct question, so one could reasonably approximate that she would make the statement exactly when it is true.
However, life does not always present us with neat textbook problems. Sometimes, the conditional probabilities are hard to figure out. I prefer the approach that says figure them out anyways to the one that glosses over their importance.
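To make the role of that conditional probability concrete, here is a minimal sketch that enumerates the four equally likely two-child families exactly. The function name `prob_two_boys` and the parameter `p_say_boy_if_mixed` are made up for illustration, and the model (each child independently a boy or girl with probability 1/2) is the usual textbook assumption rather than anything established in this thread.

```python
from fractions import Fraction
from itertools import product


def prob_two_boys(p_say_boy_if_mixed):
    """P(both boys | mother says 'at least one is a boy').

    p_say_boy_if_mixed (hypothetical parameter): the chance that a mother
    with one boy and one girl makes the 'boy' statement rather than the
    symmetric 'girl' statement.
    """
    quarter = Fraction(1, 4)  # each of BB, BG, GB, GG is equally likely
    joint_two_boys = Fraction(0)
    total_says_boy = Fraction(0)
    for kids in product("BG", repeat=2):
        boys = kids.count("B")
        if boys == 2:
            p_says = Fraction(1)  # the statement is her only true option
        elif boys == 1:
            p_says = Fraction(p_say_boy_if_mixed)
        else:
            p_says = Fraction(0)  # the statement would be false
        total_says_boy += quarter * p_says
        if boys == 2:
            joint_two_boys += quarter * p_says
    return joint_two_boys / total_says_boy


# Answering a direct question: she says it exactly when it is true.
direct_question = prob_two_boys(1)
# Volunteered statement with the symmetric alternative equally likely.
volunteered = prob_two_boys(Fraction(1, 2))
print(direct_question, volunteered)
```

Under these assumptions the direct-question reading gives 1/3 and the symmetric volunteered reading gives 1/2, so the disagreement is entirely about which conditional probability describes how the statement came to be made.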
so to assign the conditional probability higher than .5 does not make sense, so it is definitely not right for the frequentist Eliezer was talking with to act as though the conditional probability were 1
In the “correct” formulation of the problem (the one in which the correct answer is 1⁄3), the frequentist tells us what the mother said as a given assumption; considering the prior <1 probability of this is rendered irrelevant because we are now working in the subset of probability space where she said that.
it seems that frequentists in practice get it wrong because they do not even consider the conditional probability that they would observe a piece of evidence if a theory they are considering is true.
Considering whether a theory is true is science—I completely agree science has important, necessary Bayesian elements.
Giving the “probability” of the actual outcome for the coin flip as ~1 looks like a type error, although it’s clear what you are saying. It’s more like P(coin is heads|coin is heads), tautologically 1, not really a probability.
As a property of the actual coin and flip, the probability of heads is 0 or 1 (modulo some nonzero but utterly negligible quantum uncertainty)
This mixes together two different kinds of probability, confusing the situation. There is nothing fuzzy about the events defining the possible outcomes; the fact that there is also indexical uncertainty imposed on your mind while it observes the outcome belongs to a different problem.
… in order to triangulate closer to whether Platonism is “hard-wired”, do you find it possible to be non-Platonic about mathematical truths? Can someone who is non-Platonic think about them Platonically—is it a choice?
Most of the time I think about math, I do not worry about whether it is Platonic or not. It was really only in the context of considering my epistemic uncertainty that 2+2=4 that I needed to consider the nature of the territory I was mapping, and in this context it did not make sense for the territory to be the physical universe.
In contrast, it is more natural to me to think of the “fundamental truth” as being what the probability of heads is, as a property of the coin and the flip, since the outcome isn’t determined yet.
You mean, the outcome has not been determined by you, since you have not observed all the physical properties of the coin, the person flipping it, and the environment, and calculated out all the physics that would tell you whether it would land heads or tails. Attaching a probability to the coin is just our way of dealing with the ignorance and lack of computing power that prevents us from finding the exact answer.
What is your point? You iterate the Bayesian perspective, but do you agree that frequentists and Bayesians have different perspectives about this?
I think it boils down to this: you are a frequentist (and I’ve been using the term Platonist) if you see the 50% probability as a property of the coin and the flip, and you are a Bayesian if you see the 50% probability as just a way of measuring the uncertainty.
(Given your rationale for being Platonic about mathematics, I don’t know if you are really a Platonist (in the hard-wired sense).)
My point is that the view that 50% probability is a fundamental property of the coin is wrong. It is an example of the Mind Projection Fallacy, thinking that because you don’t know the result, somehow the universe doesn’t either. It is certainly not the case that when asked about the result of a single coin flip, that giving a 50% probability for heads is the best possible answer. One could, in principle, do more investigation, and find that under the current conditions, the coin will come up heads (or tails) with 99% probability, and actually be right 99 times out of a hundred.
I don’t like to call this view of the probability as a fundamental property of the coin the frequentist view. It makes more sense to describe their perspective as the probability being a combined property of the coin and a distribution of conditions in which it could be flipped. From this perspective, the mistake of attaching the probability to the coin is that it misses the fact that you are flipping the coin in one particular condition, which will have a definite outcome. The probability comes from uncertainty about which condition from the distribution applies in this case, and of course, limits on computational power.
Are you saying that frequentists are wrong, or just me?
If the former, how can you say that and consider the case closed when frequentists arrive at correct conclusions? What I’m suggesting is that Bayesians are committing the mind projection fallacy when they assert that frequentists are “wrong”.
I am saying that you are wrong, and I am not sure there isn’t more to the frequentist view than you are saying, so I am not prepared to figure out if it is right or wrong until I know more about what it is saying.
If the former, how can you say that and consider the case closed when frequentists arrive at correct conclusions?
Like in the Monty Hall problem, where the frequentists will agree to the correct answer after you beat them over the head with a computer simulation?
What I’m suggesting is that Bayesians are committing the mind projection fallacy when they assert that frequentists are “wrong”.
Huh? What property of our minds do you think we are projecting onto the territory?
In the Monty Hall problem, intuition tends to insist on the wrong answer, not valid application of frequentist theory.
Just curious: is the Monty Hall solution intuitively obvious to a “Bayesian”, or do they also need to work through the (Bayesian) math in order to be convinced?
Huh? What property of our minds do you think we are projecting onto the territory?
Just curious: is the Monty Hall solution intuitively obvious to a “Bayesian”, or do they also need to work through the (Bayesian) math in order to be convinced?
For me at least, it is not so much that the solution is intuitively obvious as that setting up the Bayesian math forces me to ask the important questions.
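For anyone who wants the computer simulation mentioned above, a minimal sketch follows. It assumes the standard rules (three doors, the host always opens a goat door other than the contestant's pick, choosing at random when he has two options); the names `play` and `win_rate`, the trial count, and the seed are illustrative choices only.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won under a fixed stay-or-switch policy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

The line that picks `opened` is where the important question hides: the host's choice depends on where the car is, and changing that rule (for example, a host who opens a door at random and merely happens to reveal a goat) changes the answer.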
I meant the typical mind fallacy.
Then how do you think we are assuming that others think like us? It seems to me that we notice that others are not thinking like us, and that in this case, the different thinking is an error. I believe that 2+2=4, and if I said that someone was wrong for claiming that 2+2=3, that would not be a typical mind fallacy.
If the conclusions about reality were different, then the 2+2=4 versus 2+2=3 analogy would hold. Instead, you are objecting to the way frequentists approach the problem. (Sometimes, the difference seems to be as subtle as just the way they describe their approach.) Unless you show that they do not as consistently arrive at the correct answer, I think that objecting to their methods is the typical mind fallacy.
Asserting that frequentists are wrong is actually very non-Bayesian, because you have no evidence that the frequentist view is illogical. Only your intuition and logic guides you here. So finally, as two rationalists, we may observe a bona fide difference in what we consider intuitive, natural or logical.
I’m curious about the frequency of “natural” Bayesians and frequentists in the population, and wonder about their co-evolution. I also wonder about their lack of mutual understanding.
You have a coin.
The coin is biased.
You don’t know which way it’s biased or how much it’s biased. Someone just told you, “The coin is biased” and that’s all they said.
This is all the information you have, and the only information you have.
You draw the coin forth, flip it, and slap it down.
Now—before you remove your hand and look at the result—are you willing to say that you assign a 0.5 probability to the coin having come up heads?
The frequentist says, “No. Saying ‘probability 0.5’ means that the coin has an inherent propensity to come up heads as often as tails, so that if we flipped the coin infinitely many times, the ratio of heads to tails would approach 1:1. But we know that the coin is biased, so it can have any probability of coming up heads except 0.5.”
The frequentists get this exactly wrong, ruling out the only correct answer given their knowledge of the situation.
The article goes on to describe scenarios in which having different partial knowledge of the situation leads to different probabilities. The frequentist perspective doesn’t merely lead to the wrong answer for these scenarios, it fails to even produce a coherent analysis. Because there is no single probability attached to the event itself. The probability really is a property of the mind analyzing that event, to the extent that it is sensitive to the partial knowledge of that mind.
The competent frequentist would presumably not be befuddled by these supposed paradoxes. Since he would not be befuddled (or so I am fairly certain), the “paradoxes” fail to prove the superiority of the Bayesian approach.
Eliezer responded with:
Not the last two paradoxes, no. But the first case given, the biased coin whose bias is not known, is indeed a classic example of the difference between Bayesians and frequentists.
and in the post he wrote
The frequentist perspective doesn’t merely lead to the wrong answer for these scenarios, it fails to even produce a coherent analysis.
But the frequentist does have a coherent analysis for solving this problem. Because we’re not actually interested in the long-term probability of flipping heads (of which all anyone can say is that it is not .5) but the expected outcome of a single flip of a biased coin. This is an expected value calculation, and I’ll even apply your idea about events with symmetric alternatives. (So I do not have to make any assumptions about the shape of the distribution of possible biases.)
I will calculate my expected value using that the coin is biased towards heads or it is biased towards tails with equal probability. Let p be the probability that the coin flips to the biased orientation (i.e., p>.5).
The probability of heads is p with probability of 0.5. The probability of tails in this case is (1-p)*0.5.
The probability of heads is (1-p) with probability of 0.5. The probability of tails in this case is (p)*0.5.
Thus, the expected value of heads is p*0.5 + (1-p)*0.5 = 0.5.
So there’s no befuddlement, only a change in random variables from the long-term expectation of the outcome of many flips to the long-term expectation of whether heads or tails is preferred and a single flip. Which we should expect, since the random variable we are really being asked about has changed with the different contexts.
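The calculation above can be checked exactly for any particular bias. This is only a sketch of that arithmetic; the function name `prob_heads_unknown_direction` is made up for illustration, and the equal odds on the direction of the bias are the symmetry assumption stated above.

```python
from fractions import Fraction


def prob_heads_unknown_direction(p):
    """P(heads on one flip) when the coin lands on its favoured side with
    probability p, and the favoured side is heads or tails with equal odds."""
    half = Fraction(1, 2)
    # Biased towards heads: heads with probability p.
    # Biased towards tails: heads with probability 1 - p.
    return half * p + half * (1 - p)


for p in (Fraction(6, 10), Fraction(3, 4), Fraction(99, 100)):
    print(p, prob_heads_unknown_direction(p))
```

The p terms cancel, so the result is 1/2 no matter how strong the bias is, which is why no assumption about the distribution of possible biases is needed.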
You just pushed aside your notion of an objective probability and calculated a subjective probability reflecting your partial information. Congratulations, you are a Bayesian.
I applied completely orthodox frequentist probability.
I had predicted your objection would be that expected value is an application of Bayes’ theorem, but I was prepared to argue that orthodox probability does include Bayes’ theorem. It is one of the pillars of any introductory probability textbook.
A problem isn’t “Bayesian” or “frequentist”. The approach is. Frequentists take the priors as given assumptions. The assumptions are incorporated at the beginning as part of the context of the problem, and we know the objective solution depends upon (and is defined within) a given context. A Bayesian in contrast, has a different perspective and doesn’t require formalizing the priors as given assumptions. Apparently they are comfortable with asserting that the priors are “subjective”. As a frequentist, I would have to say that the problem is ill-posed (or under-determined) to the extent that the priors/assumptions are really subjective.
Suppose that I tell you I am going to pick up a card randomly and will ask you the probability of whether it is the ace of hearts. Your correct answer would be 1⁄52, even if I look at the card myself and know with probability 0 or 1 that the card is the ace of hearts. Frequentists have no problem with this “subjectivity”, they understand it as different probabilities for different contexts. This is mainly a response to this comment, but is relevant here.
Yet again, the misunderstanding has arisen from not understanding what is meant by saying the probability is “in” the cards. In this way, Bayesians interpret the frequentist’s language too literally. But what does a frequentist actually mean? Just that the probability is objective? But the objectivity results from the preferred way of framing the problem … I’m willing to consider and have suggested the possibility that this “Platonic probability” is an artifact of a thought process that the frequentist experiences empirically (but mentally).
I am a Platonist about mathematics by inclination, though I strongly suspect that this inclination is one that I should resist taking too seriously. I am a Bayesian about probability (at least in the following sense: it seems to me that the Bayesian approach subsumes the others, when they are applied correctly). I am mostly Bayesian about statistics, but don’t see any reason why you shouldn’t compute confidence intervals and unbiased estimators if you want to. I don’t think “Platonist” and “frequentist” are at all the same thing, so I don’t see any of the above as indicating that I’m (inclined to be) Platonist about some things but not about others.
[...] the fundamental truth [...]
This seems to have prompted a debate about whether The Fundamental Truth is one about the general propensities of the coin, or one about what will happen the next time it’s flipped. I don’t see why there should be exactly one Fundamental Truth about the coin; I’d have thought there would be either none or many depending on what sort of thing one wishes to count as a “fundamental truth”.
Anyway: imagine a precision robot coin-flipper. I hope it’s clear that with such a device one could arrange that the next million flips of the coin all come up heads, and then melt it down. So whatever “fundamental truth” there might be about What The Coin Will Do has to be relative to some model of what’s going to be done to it. The point of coin-flipping is that it’s a sort of randomness magnifier: small variations in what you do to it make bigger differences to what it does, so a small patch of possibility-space gets turned into a somewhat-uniform sampling of a larger patch (caution: Liouville, volume conservation, etc.). And the “fundamental truth” about the coin that you’re appealing to is that, plus what it implies about its ability to turn kinda-sorta-slightly-random-ish coin flipping actions into much more random-ish outcomes. To turn that into an actual expectation of (more or less) independent p=1/2 Bernoulli trials, you need to add some assumption about how people actually flip coins, and then the magic of physics means that a wide range of such assumptions all lead to very similar-looking conclusions about what the outcomes are likely to look like.
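A toy illustration of the "randomness magnifier" idea, offered as a sketch only: the coin is modelled as a rigid disc launched heads-up that spins at a constant rate and is caught at launch height, so the outcome is a deterministic function of the launch. The model, the function name `lands_heads`, and every numeric range below are assumptions chosen for illustration, not measurements of real coin flips.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2


def lands_heads(upward_speed, spin_rate):
    """Deterministic toy coin: upward_speed in m/s, spin_rate in rad/s."""
    flight_time = 2 * upward_speed / G  # time to return to launch height
    angle = spin_rate * flight_time     # total rotation in radians
    return math.cos(angle) > 0          # True when the heads side faces up


# "Precision robot": identical launch every time, so identical outcome.
robot = [lands_heads(2.5, 200.0) for _ in range(1000)]

# "Human": launch conditions scattered over an assumed range.
rng = random.Random(0)
trials = 100_000
human_heads = sum(
    lands_heads(rng.uniform(2.0, 3.0), rng.uniform(150.0, 250.0))
    for _ in range(trials)
)
human_rate = human_heads / trials
print(len(set(robot)), round(human_rate, 3))
```

The robot's thousand flips all agree, while the scattered launches come out close to half heads. In this sketch the near-1/2 frequency comes from the spread of launch conditions passing through a rapidly alternating deterministic map, and swapping in other reasonably wide ranges should give a similar figure.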
In other words: an accurate version of the frequentist way of looking at the coin’s behaviour starts with some assumption (wherever it happens to come from) about how coins actually get flipped, mixes that with some (not really probabilistic) facts about the coin, and ends up with a conclusion about what the coin is likely to do when flipped, which doesn’t depend too sensitively on that assumption we made.
Whereas a Bayesian way of looking at it starts with some assumption (wherever it happens to come from) about what happens when coins get flipped, mixes that with some (not really probabilistic) facts about what the coin has been observed to do and perhaps a bit of physics, and ends up with a conclusion about what the coin is likely to do when flipped in the future, which doesn’t depend too sensitively on that assumption we made.
Clearly the philosophical differences here are irreconcilable...
As a property of the coin and the flip and the environment and the laws of physics, the probability of heads is either 0 or 1. Just because you haven’t computed it doesn’t mean the answer becomes a superposition of what you might compute, or something.
What you want is something like the result of taking a natural generalization of the exact situation—if the universe is continuous and the system is chaotic enough “round to some precision” works—and then computing the answer in this parameterized space of situations, and then averaging over the parameter.
The problem is that “natural generalization” is pretty hard to define.
Being a Platonist and a frequentist aren’t the same thing, but they correlate because they’re both errors in thinking.
The objection to frequentism is that it builds the answer into the solution so the problem actually changes from the original real world problem. This is fine as long as you can test discrepancies between theory and practice, but that’s not always going to be possible.
“A Bayesian, in contrast, believes that the realization is the primary thing … the flipping of the coin yields the property of having 50% probability of coming up heads as you flip it.”
Thanks for trying to explain the difference, but I have no idea what this means.
What I was thinking about was this: Bayesians and frequentists both agree that if a fair coin is tossed n times (where n is very large) then a string of heads and tails will result and the probability of heads is .5 in some way related to the fact that the number of heads divided by n will approach .5 for large n.
In my mind, the frequentist perspective is that the .5 probability of getting heads exists first, and then the string of heads and tails realizes (i.e., makes a physical manifestation of) this abstract probability lurking in the background. As though there is a bin of heads and tails somewhere with exactly a 1:1 ratio and each flip picks randomly from this bin. The Bayesian perspective is that there is nothing but the string of heads and tails: only the string exists, and there’s no abstract probability that the string is a realization of. No picking from a bin in the sky. Inspecting the string, a Bayesian can calculate the 0.5 probability … so the 0.5 probability results from the string. So according to me, the philosophical debate boils down to: what comes first, the probability or the string?
I definitely get the impression that the Bayesians in this thread are skeptical of this description of the difference, and seem to prefer describing the difference of the Bayesian view as considering probability a measure of your uncertainty. However, probability is also taught as a measure of uncertainty in classical probability, so I’m skeptical of this dichotomy. (In favor of my view, the name “frequentist” comes from the observation that they believe in a notion of “frequency”—i.e., that there’s a hypothetical distribution “out there” that observed data is being sampled from.)
Perhaps the difference in whether the correct approach is subjective or objective better gets to the heart of the difference. I am leaning towards this hypothesis because I can see how a frequentist can confuse something being objective with that something having an independent “existence”.
I have a little difficulty with the notion that the probable outcome of a coin toss is the result of the toss, rather like the collapse of a quantum probability into reality when observed. Looking at the coin before the toss, surely three probabilities may be objectively observed - H, T or E, and the likelihood of the coin coming to rest on its edge dismissed.
Since the coin MUST then end up H or T, the sum of both probabilities is 1; both outcomes are a priori equally likely and have the value 1/2 before the toss. Whether one chooses to believe that the a priori probabilities have actual existence is a metaphysical issue.
A Bayesian should just observe what does exist, and what words the Platonist uses, and redefine the words to match the experience. This translation must be done similarly with all frequentist mathematics, if you are a Bayesian.
Counterexample: I have a Platonic view of mathematical truths, but a Bayesian view of probability.
This does not make sense. For any given coin flip, either the fundamental truth is that the coin will come up heads, or the fundamental truth is that the coin will come up tails. The 50% probability represents my uncertainty about the fundamental truth, which is not a property of the coin.
That’s interesting. I had imagined that people would be one way or the other about everything. Can anyone else provide datapoints on whether they are Platonic about only a subset of things?
… in order to triangulate closer to whether Platonism is “hard-wired”, do you find it possible to be non-Platonic about mathematical truths? Can someone who is non-Platonic think about them Platonically—is it a choice?
See, that’s just not the way a frequentist sees it. At first I notice, you are defining “fundamental truth” as what will actually happen in the next coin flip. In contrast, it is more natural to me to think of the “fundamental truth” as being what the probability of heads is, as a property of the coin and the flip, since the outcome isn’t determined yet. But that’s just asking different questions. So if the question is, what is the truth about the outcome of the next flip, we are talking about empirical reality (an experiment) and my perspective will be more Bayesian.
The outcome is determined timelessly, by the properties of the coin-tossing setup. It hasn’t happened yet. What came before the coin determines the coin, but in turn is determined by the stuff located further and further in the past from the actual coin-toss. It is a type error to speak of when the outcome is determined.
Whether or not the universe is deterministic is not determined yet. Even if you and I both think that a deterministic universe is more logical, we should accept that certain figures of speech will persist. When I said the toss wasn’t determined yet, I meant that the outcome of the toss was not known yet by me. I don’t see how your correction adds to the discussion except possibly to make me seem naive, like I’ve never considered the concept of determinism before.
Map/territory distinction. As a property of the actual coin and flip, the probability of heads is 0 or 1 (modulo some nonzero but utterly negligible quantum uncertainty); as a property of your state of knowledge, it can be 0.5.
This comment helped things come into better focus for me.
A frequentist believes that there is a probability of flipping heads, as a property of the coin and (yes, certainly) the conditions of the flipping. To a frequentist, this probability is independent of whether the outcome is determined or not and is even independent of what the outcome is. Consider the following sequence of flips: H T T
A frequentist believes that the probability of flipping heads was .5 all along, right? The first ‘H’ and the second ‘T’ and the third ‘T’ were just discrete realizations of this probability.
The reason I’ve been calling this a Platonic perspective is that I think the critical difference in philosophy is the frequentist idea of this non-empirical “probability” existing independent of realizations. The probability of flipping heads for a set of conditions is .5 whether you actually flip the coins or not. However, frequentists agree you must flip the coin to know that the probability was .5.
You might think this perspective is wrong-headed, and from a strict empirical view where you allow no Platonic entities/concepts, it kind of is. But the question I am really interested in is the following: to what extent is this point of view a choice we can be wrong or right about, or a perspective that some (or most?) people have hard-wired in their physical brain? Further, how can you argue that it isn’t useful when it demonstrably has been so useful? Perhaps it facilitates or is necessary for some categories of abstract thought.
It could be hard-wired and still be right or wrong.
Correct, generally. But how could a perspective be wrong?
I can think of two ways a perspective can be wrong: either because it (a) asserts a fact about external reality that is not true or (b) yields false conclusions about the external world.
(a) Frequentists don’t assert anything extra about the empirical world; they assert the use of (and ostensibly, the “existence” of) something symbolic. From the empiricist perspective, it’s not really there. It’s like a little icon floating above or around the actual thing that your cursor doesn’t interact with, so it can’t be false in the empirical sense.
(b) It would be fascinating if the frequentist perspective yielded false conclusions, and in such a case, is there any doubt that people would develop and embrace new mathematics that avoided such errors? In fact, we already see this happening where physics at extreme scales seems to defy intuition. If someone wanted to propose a new theory of everything I don’t think anyone would ever criticize it on the grounds of not being frequentist. I guess the point here is just that it’s useful or not.
Later edit: Ok, I finally get it. Maybe the reason we don’t understand physics at the extreme scales is because the frequentist approach was evolved (hard-wired) for understanding intermediate physical scales and it’s (apparently) beginning to fail. You guys are using empirical philosophy to try and develop a new brand of mathematics that won’t have these inborn errors of intuition. So while I argue that frequentism has definitely been productive so far, you argue that it is intrinsically limited based on philosophical principles.
A perspective can be wrong if it arbitrarily assigns a probability of 1 to an event that has a symmetrical alternative. Read the intro to My Bayesian Enlightenment for Eliezer’s description of a frequentist going wrong in this way with respect to the problem of the mathematician with two children, at least one of which is a boy.
No, Bayesian probability and orthodox statistics give exactly the same answers if the context of the problem is the same. The two schools may tend to have different ideas about what is a “natural” context, but any good textbook will always define exactly what the context is so that there is no guessing and no disagreement.
Nevertheless, which event with a symmetrical alternative were you referring to? (You are given that the woman said she has at least 1 boy, so it would be correct to assign that probability 1 in the context of a given assumption, obviously when applying the orthodox method.) Both approaches work differently, but they both work.
Given that the woman does have a boy and a girl, what is the probability that she would state that at least one of them is a boy? By symmetry, you would expect a priori, not knowing anything about this person’s preferences, that in the same conditions, she is equally likely to state that at least one of her children is a girl, so to assign the conditional probability higher than .5 does not make sense. It is therefore definitely not right for the frequentist Eliezer was talking with to act as though the conditional probability were 1. (The case could be made that the statement is also evidence that the woman has a tendency to say at least one child is a boy rather than that at least one child is a girl. But this is a small effect, and still does not justify assigning a conditional probability of 1.)
I think the frequentist approach could handle this problem if applied correctly, but it seems that frequentists in practice get it wrong because they do not even consider the conditional probability that they would observe a piece of evidence if a theory they are considering is true.
If you read the article I cited, Eliezer did explain that this was a mangling of the original problem, in which the mathematician made the statement in response to a direct question, so one could reasonably approximate that she would make the statement exactly when it is true.
However, life does not always present us with neat textbook problems. Sometimes, the conditional probabilities are hard to figure out. I prefer the approach that says figure them out anyways to the one that glosses over their importance.
In the “correct” formulation of the problem (the one in which the correct answer is 1⁄3), the frequentist tells us what the mother said as a given assumption; considering the prior <1 probability of this is rendered irrelevant because we are now working in the subset of probability space where she said that.
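To make the disagreement above concrete, here is a minimal sketch (not anything either commenter wrote) that computes the probability of two boys by exact enumeration under two different assumptions about when the mother makes her statement. The names `posterior_two_boys`, `asked_directly` and `volunteered` are made up for illustration, and the 1/2 chance of mentioning a boy when she has one of each is just the symmetry assumption described above.

```python
from fractions import Fraction

# Four equally likely two-child families (older child, younger child).
FAMILIES = ["BB", "BG", "GB", "GG"]
PRIOR = Fraction(1, 4)


def posterior_two_boys(p_says_boy):
    """P(two boys | she says "at least one is a boy"), via Bayes' theorem.

    p_says_boy maps each family to the ASSUMED probability that the
    mother makes that statement.
    """
    joint = {fam: PRIOR * p_says_boy[fam] for fam in FAMILIES}
    return joint["BB"] / sum(joint.values())


# Assumption 1: she answers a direct question, so she says it exactly when true.
asked_directly = {"BB": Fraction(1), "BG": Fraction(1),
                  "GB": Fraction(1), "GG": Fraction(0)}

# Assumption 2: she volunteers the remark and, with one of each, is equally
# likely to mention a boy or a girl (the symmetry argument).
volunteered = {"BB": Fraction(1), "BG": Fraction(1, 2),
               "GB": Fraction(1, 2), "GG": Fraction(0)}

print(posterior_two_boys(asked_directly))  # 1/3
print(posterior_two_boys(volunteered))     # 1/2
```

Both numbers come out of the same arithmetic; the only thing that changes is the stated assumption about how the statement came to be made.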
Considering whether a theory is true is science—I completely agree science has important, necessary Bayesian elements.
Considering whether a theory is true is not science, although the two are certainly useful to each other.
Giving the “probability” of the actual outcome of the coin flip as ~1 looks like a type error, although it’s clear what you are saying. It’s more like P(coin is heads|coin is heads), tautologically 1, not really a probability.
Edited to clarify.
This mixes together two different kinds of probability, confusing the situation. There is nothing fuzzy about the events defining the possible outcomes, the fact that there is also indexical uncertainty imposed on your mind while it observes the outcome is from a different problem.
Yeah, it just felt like too much work to add ”...randomly sampling from future Everett branches according to the Born probabilities” or the like.
My point is that most of the time decision-theoretic problems are best handled in a deterministic world.
Hence it’s your uncertainty, which can as well be handled in deterministic world. And in deterministic world, I don’t know how to parse your sentence
Most of the time I think about math, I do not worry about whether it is Platonic or not. It was really only in the context of considering my epistemic uncertainty that 2+2=4 that I needed to consider the nature of the territory I was mapping, and in this context it did not make sense for the territory to be the physical universe.
You mean, the outcome has not been determined by you, since you have not observed all the physical properties of coin, the person flipping it, and the environment, and calculated out all the physics that would tell you whether it would land heads or tails. Attaching a probability to the coin is just our way of dealing with the ignorance and lack of computing power that prevents us from finding the exact answer.
What is your point? You iterate the Bayesian perspective, but do you agree that frequentists and Bayesians have different perspectives about this?
I think it boils down to this: you are a frequentist (and I’ve been using the term Platonist) if you see the 50% probability as a property of the coin and the flip, and you are a Bayesian if you see the 50% probability as just a way of measuring the uncertainty.
(Given your rationale for being Platonic about mathematics, I don’t know if you are really a Platonist (in the hard-wired sense).)
My point is that the view that 50% probability is a fundamental property of the coin is wrong. It is an example of the Mind Projection Fallacy, thinking that because you don’t know the result, somehow the universe doesn’t either. It is certainly not the case that when asked about the result of a single coin flip, that giving a 50% probability for heads is the best possible answer. One could, in principle, do more investigation, and find that under the current conditions, the coin will come up heads (or tails) with 99% probability, and actually be right 99 times out of a hundred.
I don’t like to call this view of the probability as a fundamental property of the coin the frequentist view. It makes more sense to describe their perspective as the probability being a combined property of the coin and a distribution of conditions in which it could be flipped. From this perspective, the mistake of attaching the probability to the coin is that you miss the fact that you are flipping the coin in one particular condition, which will have a definite outcome. The probability comes from uncertainty about which condition from the distribution applies in this case and, of course, from limits on computational power.
Are you saying that frequentists are wrong, or just me?
If the former, how can you say that and consider the case closed when frequentists arrive at correct conclusions? What I’m suggesting is that Bayesians are committing the mind projection fallacy when they assert that frequentists are “wrong”.
I am saying that you are wrong, and I am not sure there isn’t more to the frequentist view than you are saying, so I am not prepared to figure out if it is right or wrong until I know more about what it is saying.
Like in the Monty Hall problem, where the frequentists will agree to the correct answer after you beat them over the head with a computer simulation?
Huh? What property of our minds do you think we are projecting onto the territory?
In the Monty Hall problem, intuition tends to insist on the wrong answer, not valid application of frequentist theory.
Just curious: is the Monty Hall solution intuitively obvious to a “Bayesian”, or do they also need to work through the (Bayesian) math in order to be convinced?
Oops. I meant the typical mind fallacy.
For me at least, it is not so much that the solution is intuitively obvious as that setting up the Bayesian math forces me to ask the important questions.
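As a rough illustration of what setting up the math forces you to write down, here is a small sketch that enumerates the Monty Hall problem exactly instead of simulating it. It assumes the standard rules, which the comments above do not spell out: the host knows where the car is, always opens a goat door other than the contestant's pick, and chooses uniformly when two such doors are available. The function name `win_probabilities` is made up for this example.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def win_probabilities():
    """Return (P(win if stay), P(win if switch)) by exact enumeration."""
    stay = switch = Fraction(0)
    for car, pick in product(DOORS, DOORS):
        # Car location and first pick are independent and uniform.
        p_case = Fraction(1, 9)
        # Doors the host is allowed to open under the assumed rules.
        openable = [d for d in DOORS if d != pick and d != car]
        for opened in openable:
            p = p_case / len(openable)
            # The one door that is neither picked nor opened.
            switched_to = next(d for d in DOORS if d not in (pick, opened))
            if pick == car:
                stay += p
            if switched_to == car:
                switch += p
    return stay, switch


stay, switch = win_probabilities()
print(stay, switch)  # 1/3 2/3
```

Changing the host's rule (for example, letting him open a door at random) changes `openable` and therefore the answer, which is exactly the kind of question the enumeration makes you ask.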
Then how do you think we are assuming that others think like us? It seems to me that we notice that others are not thinking like us, and that in this case, the different thinking is an error. I believe that 2+2=4, and if I said that someone was wrong for claiming that 2+2=3, that would not be a typical mind fallacy.
If the conclusions about reality were different, then the 2+2=4 versus 2+2=3 analogy would hold. Instead, you are objecting to the way frequentists approach the problem. (Sometimes, the difference seems to be as subtle as just the way they describe their approach.) Unless you show that they do not as consistently arrive at the correct answer, I think that objecting to their methods is the typical mind fallacy.
Asserting that frequentists are wrong is actually very non-Bayesian, because you have no evidence that the frequentist view is illogical. Only your intuition and logic guides you here. So finally, as two rationalists, we may observe a bona fide difference in what we consider intuitive, natural or logical.
I’m curious about the frequency of “natural” Bayesians and frequentists in the population, and wonder about their co-evolution. I also wonder about their lack of mutual understanding.
From Probability is in the Mind:
The frequentists get this exactly wrong, ruling out the only correct answer given their knowledge of the situation.
The article goes on to describe scenarios in which having different partial knowledge of the situation leads to different probabilities. The frequentist perspective doesn’t merely lead to the wrong answer for these scenarios; it fails to even produce a coherent analysis, because there is no single probability attached to the event itself. The probability really is a property of the mind analyzing that event, to the extent that it is sensitive to the partial knowledge of that mind.
I like the response of Constant2:
Eliezer responded with:
and in the post he wrote
But the frequentist does have a coherent analysis for solving this problem. Because we’re not actually interested in the long-term probability of flipping heads (of which all anyone can say is that it is not .5) but the expected outcome of a single flip of a biased coin. This is an expected value calculation, and I’ll even apply your idea about events with symmetric alternatives. (So I do not have to make any assumptions about the shape of the distribution of possible biases.)
I will calculate my expected value using that the coin is biased towards heads or it is biased towards tails with equal probability. Let p be the probability that the coin flips to the biased orientation (i.e., p>.5).
If the coin is biased towards heads (probability 0.5), the probability of heads is p, which contributes p*0.5. The probability of tails in this case contributes (1-p)*0.5.
If the coin is biased towards tails (probability 0.5), the probability of heads is (1-p), which contributes (1-p)*0.5. The probability of tails in this case contributes p*0.5.
Thus, the expected value of heads is p*0.5 + (1-p)*0.5 = 0.5.
So there’s no befuddlement, only a change in random variables from the long-term expectation of the outcome of many flips to the long-term expectation of whether heads or tails is preferred and a single flip. Which we should expect, since the random variable we are really being asked about has changed with the different contexts.
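The calculation above can be checked mechanically. This is only a sketch: `p_heads_single_flip` is a made-up name, and the bias values tried are arbitrary examples, not numbers from the discussion.

```python
from fractions import Fraction


def p_heads_single_flip(p):
    """P(heads) on one flip of a coin that lands on its favored side with
    probability p, when the favored side is heads or tails with equal
    probability (the symmetry assumption used above)."""
    half = Fraction(1, 2)
    favored_heads = half * p        # biased towards heads, and lands heads
    favored_tails = half * (1 - p)  # biased towards tails, but lands heads
    return favored_heads + favored_tails


for p in (Fraction(6, 10), Fraction(3, 4), Fraction(99, 100)):
    print(p, p_heads_single_flip(p))  # second value is 1/2 for every p
```

The p terms cancel, so the answer for a single flip does not depend on how strong the bias is, only on the assumption that either direction of bias is equally likely.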
You just pushed aside your notion of an objective probability and calculated a subjective probability reflecting your partial information. Congratulations, you are a Bayesian.
I applied completely orthodox frequentist probability.
I had predicted your objection would be that expected value is an application of Bayes’ theorem, but I was prepared to argue that orthodox probability does include Bayes’ theorem. It is one of the pillars of any introductory probability textbook.
A problem isn’t “Bayesian” or “frequentist”. The approach is. Frequentists take the priors as given assumptions. The assumptions are incorporated at the beginning as part of the context of the problem, and we know the objective solution depends upon (and is defined within) a given context. A Bayesian in contrast, has a different perspective and doesn’t require formalizing the priors as given assumptions. Apparently they are comfortable with asserting that the priors are “subjective”. As a frequentist, I would have to say that the problem is ill-posed (or under-determined) to the extent that the priors/assumptions are really subjective.
Suppose that I tell you I am going to pick up a card randomly and will ask you the probability of whether it is the ace of hearts. Your correct answer would be 1⁄52, even if I look at the card myself and know with probability 0 or 1 that the card is the ace of hearts. Frequentists have no problem with this “subjectivity”, they understand it as different probabilities for different contexts. This is mainly a response to this comment, but is relevant here.
Yet again, the misunderstanding has arisen because of not understanding what is meant by saying the probability is “in” the cards. In this way, Bayesians interpret the frequentist’s language too literally. But what does a frequentist actually mean? Just that the probability is objective? But the objectivity results from the preferred way of framing the problem … I’m willing to consider and have suggested the possibility that this “Platonic probability” is an artifact of a thought process that the frequentist experiences empirically (but mentally).
I’m Platonistic in general I suppose, but I see Bayesianism as subjectively objective as a Platonistic truth.
I am a Platonist about mathematics by inclination, though I strongly suspect that this inclination is one that I should resist taking too seriously. I am a Bayesian about probability (at least in the following sense: it seems to me that the Bayesian approach subsumes the others, when they are applied correctly). I am mostly Bayesian about statistics, but don’t see any reason why you shouldn’t compute confidence intervals and unbiased estimators if you want to. I don’t think “Platonist” and “frequentist” are at all the same thing, so I don’t see any of the above as indicating that I’m (inclined to be) Platonist about some things but not about others.
This seems to have prompted a debate about whether The Fundamental Truth is one about the general propensities of the coin, or one about what will happen the next time it’s flipped. I don’t see why there should be exactly one Fundamental Truth about the coin; I’d have thought there would be either none or many depending on what sort of thing one wishes to count as a “fundamental truth”.
Anyway: imagine a precision robot coin-flipper. I hope it’s clear that with such a device one could arrange that the next million flips of the coin all come up heads, and then melt it down. So whatever “fundamental truth” there might be about What The Coin Will Do has to be relative to some model of what’s going to be done to it. The point of coin-flipping is that it’s a sort of randomness magnifier: small variations in what you do to it make bigger differences to what it does, so a small patch of possibility-space gets turned into a somewhat-uniform sampling of a larger patch (caution: Liouville, volume conservation, etc.). And the “fundamental truth” about the coin that you’re appealing to is that, plus what it implies about its ability to turn kinda-sorta-slightly-random-ish coin flipping actions into much more random-ish outcomes. To turn that into an actual expectation of (more or less) independent p=1/2 Bernoulli trials, you need to add some assumption about how people actually flip coins, and then the magic of physics means that a wide range of such assumptions all lead to very similar-looking conclusions about what the outcomes are likely to look like.
In other words: an accurate version of the frequentist way of looking at the coin’s behaviour starts with some assumption (wherever it happens to come from) about how coins actually get flipped, mixes that with some (not really probabilistic) facts about the coin, and ends up with a conclusion about what the coin is likely to do when flipped, which doesn’t depend too sensitively on that assumption we made.
Whereas a Bayesian way of looking at it starts with some assumption (wherever it happens to come from) about what happens when coins get flipped, mixes that with some (not really probabilistic) facts about what the coin has been observed to do and perhaps a bit of physics, and ends up with a conclusion about what the coin is likely to do when flipped in the future, which doesn’t depend too sensitively on that assumption we made.
Clearly the philosophical differences here are irreconcilable...
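The “randomness magnifier” point can be sketched with a deliberately crude toy model. Everything here is an assumption made for illustration: the coin starts heads up, spins at a constant rate, flies for a fixed time, and shows heads after an even number of half-turns. The spin rate of 200 rad/s, the 0.5 s flight time and the function names are hypothetical, not measurements.

```python
import math
import random


def lands_heads(omega, flight_time):
    """Deterministic toy coin: starts heads up, spins at omega rad/s for
    flight_time seconds, shows heads after an even number of half-turns."""
    half_turns = math.floor(omega * flight_time / math.pi)
    return half_turns % 2 == 0


def heads_fraction(omega_center, spread, flight_time=0.5, n=100_000, seed=0):
    """Fraction of heads when the spin rate is drawn uniformly from
    [omega_center - spread, omega_center + spread]."""
    rng = random.Random(seed)
    hits = sum(
        lands_heads(rng.uniform(omega_center - spread, omega_center + spread),
                    flight_time)
        for _ in range(n)
    )
    return hits / n


# A "precision robot": no variation in the flip, so every flip lands the same way.
print(heads_fraction(omega_center=200.0, spread=0.0))
# A sloppy human: about 10% variation in spin rate.
print(heads_fraction(omega_center=200.0, spread=20.0))
```

With no variation the model gives the same face every time; with a modest spread in spin rate the fraction of heads comes out roughly, though not exactly, one half. The physics is identical in both runs, and only the assumed distribution over flipping conditions differs.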
As a property of the coin and the flip and the environment and the laws of physics, the probability of heads is either 0 or 1. Just because you haven’t computed it doesn’t mean the answer becomes a superposition of what you might compute, or something.
What you want is something like the result of taking a natural generalization of the exact situation—if the universe is continuous and the system is chaotic enough “round to some precision” works—and then computing the answer in this parameterized space of situations, and then averaging over the parameter.
The problem is that “natural generalization” is pretty hard to define.
Being a Platonist and a frequentist aren’t the same thing, but they correlate because they’re both errors in thinking.
The objection to frequentism is that it builds the answer into the solution so the problem actually changes from the original real world problem. This is fine as long as you can test discrepancies between theory and practice, but that’s not always going to be possible.
“A Bayesian, in contrast, believes that the realization is the primary thing … the flipping of the coin yields the property of having 50% probability of coming up heads as you flip it.”
Thanks for trying to explain the difference, but I have no idea what this means.
What I was thinking about was this: Bayesians and frequentists both agree that if a fair coin is tossed n times (where n is very large) then a string of heads and tails will result and the probability of heads is .5 in some way related to the fact that the number of heads divided by n will approach .5 for large n.
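The point both camps agree on, that the fraction of heads approaches .5 as n grows, is easy to see in a quick simulation. This is a sketch only: it uses Python's pseudo-random generator as a stand-in for a fair coin, and `heads_frequency` and the seed are arbitrary choices made for reproducibility.

```python
import random


def heads_frequency(n_flips, seed=0):
    """Fraction of heads in n_flips simulated fair-coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

Note that the simulation itself takes a side in the argument: the 0.5 is written into the code before any flip happens, which is the “probability first, string second” picture.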