Why write down the basics of logic if they are so evident?
In My Bayesian Enlightenment, Eliezer writes that he was born a Bayesian: that he no more decided to become a Bayesian than a fish decides to breathe water.
Maybe all people are born Bayesian? But in that case, why doesn’t everyone use Bayesian statistics? Why do many people learn little new by studying Bayesianism, while for others almost everything in it is new? And finally, why are people who read books much better Bayesians than those who spend all their time on a farm?
I think I have found a very simple and good explanation for this phenomenon.
Imagine that you live in a world where cars are everywhere. Even if you have never deliberately studied cars, your brain automatically notices that these iron boxes are fast and can suddenly change direction or stop. Your brain learns about cars on its own, and as a result you understand them intuitively.
What if you live in a world full of people? Then, just by spending time with them, you will find that they are rarely dangerous, that they look like you, and that they do not like it when you eat food that they call “their own”...
And what if you live in a world whose structure is Bayesian (as, for example, ours is)? If you live among Bayesian networks and evidence, and your brain is constantly forming and testing hypotheses? Then you will automatically learn the art of Bayesianism. This is a simple theory, and it also explains why people who read a lot of books are better Bayesians than rural residents who spend all day on a farm: readers have simply seen more situations, more plot twists, and, in general, more cause-and-effect relationships.
So we should teach people Bayesianism and basic logic, because for some people it is not as obvious as it is for you.
Finally, I think that from birth we are no more Bayesians than we are race car drivers, but living in a Bayesian world, we inevitably learn it.
Edit: In fact, the structure of the world is not Bayesian, it’s just that Bayesianism is convenient for describing the world. Therefore, now I understand this story this way: people learn logic, Bayesianism and frequentism, because the world they live in is well described by these theories.
What does it even mean to live in a Bayesian universe? A universe needs some basic properties to make probabilistic reasoning possible and meaningful. In a deterministic universe, probabilistic reasoning is only meaningful if the reasoner is limited, so that there is some Knightian uncertainty. It is always meaningful in an indeterministic universe if the indeterminism itself follows statistical patterns; otherwise you have possibility without probability. (Complex systems can defy simple statistical patterns, as Pinker notes.) Frequentism also requires events to fall into reference classes with n > 1 members. That is clearly neither always nor never the case in our universe. Bayesian probability is much less demanding.
Which should we choose, frequentism or Bayesianism? Frequentism is better wherever you can use it, because it is objective. But Bayesianism can be used where frequentism can’t. So the answer is “both”: use frequentism when you can, and Bayesianism when you can’t.
I mean, we’re simplifying reality down to Bayesian networks and scenario trees. And it works. It seems that we can say that the universe is Bayesian.
Bayesianism works up to a point. Frequentism works up to a point. Various other things work.
You haven’t shown that frequentism doesn’t work, or that frequentism and bayesianism are mutually exclusive.
Ok I agree
Frequentist and Bayesian reasoning are two ways to handle Knightian uncertainty. Frequentism gives you statements that are outright true in the face of this uncertainty, which is fantastic. But this sets an incredibly high bar that is very difficult to work with.
For a classic example, let’s say you have a possibly biased coin in front of you and you want to say something about its rate of heads. With frequentism, you can lock in a method of obtaining a confidence interval after, say, 100 flips and say: “I’m about to flip this coin 100 times and give you a confidence interval for p_heads. The chance that the interval will contain p_heads is at least 99%, regardless of the true value of p_heads.” There’s no Bayesian analogue.
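One concrete way to get an interval with that kind of guarantee (my choice for illustration, not necessarily the method the commenter had in mind) is Hoeffding’s inequality, whose coverage holds for every value of p_heads. A minimal sketch, with a simulation to spot-check the coverage claim:

```python
import math
import random

def hoeffding_interval(n_heads, n_flips, alpha=0.01):
    """Confidence interval for p_heads with coverage >= 1 - alpha,
    regardless of the true p_heads (by Hoeffding's inequality)."""
    p_hat = n_heads / n_flips
    half_width = math.sqrt(math.log(2 / alpha) / (2 * n_flips))
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Lock in the method first, then flip: simulate many 100-flip experiments
# and check how often the interval captures the true rate.
random.seed(0)
true_p = 0.5  # unknown to the statistician; the guarantee holds for any value
trials = 2000
covered = 0
for _ in range(trials):
    heads = sum(random.random() < true_p for _ in range(100))
    lo, hi = hoeffding_interval(heads, 100)
    if lo <= true_p <= hi:
        covered += 1
print(covered / trials)  # should be at least 0.99 (Hoeffding is conservative)
```

Note that Hoeffding intervals are deliberately wide; exact methods like Clopper–Pearson give tighter intervals with the same guarantee, at the cost of more computation.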
Now let’s say I had a complex network of conditional probability distributions with a bunch of parameters which have Knightian uncertainty. Getting confidence regions will be extremely expensive, and they’ll probably be way too huge to be useful. So we put on a convenient prior and go.
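The “put on a convenient prior and go” move can be sketched with the simplest conjugate case, a Beta prior on a single coin’s heads rate (my illustrative example; the commenter’s setting is a whole network of such parameters):

```python
# Minimal sketch of Bayesian updating, assuming a Beta(a, b) prior on
# p_heads (conjugate to the binomial likelihood, so the update is exact).
def update_beta(a, b, n_heads, n_tails):
    """Posterior after the observed flips is again a Beta distribution."""
    return a + n_heads, b + n_tails

a, b = 1, 1  # uniform prior: a convenient stand-in for ignorance about p_heads
a, b = update_beta(a, b, n_heads=60, n_tails=40)
posterior_mean = a / (a + b)  # (1 + 60) / (1 + 60 + 1 + 40) = 61/102
print(posterior_mean)
```

Unlike the frequentist interval, the output here depends on the prior, which is exactly the arbitrariness the frequentist critique targets; the payoff is that the same cheap update rule composes across the parameters of a large network.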
ETA: Randomized complexity classes also feel fundamentally frequentist.
This isn’t as clear-cut, hence why there is still a debate over frequentism vs. Bayes. There is also a reason frequentism is usually taught much earlier: it is simpler for children to understand. The introduction to probability theory invariably involves the coin flip or the card draw, where it is simple to assert the existence of a true p_success that can be estimated through repeated trials, which the child can physically carry out. This view of probability is usually carried on into life without much challenge, in the absence of formal teaching of Bayesian methods.
So it becomes less clear whether our world is truly “Bayesian” in structure, since Bayesian methodology (the assignment of a prior probability and its subsequent updating) is rarely invoked by the majority of society, whose day-to-day lives rarely require such formalism.
Comparing Bayesian theory with frequentism is like comparing general relativity with Newton’s theory. Both explain reality, and Newton’s theory is even easier to explain to children, although general relativity is more accurate.
Most people intuitively learn frequentism, but some also learn Bayesian methods for the cases they constantly encounter and in which frequentist methods are not accurate enough.
But in any case, frequentist methods are built on Bayesian ones, one way or another. An ordinary person, observing the Bayesian world, comes up with a simplified version of it: frequentist methods. Newton, observing the world of general relativity, comes up with his own theory, in which gravity is a force.
Newtonian gravity is a limiting case of GR (specifically in the low-velocity, weak-gravity regime), which means GR is a strict superset of Newtonian gravity. But it’s not accurate to think of frequentism as a limiting case of Bayesian theory. They have two completely different philosophies, and both require assumptions that are sometimes incompatible with each other. The obvious example is the frequentist critique that the choice of prior is arbitrary (i.e., there being no truly uninformative prior).
I also don’t think you can assert that the world is inherently Bayesian, given that statistical frameworks exist to interpret the world which has inherent uncertainties. It doesn’t make sense to say that there is one true statistical framework under which probabilistic events occur.
I don’t see the difference. The theory of relativity and Newton’s theory also have different philosophies: Newton’s theory states that gravity is a force, that the universe is constant and eternal, etc.
Newton’s theory is not exactly a special case of the theory of relativity, because it is less accurate.