I too spent a few years with a similar desire to understand probability and statistics at a deeper level, but we might have been stuck on different things. Here’s an explanation:
Suppose you have 37 numbers. Purchase a massless ruler and 37 identical weights. For each of your numbers, find the number on the ruler and glue a weight there. You now have a massless ruler with 37 weights glued onto it.
Now try to balance the ruler sideways on a spike sticking out of the ground. The mean of your numbers will be the point on the ruler where it balances.
Now spin the ruler on the spike. It’s easy to speed up or slow down the spinning ruler if the weights are close together, but more force is required if the weights are far apart. The variance of your numbers is proportional to the amount the ruler resists changes to its angular velocity—how hard you have to twist the ruler to make it spin, or to make it stop spinning.
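The ruler picture can be checked numerically. The sketch below is only an illustration: the five numbers and the function names `balance_point` and `moment_of_inertia` are made up here, and each weight is assumed to have mass 1. It computes where the ruler balances and how strongly it resists spinning about that point, then compares those with the mean and with n times the population variance.

```python
from statistics import fmean, pvariance

def balance_point(positions, mass=1.0):
    """Center of mass of identical weights glued at `positions`."""
    total_mass = mass * len(positions)
    return sum(mass * x for x in positions) / total_mass

def moment_of_inertia(positions, pivot, mass=1.0):
    """Resistance to spinning about `pivot`: sum of mass * distance squared."""
    return sum(mass * (x - pivot) ** 2 for x in positions)

# Hypothetical data; the comment uses 37 numbers, 5 keeps the output readable.
numbers = [2.0, 3.0, 3.0, 5.0, 7.0]

pivot = balance_point(numbers)
inertia = moment_of_inertia(numbers, pivot)

# The balance point should agree with the mean.
print(pivot, fmean(numbers))
# The moment of inertia should agree with n * (population variance),
# because every weight has mass 1 here.
print(inertia, len(numbers) * pvariance(numbers))
```

With weights of mass m, the moment of inertia about the balance point is n * m * variance, so the constant of proportionality depends on how many weights there are.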
“I’d like to understand this more deeply” is a thought that occurs to people at many levels of study, so this explanation could be too high or low. Where did my comment hit?
How does that answer the question? It’s true that the center of gravity is a mean, but the moment of inertia is not a variance. It’s one thing to say something is “proportional to a variance” to mean that the constant is 2 or pi, but when the constant is the number of points, I think it’s missing the statistical point.
But the bigger problem is that these are not statistical examples! Means and sums of squares occur in many places, but why are they a good choice for the central tendency and the tendency to be central? Are you suggesting that we think of a random variable as a physical rod? Why? Does trying to spin it have any probabilistic or statistical meaning?
I wasn’t aiming to answer Locaha’s question as much as figure out what question to answer. The range of math knowledge here is wide, and I don’t know where Locaha stands. I mean,
But why [is the mean calculated as] sum/n?
That could be a basic question about the meaning of averages—the sort of knowledge I internalized so deeply that I have trouble forming it into words.
But maybe Locaha’s asking a question like:
Why is an unbiased estimator of population mean a sum/n, but an unbiased estimator of population variance a sum/(n-1)?
That’s a less philosophical question. So if Locaha says “Means are like centers of mass! I never understood that intuition until now!”, I’ll have a different follow-up than if Locaha says “Yes, Captain Obvious, of course means are like centers of mass. I’m asking about XYZ”.
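For the sum/n versus sum/(n-1) version of the question, a small exact enumeration may help. This is a sketch under stated assumptions, not a proof: the three-element population is invented, samples are drawn independently with replacement, and the helper names `mean` and `sum_sq_dev` are hypothetical. It averages each estimator over every possible sample of size n and compares the result with the true population value.

```python
from fractions import Fraction
from itertools import product

def mean(xs):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(xs), len(xs))

def sum_sq_dev(xs):
    """Sum of squared deviations from the mean of xs."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

population = [1, 2, 6]   # hypothetical toy population
n = 2                    # sample size, drawn with replacement

pop_mean = mean(population)
pop_var = sum_sq_dev(population) / len(population)

# Every ordered sample of size n is equally likely under this sampling scheme.
samples = list(product(population, repeat=n))

avg_of_means = sum(mean(s) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print("population mean:", pop_mean, "average sample mean:", avg_of_means)
print("population variance:", pop_var)
print("average of sum/n estimator:", avg_div_n)
print("average of sum/(n-1) estimator:", avg_div_n_minus_1)
```

The deviations are measured from the sample's own mean, which sits closer to the sample than the population mean does, so dividing by n comes out too small on average by a factor of (n-1)/n.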
Moments of mass in physics are a good intro to moments in stats for people who like to visualize or “feel out” concepts concretely. Good post!
A different level explanation, which may or may not be helpful:
Read up on affine space, convex combinations, and maybe this article about torsors.
If you are frustrated with hand-waving in calculus, read a Real Analysis textbook. The magic words that explain how the heck you can have probability distributions over real numbers are measure theory.
Mean and variance are closely related to center of mass and moment of inertia. This is good intuition to have, and it’s statistical. The only difference is that the first two are moments of a probability distribution, and the second two are moments of a mass distribution.
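One way to see the "same moments, different distribution" claim is to run a single function on masses and on probabilities. The sketch below is illustrative only: the positions, masses, and the function name `moments` are invented for this example. Because the weights are normalized inside the function, the mass version returns the center of mass and the moment of inertia per unit of total mass.

```python
def moments(points, weights):
    """First moment and second central moment of weighted points.

    Weights are divided by their total, so masses and probabilities
    are handled by the same arithmetic.
    """
    total = sum(weights)
    first = sum(w * x for x, w in zip(points, weights)) / total
    second_central = sum(
        w * (x - first) ** 2 for x, w in zip(points, weights)
    ) / total
    return first, second_central

positions = [0.0, 1.0, 4.0]
masses = [2.0, 1.0, 1.0]             # hypothetical masses
probabilities = [0.5, 0.25, 0.25]    # the same masses divided by their total

center_of_mass, inertia_per_unit_mass = moments(positions, masses)
mean, variance = moments(positions, probabilities)

print(center_of_mass, mean)
print(inertia_per_unit_mass, variance)
```

Multiplying the second value by the total mass recovers the ordinary moment of inertia about the center of mass.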
Using the word “distribution” doesn’t make it statistical.
Telegraph to a younger me:
If you are frustrated with explanations in calculus, read a Real Analysis textbook. And the magic words that explain how the heck you can have probability distributions over real numbers are measure theory.