Good point about the central limit theorem. Two nitpicks, though.
By the central limit theorem, if you add up a sufficiently large number of independent and identically distributed random variables, the distribution you get is well-approximated by a distribution that depends only on mean and variance (or any other measure of spreadout-ness).
The “or any other measure of spreadout-ness” can be dropped here; viewing the normal distribution through the lens of either the principle of maximum entropy or sufficient statistics tells us that it is variance specifically which is relevant, and any spread-metric not isomorphic to variance will be a leaky abstraction. (Leaky meaning that it will not capture all the relevant information about the spread, whereas variance does capture all the information, in a formal sense: it’s a sufficient statistic.)
But if I were to define a measure of how spread out a distribution is as E[|X-m|] for some m, I would use m=median rather than m=mean. This is because m=median minimizes this expected absolute value (in fact, median can be defined this way)...
I don’t think this is right. Suppose I have a uniform distribution over a finite set of X-values. The value of m minimizing E[|X-m|] should change if I decrease the minimum X-value a lot, while leaving everything else constant, but the median would stay the same.
I think the measure which would produce median is E[1 − 2 I[X>m]], where I[.] is an indicator function?
The “or any other measure of spreadout-ness” can be dropped here
What I meant is that, if you restrict attention to normal distributions with a fixed mean, then any reasonable measure of how spread out the distribution is (including any of the E[|X-mean|^p]) will be a sufficient statistic. To be reasonable, any such measure must increase as variance increases (for normal distributions), so the function can be inverted to recover the variance. In other words, any other such measure will indeed be isomorphic to variance when restricted to normal distributions.
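To make the inversion concrete, here is a rough sketch for the p-th absolute central moment. It relies on the standard closed form for a normal distribution, E[|X-mean|^p] = sigma^p * 2^(p/2) * Gamma((p+1)/2) / sqrt(pi). The function names are made up for illustration.

```python
import math

def abs_central_moment_normal(sigma, p):
    """E[|X - mean|^p] for X ~ Normal(mean, sigma^2); the value does not depend on mean."""
    return sigma ** p * 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)

def variance_from_moment(moment, p):
    """Invert the map above: recover sigma^2 from the p-th absolute central moment."""
    unit = abs_central_moment_normal(1.0, p)  # same moment for the standard normal
    return (moment / unit) ** (2 / p)

# Round trip for several p: each spread measure pins down the variance.
for p in (0.5, 1, 2, 3):
    m = abs_central_moment_normal(1.7, p)
    print(p, m, variance_from_moment(m, p))
```

Each printed line should recover a variance of about 1.7^2 = 2.89, up to floating-point error. The round trip only works because the family is restricted to normal distributions with a known mean.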
The value of m minimizing E[|X-m|] should change if I decrease the minimum X-value a lot, while leaving everything else constant
This does not change the minimizer of E[|X-m|]: lowering the minimum X-value increases E[|X-m|] by the same amount for every m at or above the original minimum, and the median lies in that range.
In general, you can’t decrease E[|X-m|] by moving m from median to median-d for d>0 because, for X≥median (half the distribution), you increase |X-m| by d, and for the other half, you decrease |X-m| by at most d.
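A quick numerical check of this, as a sketch: take a uniform distribution over five points, drag the minimum far to the left, and search a grid of candidate m values in both cases. The data, the grid, and the helper names are arbitrary choices for illustration.

```python
def mean_abs_dev(xs, m):
    """E[|X - m|] for X uniform over the finite list xs."""
    return sum(abs(x - m) for x in xs) / len(xs)

def best_m(xs, candidates):
    """Candidate m with the smallest E[|X - m|] (first one on ties)."""
    return min(candidates, key=lambda m: mean_abs_dev(xs, m))

original = [1, 2, 3, 4, 5]       # median is 3
shifted = [-1000, 2, 3, 4, 5]    # minimum decreased a lot, median is still 3

# Candidate values of m from -2000.0 to 10.0 in steps of 0.1.
grid = [i / 10 for i in range(-20000, 101)]

print(best_m(original, grid))  # 3.0
print(best_m(shifted, grid))   # 3.0

# For m at or above the old minimum, the two curves differ by a constant,
# (1 - (-1000)) / 5 = 200.2, so the minimizer cannot move.
for m in (1, 2.5, 3, 7):
    print(m, mean_abs_dev(shifted, m) - mean_abs_dev(original, m))
```

With an even number of points the minimizer is a whole interval between the two middle values rather than a single point, but the same argument applies.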
I don’t agree with the argument about the variance:
“Any other such measure will indeed be isomorphic to variance when restricted to normal distributions.”
It’s true, but you should not restrict to normal distributions in this context. It is possible to find distributions X1 and X2 with different variances but the same value of E[|X-mean|^p] for some p≠2. Then X1 and X2 look the same to this p-variance, but their normalized sample averages converge to different normal distributions. Hence variance is indeed the right and only measure of spreadout-ness to consider when applying the central limit theorem.
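One possible pair for p=1, worked out with exact fractions. The two distributions below are just an illustrative choice, not the only one: both have mean 0 and E[|X-mean|] = 1, but their variances are 1 and 2.

```python
from fractions import Fraction as F

def moments(dist):
    """Return (mean, variance, E|X - mean|) for a finite distribution {value: probability}."""
    mean = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mean) ** 2 for x, p in dist.items())
    mad = sum(p * abs(x - mean) for x, p in dist.items())
    return mean, var, mad

x1 = {-1: F(1, 2), 1: F(1, 2)}               # fair coin on {-1, +1}
x2 = {-2: F(1, 4), 0: F(1, 2), 2: F(1, 4)}   # more mass at 0, wider tails

print(moments(x1))  # mean 0, variance 1, E|X - mean| = 1
print(moments(x2))  # mean 0, variance 2, E|X - mean| = 1
```

By the central limit theorem, sqrt(n) times the sample mean tends to N(0, 1) for the first distribution and N(0, 2) for the second, even though the p=1 spread measure cannot tell them apart.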
That’s exactly what I was trying to say, not a disagreement with it. The only step where I claimed all reasonable ways of measuring spreadout-ness agree was on the result you get after summing up a large number of iid random variables, not the random variables that were being summed up.
Ah, these make sense. Thanks.