A1987dM comments on Rationality Quotes September 2012

A1987dM 29 Sep 2012 9:57 UTC
14 points

For a hundred years or so, mathematical statisticians have been in love with the fact that the probability distribution of the sum of a very large number of very small random deviations almost always converges to a normal distribution. … This infatuation tended to focus interest away from the fact that, for real data, the normal distribution is often rather poorly realized, if it is realized at all. We are often taught, rather casually, that, on average, measurements will fall within ±σ of the true value 68% of the time, within ±2σ 95% of the time, and within ±3σ 99.7% of the time. Extending this, one would expect a measurement to be off by ±20σ only one time out of 2 × 10^88. We all know that “glitches” are much more likely than that!

-- W.H. Press et al., Numerical Recipes, Sec. 15.1
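
The percentages in the quote can be checked directly from the normal distribution's error function. This is only a minimal sketch using Python's standard library; the helper names `prob_within` and `prob_beyond` are made up for illustration and do not come from Numerical Recipes.

```python
import math

def prob_within(k: float) -> float:
    """P(|Z| <= k) for a standard normal Z."""
    return math.erf(k / math.sqrt(2))

def prob_beyond(k: float) -> float:
    """P(|Z| > k); erfc avoids the cancellation in 1 - erf for large k."""
    return math.erfc(k / math.sqrt(2))

# The three textbook coverage figures from the quote.
for k in (1, 2, 3):
    print(f"within ±{k}σ: {prob_within(k):.4f}")

# The 20σ extrapolation that the quote says real data never honors.
tail_20 = prob_beyond(20)
print(f"beyond ±20σ: {tail_20:.2e}, about one in {1 / tail_20:.1e}")
```

Using `erfc` rather than `1 - erf` matters here: at 20σ the tail is far smaller than double-precision rounding error near 1, so the subtraction would return exactly zero.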
- ThirdOrderScientist 4 Oct 2012 18:23 UTC
  0 points
  I don’t think it’s fair to blame the mathematical statisticians. Any mathematical statistician worth his/her salt knows that the Central Limit Theorem applies to the sample mean of a collection of independent and identically distributed random variables, not to the random variables themselves. This, and the fact that the t-statistic converges in distribution to the normal distribution as the sample size increases, is the reason we apply any of this normal theory at all.
  
  Press’s comment applies more to those who use the statistics blindly, without understanding the underlying theory. Which, admittedly, can be blamed on those same mathematical statisticians who are teaching this very deep theory to undergraduates in an intro statistics class with a lot of (necessary at that level) hand-waving. If the statistics user doesn’t understand that a random variable is a measurable function from its sample space to the real line, then he/she is unlikely to appreciate the finer points of the Central Limit Theorem. But that’s because mathematical statistics is hard (i.e. requires non-trivial amounts of work to really grasp), not because the mathematical statisticians have done a disservice to science.
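
The distinction drawn in the reply, that the Central Limit Theorem is about the sample mean and not about the individual random variables, can be illustrated with a small simulation. This is a rough sketch under assumptions of my own: the exponential distribution (mean 1, standard deviation 1), the sample size `N`, the trial counts, and the seed are arbitrary choices for illustration, not anything from the thread.

```python
import random

random.seed(0)  # arbitrary seed, only so the run is repeatable

def frac_beyond_3_sigma(values, mean, sd):
    """Fraction of values more than 3 standard deviations from the mean."""
    return sum(abs(v - mean) > 3 * sd for v in values) / len(values)

N = 100        # observations averaged into each sample mean (assumed)
TRIALS = 2000  # number of sample means to draw (assumed)

# Individual draws from Exponential(1): skewed, not normal at all.
raw = [random.expovariate(1.0) for _ in range(20000)]

# Sample means of N i.i.d. draws: this is what the CLT speaks about.
means = [sum(random.expovariate(1.0) for _ in range(N)) / N
         for _ in range(TRIALS)]

# Exponential(1) has mean 1 and sd 1, so a mean of N draws has sd 1/sqrt(N).
raw_frac = frac_beyond_3_sigma(raw, 1.0, 1.0)
mean_frac = frac_beyond_3_sigma(means, 1.0, 1.0 / N ** 0.5)

print(f"raw draws beyond 3σ:    {raw_frac:.4f}")
print(f"sample means beyond 3σ: {mean_frac:.4f}")
print("normal theory predicts: 0.0027")
```

The raw draws should land outside ±3σ several times more often than the normal figure of 0.27% suggests, while the sample means should sit much closer to it. That gap is the difference between applying normal theory to a mean and applying it to the measurements themselves.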