This is a good point. I think squared errors are often used because they are never negative and are also analytic: you can take derivatives and get smooth functions. But for many problems they are not especially appropriate.
Informally problems are often posed with an absolute-value error function. Like the square root, this has a cusp at zero and so will “hold water”. If some people miss too high and others miss too low, then it also makes sense here to switch to the average. If everyone misses on the same side, then switching to the average doesn’t help but doesn’t hurt either. So in general it is a good strategy.
The other day I mentioned one example of the good performance of the average in “guessing beans in a jar” type problems. In that case the average came out 3rd best compared to the guesses from a class of 73 students. This implicitly uses an absolute-value error function, and the problem was such that people missed on both sides. Jensen’s Inequality shows why averages work well in such problems: the absolute value is a convex function, so the error of the average guess can never exceed the average of the individual errors.
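To make the Jensen argument concrete, here is a minimal sketch in Python. The guesses and the true value are made-up numbers chosen for illustration, not the data from the class of 73 students, and the function names are my own. It compares the error of the average guess with the average of the individual errors under absolute-value error.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def error_of_average(guesses, truth):
    """Absolute error you get if you switch to the average guess."""
    return abs(mean(guesses) - truth)


def average_error(guesses, truth):
    """Average absolute error over the individual guesses."""
    return mean([abs(g - truth) for g in guesses])


# Hypothetical numbers, for illustration only.
truth = 100
two_sided = [70, 90, 120, 140]    # some guess too low, some too high
one_sided = [110, 120, 130, 140]  # everyone guesses too high

for name, guesses in [("two-sided", two_sided), ("one-sided", one_sided)]:
    print(name, error_of_average(guesses, truth), average_error(guesses, truth))
```

With misses on both sides, the average guess has a smaller error than the typical individual guess. With all misses on one side, the two numbers coincide, which matches the "doesn't help but doesn't hurt" case.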
Informally problems are often posed with an absolute-value error function. Like the square root, this has a cusp at zero
abs(x) has a corner, not a cusp, at zero. For a cusp, the derivative approaches +infinity from one side and -infinity from the other; for a corner, the derivative is undefined at the point but approaches a finite value from at least one of the sides.
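Written out, the distinction looks like this. This is a sketch that assumes "the square root" in the quoted comment means the error function $\sqrt{|x|}$, which is my reading rather than something stated there.

```latex
\[
\frac{d}{dx}\,|x| = \mathrm{sgn}(x) \quad (x \neq 0),
\qquad
\lim_{x \to 0^-} \mathrm{sgn}(x) = -1,
\quad
\lim_{x \to 0^+} \mathrm{sgn}(x) = +1
\]
\[
\frac{d}{dx}\,\sqrt{|x|} = \frac{\mathrm{sgn}(x)}{2\sqrt{|x|}} \quad (x \neq 0),
\qquad
\lim_{x \to 0^-} \frac{\mathrm{sgn}(x)}{2\sqrt{|x|}} = -\infty,
\quad
\lim_{x \to 0^+} \frac{\mathrm{sgn}(x)}{2\sqrt{|x|}} = +\infty
\]
```

The one-sided limits for abs(x) are finite and unequal (a corner), while those for the square root of abs(x) are infinite with opposite signs (a cusp).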