There are many ways to trade off between the mean and the median (or, more generally, between an efficient but not robust estimator and a robust but not efficient estimator). The real question is how you decide what a good tradeoff is.
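To make "a variety of ways" concrete, here is a minimal sketch of one standard family, the trimmed mean. It is not necessarily the algorithm under discussion, just a familiar example. The function name `trimmed_mean`, the parameter `trim_fraction`, and the sample data are all made up for illustration.

```python
import statistics

def trimmed_mean(values, trim_fraction):
    """Mean after dropping trim_fraction of the points from each tail.

    trim_fraction = 0 gives the ordinary mean; as it approaches 0.5
    the result approaches the median. (Hypothetical helper, for illustration.)
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of points dropped per tail
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)

data = [1, 2, 3, 4, 100]  # made-up sample with one outlier
for fraction in (0.0, 0.2, 0.4):
    print(fraction, trimmed_mean(data, fraction))
```

The single knob `trim_fraction` is exactly the kind of hyperparameter that has no obviously right setting until you say what you want the summary to do.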
Basically, if your mean and your median are different, your distribution is asymmetric. If you want a single-point summary of the entire distribution, you need to decide how to deal with that asymmetry. Until you specify some criterion under which you’ll be optimizing your single-point summary, you can’t really talk about what’s better and what’s worse.
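One way to see why the criterion matters: the mean is the point that minimizes total squared error, and the median is the point that minimizes total absolute error. The rough sketch below checks this by brute force over a grid of candidates. The helper `best_summary`, the grid, and the data are assumptions chosen for illustration, not anything from the original post.

```python
def best_summary(values, loss, candidates):
    """Return the candidate point with the smallest total loss over values."""
    return min(candidates, key=lambda c: sum(loss(v - c) for v in values))

data = [1, 2, 3, 4, 100]                  # made-up asymmetric sample
grid = [i / 10 for i in range(0, 1001)]   # candidate summaries 0.0 .. 100.0

squared_choice = best_summary(data, lambda r: r * r, grid)  # squared-error criterion
absolute_choice = best_summary(data, abs, grid)             # absolute-error criterion

print(squared_choice)   # 22.0, the mean of data
print(absolute_choice)  # 3.0, the median of data
```

Same data, two defensible criteria, two very different "best" summaries. Any tradeoff between them amounts to picking some other loss in between.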
This is just one of many possible algorithms that trade off between the median and the mean. Unfortunately, there is no objective way to determine which one is best (or how to set the hyperparameter).
The criterion we are optimizing is just “how closely does it match the behavior we actually want.”
EDIT: Stuart Armstrong’s idea is much better: http://lesswrong.com/r/discussion/lw/mqk/mean_of_quantiles/
And what is “the behavior we actually want”?