A geometric intuition I came up with while reading:
Take a number line, and put 1, 2, and 4 on it.
-1-2---4-
You’re moving a pointer along this line, and trying to minimize its total distance to the data points:
-1-2---4-
....^
Intuitively, throwing it somewhere near the middle of the line makes sense. But drop 2 out, and look what happens as you move it:
-1-----4-
....^
.|--|
....|--|
vs.
-1-----4-
......^
.|----|
......||
The distance is the same either way! (Specifically, it’s the same for any point in between 1 and 4.)
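A quick numeric check of that claim, as a minimal Python sketch. The helper name `total_distance` and the sample pointer positions are my own picks for illustration:

```python
def total_distance(pointer, points):
    """Sum of absolute distances from the pointer to every data point."""
    return sum(abs(pointer - p) for p in points)

# With only 1 and 4 on the line, any pointer position between them
# gives the same total: (pointer - 1) + (4 - pointer) = 3.
outer = [1, 4]
for pointer in [1, 1.5, 2, 2.5, 3, 4]:
    print(pointer, total_distance(pointer, outer))

# Outside the pair the total grows: at 5 it is 4 + 1 = 5.
print(5, total_distance(5, outer))
```

Every position in the loop prints a total of 3, and the total only goes up once the pointer leaves the 1-to-4 range.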
That means only the distance to 2 changes as we move the pointer between 1 and 4, so the best answer is to get a distance-from-2 of 0 by putting it directly on 2.
To generalize this to medians of larger data sets, imagine adding more pairs of points on the outside of the range, one on each side. As long as the pointer stays between them, the total distance to each new pair stays the same, just as it was for 1 and 4.
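To see the generalization on a slightly bigger example, here is a rough sketch that brute-forces the best pointer position and compares it with the median. The data values, the 0.1 grid, and the helper name `total_distance` are all made up for illustration; only `statistics.median` comes from the standard library:

```python
from statistics import median

def total_distance(pointer, points):
    """Sum of absolute distances from the pointer to every data point."""
    return sum(abs(pointer - p) for p in points)

# Illustrative data: 2 in the middle, then pairs added around it.
# An odd count means the median is a single point.
data = [1, 2, 4, 7, 11]

# Try pointer positions 0.0, 0.1, ..., 12.0 and keep the one with the
# smallest total distance.
candidates = [x / 10 for x in range(0, 121)]
best = min(candidates, key=lambda x: total_distance(x, data))

print("best pointer:", best)
print("median:      ", median(data))
```

With an odd number of points the brute-force winner lands exactly on the median. With an even number, every position between the two middle points ties, which is the 1-and-4 situation again.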
[edit: formatting came out a bit ugly—monospace sections seem to eat multiple spaces when displayed but not in the editor for some reason?]