I don’t think maximizing the sum of the negative exponents gets you the mode. If you use $0^{-p} = 0$ then the supremum (infinity) is not attained, while if you use $0^{-p} = \infty$ then the maximum (infinity) is attained at any data point. If you do it with a continuous distribution you get more sensible answers, but the solution (which is intuitively the “point of greatest concentration”) is not necessarily unique.
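It’s worth mentioning that when $p > 1$ the p-mean is unique: this is because $m \mapsto |x - m|^p$ is a strictly convex function, a sum of strictly convex functions is strictly convex, and a strictly convex function has at most one minimizer (one exists here because the sum grows without bound as $|m| \to \infty$).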
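A small numerical sketch of the uniqueness point, in case it helps. The helper names `p_cost` and `p_mean` and the sample data are made up for illustration, and the ternary search is just one convenient way to minimize a convex function on an interval, not anything canonical.

```python
def p_cost(m, data, p):
    """Sum of |x - m|^p over the data points."""
    return sum(abs(x - m) ** p for x in data)


def p_mean(data, p, tol=1e-9):
    """Approximate minimizer of p_cost by ternary search.

    Assumes p > 1, so the cost is strictly convex in m and the
    minimizer is unique and lies between min(data) and max(data).
    """
    lo, hi = min(data), max(data)
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if p_cost(a, data, p) < p_cost(b, data, p):
            hi = b  # the minimizer cannot lie to the right of b
        else:
            lo = a  # the minimizer cannot lie to the left of a
    return (lo + hi) / 2


data = [1, 2, 6, 7]  # illustrative data, symmetric about 4

# p = 2: the unique minimizer is the arithmetic mean (4 for this data).
print(p_mean(data, 2))

# p = 1: the cost is convex but not strictly convex, so every point
# between the two middle values is a minimizer.
print(p_cost(2.0, data, 1), p_cost(4.0, data, 1), p_cost(6.0, data, 1))
# prints 10.0 10.0 10.0
```

With $p = 1$ the whole interval $[2, 6]$ ties, which is exactly the non-uniqueness that strict convexity rules out for $p > 1$.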
I’m using $0^{-p} = \infty$ and using the cheaty convention that e.g. $3 \cdot \infty > 2 \cdot \infty$. I think this is what you get if you regard a discrete distribution as a limit of continuous ones. If this is too cheaty, of course it’s fine just to stick with non-negative values of $p$.
Yeah, OK. It works but you need to make sure to take the limit in a particular way, e.g. convolution with a sequence of approximations to the identity. Also you need to assume that $p > -1$, since otherwise the statistic diverges even for continuous distributions.
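A rough sketch of where the threshold $p > -1$ comes from, under the added assumption (mine, not stated above) that the density $f$ is at least some constant $c > 0$ on a small interval $[m - \delta, m + \delta]$ around the candidate point $m$:

$$
\int_{m-\delta}^{m+\delta} |x - m|^{p} f(x)\,dx \;\ge\; 2c \int_{0}^{\delta} t^{p}\,dt \;=\;
\begin{cases}
\dfrac{2c\,\delta^{p+1}}{p+1}, & p > -1, \\[2ex]
\infty, & p \le -1.
\end{cases}
$$

So for $p \le -1$ the statistic is infinite at every point where the density is bounded away from zero, and it can no longer tell such points apart. For $p > -1$ the singularity at $x = m$ is integrable, so this obstruction disappears.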