The median minimizes the mean absolute error

Theorem: Let $X_1, \ldots, X_n$ be continuous random variables drawn from a common probability distribution with probability density function $f(x)$ supported on $(-\infty, \infty)$ and with median $m$. Then, $m$ minimizes the mean absolute error:

\[\label{eq:med-mae} m = \operatorname*{arg\,min}_{a \in \mathbb{R}} \mathrm{E}\left[ \lvert X_i - a \rvert \right] \; .\]
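As an informal illustration of the statement (not part of the proof), the following sketch compares the empirical mean absolute error at the sample median, at the sample mean, and on a grid of candidate values $a$. The skewed test distribution (a normal plus an exponential variate), the sample size, the seed and the grid are all arbitrary assumptions chosen for the example.

```python
import random
import statistics


def mean_abs_error(sample, a):
    """Empirical mean absolute error of the constant guess a."""
    return sum(abs(x - a) for x in sample) / len(sample)


random.seed(0)  # fixed seed so the run is reproducible

# Assumed test distribution: normal + exponential noise.
# It is skewed and supported on the whole real line, so mean and median differ.
sample = [random.gauss(0.0, 1.0) + random.expovariate(1.0) for _ in range(1001)]

med = statistics.median(sample)
mean = statistics.fmean(sample)

# Candidate values of a on a grid from -5 to 8 in steps of 0.01.
grid = [-5.0 + 0.01 * i for i in range(1301)]
best = min(grid, key=lambda a: mean_abs_error(sample, a))

print("sample median:", med)
print("best grid value:", best)
print("MAE at median:", mean_abs_error(sample, med))
print("MAE at mean:", mean_abs_error(sample, mean))
```

The best grid value should land next to the sample median, and the error at the median should not exceed the error at the mean.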

Proof: We can find the optimum by performing a derivative test. First, since the absolute value function is not differentiable at 0, we simplify the objective function by splitting it into two separate integrals:

\[\label{eq:med-mae-s1} \mathrm{E}\left[ \lvert X_i - a \rvert \right] = \int_{-\infty}^a (a - x) f(x) \, \mathrm{d}x + \int_{a}^\infty (x - a) f(x) \, \mathrm{d}x \; .\]

Now note that $\lvert\frac{\partial}{\partial a}(a - x)f(x)\rvert = \lvert\frac{\partial}{\partial a}(x - a)f(x)\rvert = f(x)$. The integrals of these partial derivatives are therefore $\int_{-\infty}^a f(x) \, \mathrm{d}x = P(X_i < a)$ and $\int_{a}^\infty f(x) \, \mathrm{d}x = P(X_i > a)$, both of which must be finite by the axioms of probability. Therefore, these integrals meet the conditions for application of Leibniz’s rule.

Applying Leibniz’s integral rule, we can differentiate the objective function as follows:

\[\label{eq:med-mae-s2} \begin{split} &\frac{\partial}{\partial a} \left( \int_{-\infty}^a (a - x) f(x) \, \mathrm{d}x + \int_{a}^\infty (x - a) f(x) \, \mathrm{d}x \right) \\ =\; &(a - x) f(x) \Big|_{x=a} + \int_{-\infty}^a f(x) \, \mathrm{d}x - (x - a) f(x) \Big|_{x=a} - \int_{a}^\infty f(x) \, \mathrm{d}x \; . \end{split}\]

The non-integral terms both vanish at $x = a$. Setting the remaining derivative to 0, it must be true that

\[\label{eq:dmed-da} \int_{-\infty}^a f(x) \, \mathrm{d}x - \int_{a}^\infty f(x) \, \mathrm{d}x = 0 \quad \Rightarrow \quad P(X_i < a) = P(X_i > a) \; .\]

Together with the probability of the complement, this yields the implication

\[\label{eq:med-mae-qed} P(X_i < a) = P(X_i > a) \quad \Rightarrow \quad P(X_i < a) = 1 - P(X_i < a) \quad \Rightarrow \quad P(X_i < a) = 0.5 \; .\]

As a result, the critical point $a$ of the objective function satisfies the definition of a median.
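The derivative derived above, $P(X_i < a) - P(X_i > a)$, can be checked numerically in a special case. The sketch below assumes that $X_i$ is standard normal, which is an assumption made only for this example. It uses the standard closed form $\mathrm{E}\left[ \lvert X_i - a \rvert \right] = a \, (2\Phi(a) - 1) + 2\varphi(a)$ for that distribution, where $\Phi$ and $\varphi$ denote the standard normal distribution and density functions, and compares a central finite difference with the expression from the proof.

```python
import math


def pdf(x):
    """Standard normal density (assumed example distribution)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function, P(X < x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mae(a):
    """Closed form of E|X - a| for X ~ N(0, 1)."""
    return a * (2.0 * cdf(a) - 1.0) + 2.0 * pdf(a)


def derivative_from_proof(a):
    """P(X < a) - P(X > a), the derivative obtained via Leibniz's rule."""
    return cdf(a) - (1.0 - cdf(a))


h = 1e-5  # step size for the central finite difference
points = (-1.5, -0.5, 0.0, 0.5, 1.5)
for a in points:
    numeric = (mae(a + h) - mae(a - h)) / (2.0 * h)
    print(a, numeric, derivative_from_proof(a))
```

In this example, the two columns should agree closely, and the derivative should change sign at $a = 0$, the median of the standard normal distribution.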

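Finally, the absolute value is a convex function, so $\lvert x - a \rvert$ is convex in $a$ for every fixed $x$. Because the expected value is linear and monotone, it preserves this convexity, so $\mathrm{E}\left[ \lvert X_i - a \rvert \right]$ is also convex in $a$. Every critical point of a differentiable convex function is a global minimum, and the median is the sole critical point. Therefore, the median must minimize the mean absolute error, completing the proof.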

∎
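The convexity used in the last step of the proof can be illustrated empirically. The following sketch checks the midpoint inequality $g\left(\tfrac{a+b}{2}\right) \leq \tfrac{1}{2}\left(g(a) + g(b)\right)$ for the empirical mean absolute error $g$ on randomly chosen pairs $(a, b)$. The normal sample, the seed, the interval for $a$ and $b$ and the rounding tolerance are assumptions made for the example, and a finite number of checks is only an illustration, not a proof of convexity.

```python
import random


def mean_abs_error(sample, a):
    """Empirical mean absolute error of the constant guess a."""
    return sum(abs(x - a) for x in sample) / len(sample)


random.seed(1)  # fixed seed so the run is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(501)]

violations = 0
for _ in range(200):
    a = random.uniform(-4.0, 4.0)
    b = random.uniform(-4.0, 4.0)
    at_midpoint = mean_abs_error(sample, 0.5 * (a + b))
    average = 0.5 * (mean_abs_error(sample, a) + mean_abs_error(sample, b))
    # Small tolerance to absorb floating-point rounding.
    if at_midpoint > average + 1e-12:
        violations += 1

print("midpoint convexity violations:", violations)
```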

Sources:

Wikipedia (2024): "Derivative test"; in: Wikipedia, the free encyclopedia, retrieved on 2024-09-23; URL: https://en.wikipedia.org/wiki/Derivative_test.
Wikipedia (2024): "Leibniz integral rule"; in: Wikipedia, the free encyclopedia, retrieved on 2024-09-23; URL: https://en.wikipedia.org/wiki/Leibniz_integral_rule.
Wikipedia (2024): "Jensen's inequality"; in: Wikipedia, the free encyclopedia, retrieved on 2024-09-23; URL: https://en.wikipedia.org/wiki/Jensen%27s_inequality.
Wikipedia (2024): "Convex function"; in: Wikipedia, the free encyclopedia, retrieved on 2024-09-23; URL: https://en.wikipedia.org/wiki/Convex_function.

Metadata: ID: P471 | shortcut: med-mae | author: salbalkus | date: 2024-09-23, 23:30.