Index: The Book of Statistical Proofs ▷ General Theorems ▷ Probability theory ▷ Measures of central tendency ▷ Median minimizes mean absolute error

Theorem: Let $X_1, \ldots, X_n$ be continuous random variables drawn from a common probability distribution with probability density function $f(x)$, supported on $(-\infty, \infty)$, and median $m$. Then, $m$ minimizes the mean absolute error:

\[\label{eq:med-mae} m = \operatorname*{arg\,min}_{a \in \mathbb{R}} \mathrm{E}\left[ \lvert X_i - a \rvert \right] \; .\]
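As an informal illustration of the statement (not part of the proof), the following minimal sketch compares the empirical mean absolute error at the sample median with its value at nearby candidate centers. The normal distribution, its parameters, the sample size and the helper name `mean_absolute_error` are illustrative assumptions, not taken from the theorem.

```python
import random
import statistics

def mean_absolute_error(sample, a):
    """Empirical counterpart of E|X - a| for a fixed sample."""
    return sum(abs(x - a) for x in sample) / len(sample)

random.seed(0)  # fixed seed so the sketch is reproducible
# Assumed example distribution: normal with median 2.0
sample = [random.gauss(2.0, 1.5) for _ in range(1001)]

sample_median = statistics.median(sample)
mae_at_median = mean_absolute_error(sample, sample_median)

# Candidate centers: a grid around the sample median, plus the sample mean
candidates = [sample_median + 0.05 * k for k in range(-40, 41)]
candidates.append(statistics.mean(sample))
best = min(candidates, key=lambda a: mean_absolute_error(sample, a))

print(f"sample median: {sample_median:.4f}, MAE there: {mae_at_median:.4f}")
print(f"best candidate: {best:.4f}")
```

On a finite sample this only checks the empirical analogue of the claim; the proof below addresses the population quantity.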

Proof: We can find the optimum by performing a derivative test. First, since an absolute value function is not differentiable at 0, we simplify the objective function by splitting it into two separate integrals:

\[\label{eq:med-mae-s1} \mathrm{E}\left[ \lvert X_i - a \rvert \right] = \int_{-\infty}^a (a - x) f(x) \, \mathrm{d}x + \int_{a}^\infty (x - a) f(x) \, \mathrm{d}x \; .\]

Now note that $\lvert\frac{\partial}{\partial a}(a - x)f(x)\rvert = \lvert\frac{\partial}{\partial a}(x - a)f(x)\rvert = f(x)$. The integrals of this bound, $\int_{-\infty}^a f(x) \, \mathrm{d}x = P(X_i < a)$ and $\int_{a}^\infty f(x) \, \mathrm{d}x = P(X_i > a)$, are both finite by the axioms of probability. Therefore, the partial derivatives of both integrands are dominated by an integrable function, so these integrals meet the conditions for application of Leibniz’s rule.

Applying Leibniz’s integral rule, we can differentiate the objective function as follows:

\[\label{eq:med-mae-s2} \begin{split} &\frac{\partial}{\partial a} \left( \int_{-\infty}^a (a - x) f(x) \, \mathrm{d}x + \int_{a}^\infty (x - a) f(x) \, \mathrm{d}x \right) \\ =\; &(a - x) f(x) \Big|_{x=a} + \int_{-\infty}^a f(x) \, \mathrm{d}x - (x - a) f(x) \Big|_{x=a} - \int_{a}^\infty f(x) \, \mathrm{d}x \; . \end{split}\]

Since both non-integral terms vanish at $x = a$, setting this derivative to 0 requires that

\[\label{eq:dmed-da} \int_{-\infty}^a f(x) \, \mathrm{d}x - \int_{a}^\infty f(x) \, \mathrm{d}x = 0 \quad \Rightarrow \quad P(X_i < a) = P(X_i > a) \; .\]

Together with the probability of the complement, this yields the implication

\[\label{eq:med-mae-qed} P(X_i < a) = P(X_i > a) \quad \Rightarrow \quad P(X_i < a) = 1 - P(X_i < a) \quad \Rightarrow \quad P(X_i < a) = 0.5 \; .\]

As a result, $a$ satisfies the definition of a median at the critical point of the objective function.
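The derivative obtained above can be sanity-checked numerically. The sketch below assumes a standard normal distribution (an example chosen here, not part of the proof), for which the objective has the closed form $a \, (2\Phi(a) - 1) + 2\varphi(a)$, with $\Phi$ and $\varphi$ denoting the standard normal cumulative distribution function and density. It compares a central finite difference of this objective with $P(X_i < a) - P(X_i > a)$; all function names are hypothetical.

```python
import math

def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_abs_error(a):
    """E|X - a| for X ~ N(0, 1), using the closed form a*(2*Phi(a) - 1) + 2*phi(a)."""
    return a * (2.0 * std_normal_cdf(a) - 1.0) + 2.0 * std_normal_pdf(a)

def derivative_from_proof(a):
    """P(X < a) - P(X > a), the expression that remains after differentiating."""
    return std_normal_cdf(a) - (1.0 - std_normal_cdf(a))

h = 1e-5  # step size for the central finite difference
max_gap = 0.0
for a in [-2.0, -0.5, 0.0, 0.3, 1.7]:
    numeric = (expected_abs_error(a + h) - expected_abs_error(a - h)) / (2.0 * h)
    max_gap = max(max_gap, abs(numeric - derivative_from_proof(a)))
    print(f"a = {a:+.1f}: finite difference = {numeric:+.6f}, "
          f"P(X<a) - P(X>a) = {derivative_from_proof(a):+.6f}")
```

In this example the derivative changes sign at $a = 0$, which is the median of the standard normal distribution.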

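Finally, for every fixed $x$, the function $a \mapsto \lvert x - a \rvert$ is convex. Since the expected value is linear and preserves inequalities, the objective function $\mathrm{E}\left[ \lvert X_i - a \rvert \right]$ is also convex in $a$. A critical point of a convex function is a global minimum, and since the median is the sole critical point, it must be the global minimum. Therefore, the median must minimize the mean absolute error, completing the proof.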
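The convexity of the objective can also be probed empirically. The following minimal sketch checks the defining inequality $g(ta + (1-t)b) \leq t \, g(a) + (1-t) \, g(b)$ for the empirical mean absolute error $g$ on randomly chosen points. The sample distribution, the ranges and the tolerance are assumptions made for illustration only.

```python
import random

def mean_absolute_error(sample, a):
    """Empirical counterpart of E|X - a| for a fixed sample."""
    return sum(abs(x - a) for x in sample) / len(sample)

random.seed(1)  # fixed seed so the sketch is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(500)]  # assumed example data

violations = 0
for _ in range(200):
    a = random.uniform(-3.0, 3.0)
    b = random.uniform(-3.0, 3.0)
    t = random.random()
    lhs = mean_absolute_error(sample, t * a + (1.0 - t) * b)
    rhs = (t * mean_absolute_error(sample, a)
           + (1.0 - t) * mean_absolute_error(sample, b))
    # Small tolerance to absorb floating-point rounding
    if lhs > rhs + 1e-12:
        violations += 1

print(f"convexity violations: {violations}")
```

Such a check cannot prove convexity, but a violation would indicate an error in the argument or the code.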


Metadata: ID: P471 | shortcut: med-mae | author: salbalkus | date: 2024-09-23, 23:30.