Mode of the multivariate normal distribution

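Theorem: Let $X$ be an $n \times 1$ random vector following a multivariate normal distribution with mean $\mu$ and positive-definite covariance matrix $\Sigma$: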

\[\label{eq:mvn} X \sim \mathcal{N}(\mu, \Sigma) \; .\]

Then, the mode of $X$ is

\[\label{eq:mvn-mode} \mathrm{mode}(X) = \mu \; .\]

Proof: The mode is the value which maximizes the probability density function:

\[\label{eq:mode} \mathrm{mode}(X) = \operatorname*{arg\,max}_x f_X(x) \; .\]

The probability density function of the multivariate normal distribution is:

\[\label{eq:mvn-pdf} f_X(x) = \frac{1}{\sqrt{(2 \pi)^n |\Sigma|}} \cdot \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} \Sigma^{-1} (x-\mu) \right] \; .\]

By the chain rule, and because $\Sigma^{-1}$ is symmetric, the gradient of this function, $\nabla_x f_X(x) \in \mathbb{R}^n$, is:

\[\label{eq:mvn-pdf-grad} \begin{split} \nabla_x f_X(x) &= f_X(x) \cdot \nabla_x \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} \Sigma^{-1} (x-\mu) \right] \\ &= f_X(x) \cdot \left[ -\frac{1}{2} \left( 2 \Sigma^{-1} (x-\mu) \right) \right] \\ &= f_X(x) \cdot \left[ -\Sigma^{-1} (x-\mu) \right] \; . \end{split}\]
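The gradient formula can be checked numerically. The following is a minimal sketch for the bivariate case ($n = 2$), using only the Python standard library. The parameter values `MU` and `SIGMA`, the test point `x0` and all helper names are illustrative assumptions, not part of the proof. It compares the closed-form gradient with a central finite-difference approximation.

```python
import math

# Hypothetical example parameters (not from the source): a bivariate normal.
MU = (1.0, -2.0)
SIGMA = ((2.0, 0.6), (0.6, 1.0))


def inv2(m):
    """Return the inverse of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def pdf(x):
    """Bivariate normal density f_X(x) for MU and SIGMA."""
    (a, b), (c, d) = SIGMA
    det = a * d - b * c
    p = inv2(SIGMA)
    dx = (x[0] - MU[0], x[1] - MU[1])
    quad = sum(dx[i] * p[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)


def grad_analytic(x):
    """Closed-form gradient: f_X(x) * (-Sigma^{-1} (x - mu))."""
    p = inv2(SIGMA)
    dx = (x[0] - MU[0], x[1] - MU[1])
    f = pdf(x)
    return tuple(-f * (p[i][0] * dx[0] + p[i][1] * dx[1]) for i in range(2))


def grad_numeric(x, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((pdf(xp) - pdf(xm)) / (2 * h))
    return tuple(g)


x0 = (0.3, -1.1)  # arbitrary test point
max_abs_diff = max(abs(a - b) for a, b in zip(grad_analytic(x0), grad_numeric(x0)))
print("max |analytic - numeric| =", max_abs_diff)
```

Under these assumptions, the two gradients should agree up to finite-difference error, and the closed-form gradient evaluated at `MU` should be the zero vector.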

The Hessian of this function $\nabla_x^2 f_X(x) \in \mathbb{R}^{n \times n}$ is:

\[\label{eq:mvn-pdf-hess} \begin{split} \nabla_x^2 f_X(x) &= \nabla_x f_X(x) \cdot \left[ -\Sigma^{-1} (x-\mu) \right]^\mathrm{T} + f_X(x) \cdot \nabla_x \left[ -\Sigma^{-1} (x-\mu) \right] \\ &= f_X(x) \cdot \left[ -\Sigma^{-1} (x-\mu) \right] \left[ -(x-\mu)^\mathrm{T} \Sigma^{-1} \right] + f_X(x) \cdot \left[ -\Sigma^{-1} \right] \\ &= f_X(x) \cdot \left[ \Sigma^{-1} (x-\mu) (x-\mu)^\mathrm{T} \Sigma^{-1} - \Sigma^{-1} \right] \; . \end{split}\]
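The Hessian formula can be checked in the same spirit. The sketch below, again for an assumed bivariate example with hypothetical parameters `MU` and `SIGMA` and illustrative helper names, differentiates the closed-form gradient numerically and compares the result with the closed-form Hessian.

```python
import math

# Hypothetical example parameters (not from the source): a bivariate normal.
MU = (1.0, -2.0)
SIGMA = ((2.0, 0.6), (0.6, 1.0))


def inv2(m):
    """Return the inverse of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


P = inv2(SIGMA)  # precision matrix Sigma^{-1}


def pdf(x):
    """Bivariate normal density f_X(x) for MU and SIGMA."""
    (a, b), (c, d) = SIGMA
    det = a * d - b * c
    dx = (x[0] - MU[0], x[1] - MU[1])
    quad = sum(dx[i] * P[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)


def scaled_residual(x):
    """Return v = Sigma^{-1} (x - mu)."""
    dx = (x[0] - MU[0], x[1] - MU[1])
    return tuple(P[i][0] * dx[0] + P[i][1] * dx[1] for i in range(2))


def grad(x):
    """Closed-form gradient: -f_X(x) * v."""
    f, v = pdf(x), scaled_residual(x)
    return (-f * v[0], -f * v[1])


def hess_analytic(x):
    """Closed-form Hessian: f_X(x) * (v v^T - Sigma^{-1})."""
    f, v = pdf(x), scaled_residual(x)
    return [[f * (v[i] * v[j] - P[i][j]) for j in range(2)] for i in range(2)]


def hess_numeric(x, h=1e-6):
    """Central finite differences of the closed-form gradient."""
    hess = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        gp, gm = grad(xp), grad(xm)
        for i in range(2):
            hess[i][j] = (gp[i] - gm[i]) / (2 * h)
    return hess


x0 = (0.3, -1.1)  # arbitrary test point
ha, hn = hess_analytic(x0), hess_numeric(x0)
max_abs_diff = max(abs(ha[i][j] - hn[i][j]) for i in range(2) for j in range(2))
print("max |analytic - numeric| =", max_abs_diff)
```

Because $\Sigma^{-1}$ is symmetric, the outer product $v v^\mathrm{T}$ with $v = \Sigma^{-1} (x-\mu)$ equals $\Sigma^{-1} (x-\mu) (x-\mu)^\mathrm{T} \Sigma^{-1}$, which is why the sketch only needs $v$.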

Setting the gradient \eqref{eq:mvn-pdf-grad} to zero, we calculate the root, using that $f_X(x) > 0$ for all $x$ and that $\Sigma^{-1}$ is invertible:

\[\label{eq:mvn-mode-s1} \begin{split} \nabla_x f_X(x) = 0 &= f_X(x) \cdot \left[ -\Sigma^{-1} (x-\mu) \right] \\ \Leftrightarrow \quad 0 &= -\Sigma^{-1} (x-\mu) \\ \Leftrightarrow \quad x &= \mu \; . \end{split}\]

Plugging this value into the Hessian \eqref{eq:mvn-pdf-hess}, we obtain:

\[\label{eq:mvn-mode-s2} \begin{split} \nabla_x^2 f_X(\mu) &= f_X(\mu) \cdot \left[ \Sigma^{-1} (\mu-\mu) (\mu-\mu)^\mathrm{T} \Sigma^{-1} - \Sigma^{-1} \right] \\ &= -f_X(\mu) \cdot \Sigma^{-1} \; . \end{split}\]

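Since $\Sigma$ is positive-definite, $\Sigma^{-1}$ is also positive-definite. Because $f_X(\mu)$ is positive, $-f_X(\mu) \cdot \Sigma^{-1}$ is a negative-definite matrix, so that $f_X(x)$ attains a local maximum at $x = \mu$. As $\mu$ is the only root of the gradient by \eqref{eq:mvn-mode-s1} and $f_X(x) \to 0$ as $\lVert x \rVert \to \infty$, this maximum is also the global one. Taken together, this shows that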

\[\label{eq:mvn-mode-qed} \mathrm{mode}(X) = \mu \; .\]

∎
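As an informal numerical illustration of the result, the sketch below uses an assumed bivariate example (the values of `MU` and `SIGMA`, the grid spacing and all helper names are hypothetical). It builds the Hessian at the mean as $-f_X(\mu) \cdot \Sigma^{-1}$, tests negative-definiteness with Sylvester's criterion for a symmetric $2 \times 2$ matrix, and compares the density at $\mu$ with the density on a grid of surrounding points. It does not replace the proof; it only shows the statement holding for one concrete case.

```python
import math

# Hypothetical example parameters (not from the source): a bivariate normal.
MU = (1.0, -2.0)
SIGMA = ((2.0, 0.6), (0.6, 1.0))


def inv2(m):
    """Return the inverse of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


P = inv2(SIGMA)  # precision matrix Sigma^{-1}


def pdf(x):
    """Bivariate normal density f_X(x) for MU and SIGMA."""
    (a, b), (c, d) = SIGMA
    det = a * d - b * c
    dx = (x[0] - MU[0], x[1] - MU[1])
    quad = sum(dx[i] * P[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)


f_mu = pdf(MU)

# Hessian at the mean, using the closed form -f_X(mu) * Sigma^{-1}.
hess_mu = [[-f_mu * P[i][j] for j in range(2)] for i in range(2)]

# Sylvester's criterion for a symmetric 2x2 matrix H:
# H is negative definite iff H[0][0] < 0 and det(H) > 0.
det_h = hess_mu[0][0] * hess_mu[1][1] - hess_mu[0][1] * hess_mu[1][0]
is_negative_definite = hess_mu[0][0] < 0 and det_h > 0

# Grid of points around the mean (spacing 0.5), excluding the mean itself.
grid = [
    (MU[0] + 0.5 * i, MU[1] + 0.5 * j)
    for i in range(-6, 7)
    for j in range(-6, 7)
    if (i, j) != (0, 0)
]
mu_is_grid_max = all(pdf(x) < f_mu for x in grid)

print("Hessian at mu negative definite:", is_negative_definite)
print("density at mu exceeds all grid points:", mu_is_grid_max)
```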

Sources:

original work

Metadata: ID: P498 | shortcut: mvn-mode | author: JoramSoch | date: 2025-05-21, 13:12.