Index: The Book of Statistical Proofs ▷ Statistical Models ▷ Univariate normal data ▷ Univariate Gaussian with known variance ▷ Accuracy and complexity

Theorem: Let

$\label{eq:ugkv} y = \left\lbrace y_1, \ldots, y_n \right\rbrace, \quad y_i \sim \mathcal{N}(\mu, \sigma^2), \quad i = 1, \ldots, n$

be a univariate Gaussian data set with unknown mean $\mu$ and known variance $\sigma^2$. Moreover, assume a statistical model imposing a normal distribution as the prior distribution on the model parameter $\mu$:

$\label{eq:m} m: \; y_i \sim \mathcal{N}(\mu, \sigma^2), \; \mu \sim \mathcal{N}(\mu_0, \lambda_0^{-1}) \; .$

Then, accuracy and complexity of this model are

$\label{eq:UGkv-anc} \begin{split} \mathrm{Acc}(m) &= \frac{n}{2} \log\left( \frac{\tau}{2 \pi} \right) - \frac{1}{2} \left[ \tau y^\mathrm{T} y - 2 \, \tau n \bar{y} \mu_n + \tau n \mu_n^2 + \frac{\tau n}{\lambda_n} \right] \\ \mathrm{Com}(m) &= \frac{1}{2} \left[ \frac{\lambda_0}{\lambda_n} + \lambda_0 (\mu_0 - \mu_n)^2 - 1 - \log\left( \frac{\lambda_0}{\lambda_n} \right) \right] \end{split}$

where $\mu_n = (\lambda_0 \mu_0 + \tau n \bar{y}) / \lambda_n$ and $\lambda_n = \lambda_0 + \tau n$ are the posterior hyperparameters for the univariate Gaussian with known variance, $\tau = 1/\sigma^2$ is the inverse variance or precision, and $\bar{y}$ is the sample mean.

Proof: Model accuracy and complexity are defined as

$\label{eq:lme-anc} \begin{split} \mathrm{LME}(m) &= \mathrm{Acc}(m) - \mathrm{Com}(m) \\ \mathrm{Acc}(m) &= \left\langle \log p(y|\mu,m) \right\rangle_{p(\mu|y,m)} \\ \mathrm{Com}(m) &= \mathrm{KL} \left[ p(\mu|y,m) \, || \, p(\mu|m) \right] \; . \end{split}$
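
The first line of this decomposition follows from Bayes' theorem. As a short sketch (added for orientation, not part of the original proof): since $p(y|m) = p(y|\mu,m) \, p(\mu|m) / p(\mu|y,m)$ holds for every value of $\mu$, taking the logarithm and averaging over the posterior distribution leaves the left-hand side unchanged and gives

$\begin{split} \mathrm{LME}(m) = \log p(y|m) &= \left\langle \log p(y|\mu,m) \right\rangle_{p(\mu|y,m)} - \left\langle \log \frac{p(\mu|y,m)}{p(\mu|m)} \right\rangle_{p(\mu|y,m)} \\ &= \mathrm{Acc}(m) - \mathrm{KL} \left[ p(\mu|y,m) \, || \, p(\mu|m) \right] \; . \end{split}$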

The accuracy term is the expectation of the log-likelihood function $\log p(y|\mu)$ with respect to the posterior distribution $p(\mu|y)$. With the log-likelihood function for the univariate Gaussian with known variance and the posterior distribution for the univariate Gaussian with known variance, the model accuracy of $m$ evaluates to:

$\label{eq:UGkv-Acc} \begin{split} \mathrm{Acc}(m) &= \left\langle \log p(y|\mu) \right\rangle_{p(\mu|y)} \\ &= \left\langle \frac{n}{2} \log\left( \frac{\tau}{2 \pi} \right) - \frac{\tau}{2} \left( y^\mathrm{T} y - 2 n \bar{y} \mu + n \mu^2 \right) \right\rangle_{\mathcal{N}(\mu_n, \lambda_n^{-1})} \\ &= \frac{n}{2} \log\left( \frac{\tau}{2 \pi} \right) - \frac{1}{2} \left[ \tau y^\mathrm{T} y - 2 \tau n \bar{y} \mu_n + \tau n \mu_n^2 + \frac{\tau n}{\lambda_n} \right] \; . \end{split}$
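
The last step only requires the first two moments of the posterior distribution $\mathcal{N}(\mu_n, \lambda_n^{-1})$. Spelled out as a worked step, using $\left\langle \mu^2 \right\rangle = \mathrm{Var}(\mu) + \left\langle \mu \right\rangle^2$:

$\begin{split} \left\langle \mu \right\rangle &= \mu_n \\ \left\langle \mu^2 \right\rangle &= \mu_n^2 + \frac{1}{\lambda_n} \\ \left\langle y^\mathrm{T} y - 2 n \bar{y} \mu + n \mu^2 \right\rangle &= y^\mathrm{T} y - 2 n \bar{y} \mu_n + n \mu_n^2 + \frac{n}{\lambda_n} \; . \end{split}$

Multiplying by $\tau$ yields the bracketed term of the accuracy; the extra summand $\tau n / \lambda_n$ is the contribution of posterior uncertainty about $\mu$.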

The complexity penalty is the Kullback-Leibler divergence of the posterior distribution $p(\mu|y)$ from the prior distribution $p(\mu)$. With the prior distribution given by \eqref{eq:m}, the posterior distribution for the univariate Gaussian with known variance and the Kullback-Leibler divergence of the normal distribution, the model complexity of $m$ evaluates to:

$\label{eq:UGkv-Com} \begin{split} \mathrm{Com}(m) &= \mathrm{KL} \left[ p(\mu|y) \, || \, p(\mu) \right] \\ &= \mathrm{KL} \left[ \mathcal{N}(\mu_n, \lambda_n^{-1}) \, || \, \mathcal{N}(\mu_0, \lambda_0^{-1}) \right] \\ &= \frac{1}{2} \left[ \frac{\lambda_0}{\lambda_n} + \lambda_0 (\mu_0 - \mu_n)^2 - 1 - \log\left( \frac{\lambda_0}{\lambda_n} \right) \right] \; . \end{split}$
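
For reference, the Kullback-Leibler divergence between two univariate normal distributions can be written in precision form as follows (the indices $1$ and $2$ are placeholder symbols introduced here for illustration):

$\mathrm{KL} \left[ \mathcal{N}(\mu_1, \lambda_1^{-1}) \, || \, \mathcal{N}(\mu_2, \lambda_2^{-1}) \right] = \frac{1}{2} \left[ \frac{\lambda_2}{\lambda_1} + \lambda_2 (\mu_2 - \mu_1)^2 - 1 - \log\left( \frac{\lambda_2}{\lambda_1} \right) \right] \; .$

Substituting $(\mu_1, \lambda_1) = (\mu_n, \lambda_n)$ and $(\mu_2, \lambda_2) = (\mu_0, \lambda_0)$ yields the model complexity, in which the logarithm enters with a negative sign. Because $x - 1 - \log x \geq 0$ for all $x > 0$, applied to $x = \lambda_0 / \lambda_n$, the complexity is non-negative, as required of a Kullback-Leibler divergence.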

A control calculation confirms that

$\label{eq:UGkv-anc-lme} \mathrm{Acc}(m) - \mathrm{Com}(m) = \mathrm{LME}(m)$

where $\mathrm{LME}(m)$ is the log model evidence for the univariate Gaussian with known variance.
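
The control calculation can also be checked numerically. The following is a minimal sketch, not the original author's code: it evaluates accuracy and complexity in closed form and compares their difference with a log model evidence obtained independently by integrating $p(y|\mu) \, p(\mu)$ over $\mu$ with the trapezoidal rule. The function names, the example data and the quadrature settings (`half_width`, `steps`) are assumptions made for illustration; the posterior hyperparameters are taken to be $\lambda_n = \lambda_0 + \tau n$ and $\mu_n = (\lambda_0 \mu_0 + \tau n \bar{y}) / \lambda_n$.

```python
import math


def posterior_hyperparameters(y, sigma2, mu0, lambda0):
    """Conjugate update for a normal mean with known variance (assumed form)."""
    n = len(y)
    tau = 1.0 / sigma2                      # precision of the observations
    y_bar = sum(y) / n                      # sample mean
    lambda_n = lambda0 + tau * n            # posterior precision
    mu_n = (lambda0 * mu0 + tau * n * y_bar) / lambda_n  # posterior mean
    return mu_n, lambda_n


def accuracy_complexity(y, sigma2, mu0, lambda0):
    """Closed-form accuracy and complexity of the model m."""
    n = len(y)
    tau = 1.0 / sigma2
    y_bar = sum(y) / n
    yTy = sum(v * v for v in y)
    mu_n, lambda_n = posterior_hyperparameters(y, sigma2, mu0, lambda0)
    # Posterior expectation of the log-likelihood.
    acc = 0.5 * n * math.log(tau / (2 * math.pi)) - 0.5 * (
        tau * yTy - 2 * tau * n * y_bar * mu_n
        + tau * n * mu_n ** 2 + tau * n / lambda_n
    )
    # KL divergence of the posterior from the prior; the log term is subtracted.
    ratio = lambda0 / lambda_n
    com = 0.5 * (ratio + lambda0 * (mu0 - mu_n) ** 2 - 1 - math.log(ratio))
    return acc, com


def log_evidence_by_quadrature(y, sigma2, mu0, lambda0,
                               half_width=12.0, steps=4001):
    """Log of the integral of p(y|mu) p(mu) over mu, via the trapezoidal rule.

    The grid spans half_width posterior standard deviations on each side
    of the posterior mean (an illustrative choice, not a tuned setting).
    """
    n = len(y)
    tau = 1.0 / sigma2
    mu_n, lambda_n = posterior_hyperparameters(y, sigma2, mu0, lambda0)
    sd_n = 1.0 / math.sqrt(lambda_n)
    h = 2 * half_width * sd_n / (steps - 1)  # grid spacing
    log_f = []
    for k in range(steps):
        mu = mu_n - half_width * sd_n + k * h
        log_lik = (0.5 * n * math.log(tau / (2 * math.pi))
                   - 0.5 * tau * sum((v - mu) ** 2 for v in y))
        log_prior = (0.5 * math.log(lambda0 / (2 * math.pi))
                     - 0.5 * lambda0 * (mu - mu0) ** 2)
        log_f.append(log_lik + log_prior)
    peak = max(log_f)                        # subtract the peak for stability
    total = sum(math.exp(v - peak) for v in log_f)
    total -= 0.5 * (math.exp(log_f[0] - peak) + math.exp(log_f[-1] - peak))
    return peak + math.log(h * total)


# Hypothetical example data and hyperparameters.
y = [1.2, 0.7, 1.9, 1.4, 0.3]
acc, com = accuracy_complexity(y, sigma2=1.0, mu0=0.0, lambda0=0.5)
lme = log_evidence_by_quadrature(y, sigma2=1.0, mu0=0.0, lambda0=0.5)
print("Acc - Com:", acc - com)
print("LME (quadrature):", lme)
```

The two printed values should agree up to quadrature and rounding error. Algebraically, the agreement rests on the identity $(\tau n + \lambda_0) / \lambda_n = 1$, which cancels the $-1$ in the complexity term against the two variance-ratio summands.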

Metadata: ID: P214 | shortcut: ugkv-anc | author: JoramSoch | date: 2021-03-24, 07:49.