Proof: Accuracy and complexity for the univariate Gaussian
Theorem: Let
\[\label{eq:ug} m: \; y = \left\lbrace y_1, \ldots, y_n \right\rbrace, \quad y_i \sim \mathcal{N}(\mu, \sigma^2), \quad i = 1, \ldots, n\]be a univariate Gaussian data set with unknown mean $\mu$ and unknown variance $\sigma^2$. Moreover, assume a normal-gamma prior distribution over the model parameters $\mu$ and $\tau = 1/\sigma^2$:
\[\label{eq:UG-NG-prior} p(\mu,\tau) = \mathcal{N}(\mu; \mu_0, (\tau \lambda_0)^{-1}) \cdot \mathrm{Gam}(\tau; a_0, b_0) \; .\]Then, accuracy and complexity of this model are
\[\label{eq:UG-NG-AnC} \begin{split} \mathrm{Acc}(m) &= - \frac{1}{2} \frac{a_n}{b_n} \left( y^\mathrm{T} y - 2 n \bar{y} \mu_n + n \mu_n^2 \right) - \frac{1}{2} n \lambda_n^{-1} + \frac{n}{2} \left(\psi(a_n) - \log(b_n)\right) - \frac{n}{2} \log (2 \pi) \\ \mathrm{Com}(m) &= \frac{1}{2} \frac{a_n}{b_n} \left[ \lambda_0 (\mu_0 - \mu_n)^2 - 2 (b_n - b_0) \right] + \frac{1}{2} \frac{\lambda_0}{\lambda_n} - \frac{1}{2} \log \frac{\lambda_0}{\lambda_n} - \frac{1}{2} \\ &+ a_0 \cdot \log \frac{b_n}{b_0} - \log \frac{\Gamma(a_n)}{\Gamma(a_0)} + (a_n - a_0) \cdot \psi(a_n) \end{split}\]where $\mu_n$ and $\lambda_n$ as well as $a_n$ and $b_n$ are the posterior hyperparameters for the univariate Gaussian and $\bar{y}$ is the sample mean.
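For completeness, the posterior hyperparameters referenced in the theorem are the standard conjugate updates for the normal-gamma prior, stated here without derivation:

```latex
\begin{split}
\mu_n &= \frac{\lambda_0 \mu_0 + n \bar{y}}{\lambda_0 + n} \; , \quad
\lambda_n = \lambda_0 + n \\
a_n &= a_0 + \frac{n}{2} \; , \quad
b_n = b_0 + \frac{1}{2} \left( y^\mathrm{T} y + \lambda_0 \mu_0^2 - \lambda_n \mu_n^2 \right) \; .
\end{split}
```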
Proof: Model accuracy and complexity are defined as
\[\label{eq:lme-anc} \begin{split} \mathrm{LME}(m) &= \mathrm{Acc}(m) - \mathrm{Com}(m) \\ \mathrm{Acc}(m) &= \left\langle \log p(y|\mu,\tau,m) \right\rangle_{p(\mu,\tau|y,m)} \\ \mathrm{Com}(m) &= \mathrm{KL} \left[ p(\mu,\tau|y,m) \, || \, p(\mu,\tau|m) \right] \; . \end{split}\]
The accuracy term is the expectation of the log-likelihood function $\log p(y|\mu,\tau)$ with respect to the posterior distribution $p(\mu,\tau|y)$. Plugging in the log-likelihood function of the univariate Gaussian and its posterior distribution, the model accuracy of $m$ evaluates to the expression for $\mathrm{Acc}(m)$ given in \eqref{eq:UG-NG-AnC}.
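As a sketch of this step, write the log-likelihood as $\log p(y|\mu,\tau) = \tfrac{n}{2} \log \tau - \tfrac{n}{2} \log (2\pi) - \tfrac{\tau}{2} \sum_{i=1}^n (y_i - \mu)^2$ and use the posterior factorization $\mu|\tau,y \sim \mathcal{N}(\mu_n, (\tau \lambda_n)^{-1})$, $\tau|y \sim \mathrm{Gam}(a_n, b_n)$, which gives $\langle \tau \rangle = a_n/b_n$, $\langle \log \tau \rangle = \psi(a_n) - \log b_n$ and $\langle \tau (y_i - \mu)^2 \rangle = \tfrac{a_n}{b_n} (y_i - \mu_n)^2 + \tfrac{1}{\lambda_n}$:

```latex
\mathrm{Acc}(m)
= \frac{n}{2} \left( \psi(a_n) - \log b_n \right) - \frac{n}{2} \log (2 \pi)
- \frac{1}{2} \left[ \frac{a_n}{b_n} \left( y^\mathrm{T} y - 2 n \bar{y} \mu_n + n \mu_n^2 \right) + \frac{n}{\lambda_n} \right]
```

where $\sum_{i=1}^n (y_i - \mu_n)^2 = y^\mathrm{T} y - 2 n \bar{y} \mu_n + n \mu_n^2$ has been expanded.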
The complexity penalty is the Kullback-Leibler divergence of the posterior distribution $p(\mu,\tau|y)$ from the prior distribution $p(\mu,\tau)$. With the prior distribution given by \eqref{eq:UG-NG-prior}, the posterior distribution of the univariate Gaussian and the Kullback-Leibler divergence of the normal-gamma distribution, the model complexity of $m$ evaluates to the expression for $\mathrm{Com}(m)$ given in \eqref{eq:UG-NG-AnC}.
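For reference, the Kullback-Leibler divergence between two normal-gamma distributions that enters here is the following standard result, quoted rather than derived:

```latex
\mathrm{KL} \left[ \mathrm{NG}(\mu_n, \lambda_n, a_n, b_n) \, || \, \mathrm{NG}(\mu_0, \lambda_0, a_0, b_0) \right]
= \frac{1}{2} \frac{a_n}{b_n} \lambda_0 (\mu_0 - \mu_n)^2
+ \frac{1}{2} \frac{\lambda_0}{\lambda_n} - \frac{1}{2} \log \frac{\lambda_0}{\lambda_n} - \frac{1}{2}
+ a_0 \log \frac{b_n}{b_0} - \log \frac{\Gamma(a_n)}{\Gamma(a_0)}
+ (a_n - a_0) \, \psi(a_n) - (b_n - b_0) \frac{a_n}{b_n}
```

Substituting the posterior hyperparameters for the first argument and the prior hyperparameters for the second reproduces $\mathrm{Com}(m)$ in \eqref{eq:UG-NG-AnC}, where $\tfrac{1}{2} \tfrac{a_n}{b_n} \left[ \lambda_0 (\mu_0 - \mu_n)^2 - 2 (b_n - b_0) \right]$ collects the first and last terms above.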
A control calculation confirms that
\[\label{eq:UG-NG-AnC-LME} \mathrm{Acc}(m) - \mathrm{Com}(m) = \mathrm{LME}(m)\]where $\mathrm{LME}(m)$ is the log model evidence for the univariate Gaussian.
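The control calculation can also be carried out numerically. The following sketch evaluates $\mathrm{Acc}(m)$ and $\mathrm{Com}(m)$ exactly as stated in the theorem and compares their difference against the closed-form log model evidence of the univariate Gaussian; the data values and prior hyperparameters are illustrative choices, and the conjugate updates are the standard ones assumed above.

```python
import numpy as np
from scipy.special import digamma, gammaln

# Illustrative data and prior hyperparameters (hypothetical values)
rng = np.random.default_rng(1)
n = 20
y = rng.normal(1.5, 2.0, size=n)
mu0, lam0, a0, b0 = 0.0, 1.0, 2.0, 2.0

# Standard normal-gamma conjugate updates (assumed, not derived here)
ybar = y.mean()
lamn = lam0 + n
mun = (lam0 * mu0 + n * ybar) / lamn
an = a0 + n / 2
bn = b0 + 0.5 * (y @ y + lam0 * mu0**2 - lamn * mun**2)

# Accuracy, as given in the theorem
acc = (-0.5 * (an / bn) * (y @ y - 2 * n * ybar * mun + n * mun**2)
       - 0.5 * n / lamn
       + 0.5 * n * (digamma(an) - np.log(bn))
       - 0.5 * n * np.log(2 * np.pi))

# Complexity, as given in the theorem
com = (0.5 * (an / bn) * (lam0 * (mu0 - mun)**2 - 2 * (bn - b0))
       + 0.5 * lam0 / lamn - 0.5 * np.log(lam0 / lamn) - 0.5
       + a0 * np.log(bn / b0) - (gammaln(an) - gammaln(a0))
       + (an - a0) * digamma(an))

# Closed-form log model evidence of the univariate Gaussian
lme = (0.5 * np.log(lam0 / lamn) + a0 * np.log(b0) - an * np.log(bn)
       + gammaln(an) - gammaln(a0) - 0.5 * n * np.log(2 * np.pi))

# Control calculation: Acc(m) - Com(m) = LME(m)
assert np.isclose(acc - com, lme)
```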
Metadata: ID: P240 | shortcut: ug-anc | author: JoramSoch | date: 2021-07-14, 08:26.