
Theorem: The log model evidence can be partitioned into accuracy and complexity

$\label{eq:LME} \mathrm{LME}(m) = \mathrm{Acc}(m) - \mathrm{Com}(m)$

where the accuracy term is the posterior expectation of the log-likelihood function

$\label{eq:Acc} \mathrm{Acc}(m) = \left\langle \log p(y|\theta,m) \right\rangle_{p(\theta|y,m)}$

and the complexity penalty is the Kullback-Leibler divergence of posterior from prior

$\label{eq:Com} \mathrm{Com}(m) = \mathrm{KL} \left[ p(\theta|y,m) \, || \, p(\theta|m) \right] \; .$

Proof: We consider Bayesian inference on data $y$ using model $m$ with parameters $\theta$. Then, Bayes’ theorem makes a statement about the posterior distribution, i.e. the probability of parameters, given the data and the model:

$\label{eq:AnC-s1} p(\theta|y,m) = \frac{p(y|\theta,m) \, p(\theta|m)}{p(y|m)} \; .$

Rearranging this for the model evidence, we have:

$\label{eq:AnC-s2} p(y|m) = \frac{p(y|\theta,m) \, p(\theta|m)}{p(\theta|y,m)} \; .$

Taking the logarithm of both sides of the equation, we obtain:

$\label{eq:AnC-s3} \log p(y|m) = \log p(y|\theta,m) - \log \frac{p(\theta|y,m)}{p(\theta|m)} \; .$
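
For readers who want the intermediate step spelled out, the logarithm of the quotient in \eqref{eq:AnC-s2} can first be expanded into a sum and difference of logarithms, after which the last two terms are regrouped into a single log-ratio:

$\log p(y|m) = \log p(y|\theta,m) + \log p(\theta|m) - \log p(\theta|y,m) = \log p(y|\theta,m) - \log \frac{p(\theta|y,m)}{p(\theta|m)} \; .$

Note that this identity holds for every value of $\theta$ at which the posterior density is positive.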

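The left-hand side does not depend on $\theta$, so taking the expectation of both sides over the posterior distribution $p(\theta|y,m)$ leaves it unchanged and yields: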

$\label{eq:AnC-s4} \log p(y|m) = \int p(\theta|y,m) \log p(y|\theta,m) \, \mathrm{d}\theta - \int p(\theta|y,m) \log \frac{p(\theta|y,m)}{p(\theta|m)} \, \mathrm{d}\theta \; .$

By definition, the left-hand side is the log model evidence, and the two terms on the right-hand side are the posterior expectation of the log-likelihood function and the Kullback-Leibler divergence of posterior from prior:

$\label{eq:LME-AnC} \mathrm{LME}(m) = \left\langle \log p(y|\theta,m) \right\rangle_{p(\theta|y,m)} - \mathrm{KL} \left[ p(\theta|y,m) \, || \, p(\theta|m) \right]$

which proves the partition given by \eqref{eq:LME}.
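
The partition can also be checked numerically. The following is a minimal sketch, not part of the original proof, and it assumes a conjugate model chosen purely for illustration: observations $y_i \sim \mathcal{N}(\theta, s^2)$ with known variance $s^2$ and prior $\theta \sim \mathcal{N}(m_0, v_0)$. The function names (`lme_direct`, `posterior`, `accuracy`, `complexity`) and all numerical values are hypothetical. The log model evidence is computed directly from the marginal distribution of $y$, independently of the posterior, and then compared with accuracy minus complexity.

```python
import numpy as np

def lme_direct(y, m0, v0, s2):
    """Log model evidence from the marginal y ~ N(m0 * 1, s2 * I + v0 * 1 1^T)."""
    n = y.size
    cov = s2 * np.eye(n) + v0 * np.ones((n, n))
    resid = y - m0
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

def posterior(y, m0, v0, s2):
    """Posterior mean and variance of theta (conjugate normal update)."""
    n = y.size
    vn = 1.0 / (1.0 / v0 + n / s2)
    mn = vn * (m0 / v0 + y.sum() / s2)
    return mn, vn

def accuracy(y, mn, vn, s2):
    """Posterior expectation of the log-likelihood, Acc(m)."""
    n = y.size
    # E[(y_i - theta)^2] under the posterior is (y_i - mn)^2 + vn
    expected_sq = np.sum((y - mn) ** 2) + n * vn
    return -0.5 * n * np.log(2 * np.pi * s2) - expected_sq / (2 * s2)

def complexity(mn, vn, m0, v0):
    """KL divergence of posterior N(mn, vn) from prior N(m0, v0), Com(m)."""
    return 0.5 * (np.log(v0 / vn) + (vn + (mn - m0) ** 2) / v0 - 1.0)

# Hypothetical data and hyperparameters (assumptions for illustration only)
rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=20)
m0, v0, s2 = 0.0, 4.0, 4.0

mn, vn = posterior(y, m0, v0, s2)
lme = lme_direct(y, m0, v0, s2)
acc = accuracy(y, mn, vn, s2)
com = complexity(mn, vn, m0, v0)

print("LME       :", lme)
print("Acc - Com :", acc - com)
```

Under these assumptions the two printed values should agree up to floating-point error, and the complexity term should be non-negative, as expected for a Kullback-Leibler divergence.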


Metadata: ID: P3 | shortcut: lme-anc | author: JoramSoch | date: 2019-09-27, 16:13.