The Book of Statistical Proofs ▷ General Theorems ▷ Bayesian statistics ▷ Bayesian inference ▷ Free energy is lower bound on log model evidence

Theorem: Let $m$ be a generative model with likelihood function $p(y \vert \theta,m) = p(y \vert \theta)$ and prior distribution $p(\theta \vert m) = p(\theta)$. Then, under a variational Bayesian treatment using the approximate posterior distribution $q(\theta \vert m) = q(\theta) \approx p(\theta \vert y)$, the variational free energy is a lower bound on the log model evidence:

\[\label{eq:vb-fe-lme} \mathrm{F}[q(\theta)] \leq \log p(y) = \log \int_{\Theta} p(y,\theta) \, \mathrm{d}\theta \; .\]

Proof: Using the decomposition of the variational free energy, it can be shown that the free energy equals the difference between the log model evidence and the Kullback-Leibler divergence of the approximate from the true posterior distribution:

\[\label{eq:vb-fe-dec} \mathrm{F}[q(\theta)] = \log p(y) - \mathrm{KL}[q(\theta) || p(\theta \vert y)] \; .\]
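This decomposition can be checked numerically. The sketch below uses a hypothetical discrete model over three parameter values (the prior and likelihood numbers are made up for illustration) and computes the free energy from its definition $\mathrm{F}[q] = \mathrm{E}_q[\log p(y,\theta) - \log q(\theta)]$:

```python
import numpy as np

# Hypothetical discrete toy model: theta takes 3 values
prior = np.array([0.5, 0.3, 0.2])      # p(theta)
lik   = np.array([0.10, 0.40, 0.70])   # p(y | theta) for the observed y

joint     = lik * prior                # p(y, theta)
evidence  = joint.sum()                # p(y)
posterior = joint / evidence           # p(theta | y)

# An arbitrary (non-optimal) approximate posterior q(theta)
q = np.array([0.2, 0.3, 0.5])

# Variational free energy: F[q] = E_q[log p(y, theta) - log q(theta)]
F = np.sum(q * (np.log(joint) - np.log(q)))

# KL divergence of q from the true posterior
kl = np.sum(q * (np.log(q) - np.log(posterior)))

# Decomposition: F = log p(y) - KL[q || p(theta|y)]
assert np.isclose(F, np.log(evidence) - kl)
```

Because the KL term is subtracted, the decomposition already suggests the bound proved below.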

Since the Kullback-Leibler divergence is non-negative for any two probability distributions $P$ and $Q$,

\[\label{eq:kl-nonneg} \mathrm{KL}[P||Q] \geq 0 \; ,\]
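For completeness, this non-negativity follows from Jensen's inequality applied to the convex function $-\log$ (shown here for discrete distributions over a common support):

\[ \mathrm{KL}[P||Q] = \mathrm{E}_P\!\left[ -\log \frac{Q(x)}{P(x)} \right] \geq -\log \mathrm{E}_P\!\left[ \frac{Q(x)}{P(x)} \right] = -\log \sum_x Q(x) = -\log 1 = 0 \; , \]

with equality if and only if $P = Q$.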

the free energy must be less than or equal to the log model evidence:

\[\label{eq:vb-fe-lme-qed} \mathrm{F}[q(\theta)] \leq \log p(y) \; .\]
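The bound, and the fact that it is attained exactly when the approximate posterior equals the true posterior, can be verified numerically. The sketch below reuses a hypothetical three-valued discrete model (all numbers are illustrative) and evaluates the free energy for many randomly drawn distributions $q$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete toy model: theta takes 3 values
prior = np.array([0.5, 0.3, 0.2])      # p(theta)
lik   = np.array([0.10, 0.40, 0.70])   # p(y | theta)
joint = lik * prior                    # p(y, theta)
log_evidence = np.log(joint.sum())     # log p(y)
posterior = joint / joint.sum()        # p(theta | y)

def free_energy(q):
    """F[q] = E_q[log p(y, theta) - log q(theta)]."""
    return np.sum(q * (np.log(joint) - np.log(q)))

# The bound F[q] <= log p(y) holds for every valid q ...
for _ in range(1000):
    q = rng.dirichlet(np.ones(3))
    assert free_energy(q) <= log_evidence + 1e-12

# ... and is attained when q equals the true posterior
assert np.isclose(free_energy(posterior), log_evidence)
```

The small tolerance in the loop only guards against floating-point round-off; analytically the inequality is exact.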
Sources:

Metadata: ID: P517 | shortcut: fren-lme | author: JoramSoch | date: 2025-09-25, 11:24.