Expression of the noise precision posterior for Bayesian linear regression using prediction and parameter errors


Theorem: Let there be a linear regression model

\[\label{eq:GLM} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V), \; \sigma^2 V = (\tau P)^{-1} \; ,\]

assume a normal-gamma prior distribution over the model parameters $\beta$ and $\tau = 1/\sigma^2$

\[\label{eq:GLM-NG-prior} p(\beta,\tau) = \mathcal{N}(\beta; \mu_0, (\tau \Lambda_0)^{-1}) \cdot \mathrm{Gam}(\tau; a_0, b_0)\]

and consider the Bayesian posterior distribution over these model parameters:

\[\label{eq:GLM-NG-post} p(\beta,\tau|y) = \mathcal{N}(\beta; \mu_n, (\tau \Lambda_n)^{-1}) \cdot \mathrm{Gam}(\tau; a_n, b_n) \; .\]

Then, the posterior hyperparameters for the noise precision $\tau$ can be expressed as

\[\label{eq:GLM-NG-post-tau} \begin{split} a_n &= a_0 + \frac{n}{2} \\ b_n &= b_0 + \frac{1}{2} \left( \varepsilon_y^\mathrm{T} P \varepsilon_y + \varepsilon_\beta^\mathrm{T} \Lambda_0 \varepsilon_\beta \right) \end{split}\]

where $n$ is the number of observations, and $\varepsilon_y$ and $\varepsilon_\beta$ are the “prediction errors” and “parameter errors”

\[\label{eq:GLM-NG-post-tau-err} \begin{split} \varepsilon_y &= y - \hat{y} \\ \varepsilon_\beta &= \mu_n - \mu_0 \end{split}\]

and $\hat{y}$ is the predicted signal at the posterior mean regression coefficients $\mu_n$:

\[\label{eq:GLM-NG-post-y-hat} \hat{y} = X \mu_n \; .\]

Proof: The posterior hyperparameters for Bayesian linear regression are:

\[\label{eq:GLM-NG-post-par} \begin{split} \mu_n &= \Lambda_n^{-1} (X^\mathrm{T} P y + \Lambda_0 \mu_0) \\ \Lambda_n &= X^\mathrm{T} P X + \Lambda_0 \\ a_n &= a_0 + \frac{n}{2} \\ b_n &= b_0 + \frac{1}{2} (y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} \Lambda_n \mu_n) \; . \end{split}\]

The shape parameter $a_n$ is given directly by the third line of equation \eqref{eq:GLM-NG-post-par}. The rate parameter $b_n$ of the posterior distribution can be developed as follows:

\[\label{eq:GLM-NG-post-tau-qed} \begin{split} b_n &\overset{\eqref{eq:GLM-NG-post-par}}{=} b_0 + \frac{1}{2} \left( y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} \Lambda_n \mu_n \right) \\ &\overset{\eqref{eq:GLM-NG-post-par}}{=} b_0 + \frac{1}{2} \left( y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} (X^\mathrm{T} P X + \Lambda_0) \mu_n \right) \\ &= b_0 + \frac{1}{2} \left( y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} X^\mathrm{T} P X \mu_n - \mu_n^\mathrm{T} \Lambda_0 \mu_n \right) \\ &= b_0 + \frac{1}{2} \left( (y^\mathrm{T} P y - \mu_n^\mathrm{T} X^\mathrm{T} P X \mu_n) + (\mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} \Lambda_0 \mu_n) \right) \\ &= b_0 + \frac{1}{2} \left( (y - X \mu_n)^\mathrm{T} P (y - X \mu_n) + (\mu_0 - \mu_n)^\mathrm{T} \Lambda_0 (\mu_0 - \mu_n) \right) \\ &\overset{\eqref{eq:GLM-NG-post-y-hat}}{=} b_0 + \frac{1}{2} \left( (y - \hat{y})^\mathrm{T} P (y - \hat{y}) + (\mu_n - \mu_0)^\mathrm{T} \Lambda_0 (\mu_n - \mu_0) \right) \\ &\overset{\eqref{eq:GLM-NG-post-tau-err}}{=} b_0 + \frac{1}{2} \left( \varepsilon_y^\mathrm{T} P \varepsilon_y + \varepsilon_\beta^\mathrm{T} \Lambda_0 \varepsilon_\beta \right) \; . \end{split}\]
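The fifth equality in \eqref{eq:GLM-NG-post-tau-qed} does not hold term by term; it relies on the relation $\Lambda_n \mu_n = X^\mathrm{T} P y + \Lambda_0 \mu_0$, which follows from the first line of \eqref{eq:GLM-NG-post-par}. As a sketch, assuming that $P$ and $\Lambda_0$ are symmetric (as precision matrices are), expanding the two quadratic forms gives

\[\begin{split} &(y - X \mu_n)^\mathrm{T} P (y - X \mu_n) + (\mu_0 - \mu_n)^\mathrm{T} \Lambda_0 (\mu_0 - \mu_n) \\ &= y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - 2 \mu_n^\mathrm{T} (X^\mathrm{T} P y + \Lambda_0 \mu_0) + \mu_n^\mathrm{T} (X^\mathrm{T} P X + \Lambda_0) \mu_n \\ &= y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - 2 \mu_n^\mathrm{T} \Lambda_n \mu_n + \mu_n^\mathrm{T} \Lambda_n \mu_n \\ &= y^\mathrm{T} P y + \mu_0^\mathrm{T} \Lambda_0 \mu_0 - \mu_n^\mathrm{T} \Lambda_n \mu_n \; , \end{split}\]

which is the bracketed term in the first line of \eqref{eq:GLM-NG-post-tau-qed}.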

Together with equation (\ref{eq:GLM-NG-post-par}c), this completes the proof.

∎
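The identity can also be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated data; the function name `posterior_rate_two_ways` and all variable names are illustrative and not part of the original proof. It computes $b_n$ once from the standard posterior hyperparameters and once from the prediction and parameter errors.

```python
import numpy as np


def posterior_rate_two_ways(y, X, P, mu0, Lambda0, b0):
    """Return b_n from the standard form and from the error-term form.

    Hypothetical helper for illustration; P and Lambda0 are assumed to be
    symmetric positive definite precision matrices.
    """
    # Posterior precision and mean of the regression coefficients
    Lambda_n = X.T @ P @ X + Lambda0
    mu_n = np.linalg.solve(Lambda_n, X.T @ P @ y + Lambda0 @ mu0)

    # Standard form: b_0 + 1/2 (y'Py + mu_0' L_0 mu_0 - mu_n' L_n mu_n)
    b_n_standard = b0 + 0.5 * (
        y @ P @ y + mu0 @ Lambda0 @ mu0 - mu_n @ Lambda_n @ mu_n
    )

    # Error-term form: prediction errors and parameter errors
    eps_y = y - X @ mu_n
    eps_beta = mu_n - mu0
    b_n_errors = b0 + 0.5 * (eps_y @ P @ eps_y + eps_beta @ Lambda0 @ eps_beta)

    return b_n_standard, b_n_errors


# Simulated inputs (arbitrary values, chosen only for the check)
rng = np.random.default_rng(0)
n, p = 20, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
A = rng.normal(size=(n, n))
P = A @ A.T + n * np.eye(n)          # symmetric positive definite
B = rng.normal(size=(p, p))
Lambda0 = B @ B.T + np.eye(p)        # symmetric positive definite
mu0 = rng.normal(size=p)
b0 = 1.5

b_std, b_err = posterior_rate_two_ways(y, X, P, mu0, Lambda0, b0)
print(np.isclose(b_std, b_err))
```

Because both quadratic forms in the error-term expression are non-negative for positive definite $P$ and $\Lambda_0$, this form also makes it visible that $b_n \geq b_0$.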

Sources:

original work

Metadata: ID: P446 | shortcut: blr-posterr | author: JoramSoch | date: 2024-04-05, 16:07.