Index: The Book of Statistical Proofs ▷ Model Selection ▷ Goodness-of-fit measures ▷ Residual variance ▷ Construction of unbiased estimator (p > 1)

Theorem: Consider a linear regression model with known $n \times p$ design matrix $X$, known $n \times n$ covariance structure $V$, unknown regression parameters $\beta$ and unknown noise variance $\sigma^2$:

\[\label{eq:mlr} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; .\]

An unbiased estimator of $\sigma^2$ is given by

\[\label{eq:sigma-unb} \hat{\sigma}^2 = \frac{1}{n-p} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta})\]

where

\[\label{eq:beta-mle} \hat{\beta} = (X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1} y \; .\]
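
As a numerical illustration of \eqref{eq:sigma-unb} and \eqref{eq:beta-mle}, the following Python/NumPy sketch simulates data from \eqref{eq:mlr} and computes both estimates; the dimensions, the AR(1)-style choice of $V$ and all parameter values are illustrative assumptions, not part of the theorem.

```python
import numpy as np

# Illustrative setup -- all values below are assumptions for demonstration,
# not part of the theorem
rng = np.random.default_rng(1)
n, p = 50, 3                                  # sample size, number of regressors
sigma2 = 2.0                                  # true noise variance
beta = np.array([1.0, -0.5, 0.25])            # true regression parameters
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])

# Known covariance structure V, here an AR(1)-style correlation matrix
rho = 0.3
V = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Simulate y ~ N(X beta, sigma^2 V) using the Cholesky factor of V
L = np.linalg.cholesky(V)
y = X @ beta + np.sqrt(sigma2) * (L @ rng.standard_normal(n))

# Maximum likelihood estimate of beta, eq. (beta-mle)
P = np.linalg.inv(V)                          # precision matrix V^{-1}
beta_hat = np.linalg.solve(X.T @ P @ X, X.T @ P @ y)

# Unbiased estimate of sigma^2, eq. (sigma-unb)
r = y - X @ beta_hat
sigma2_hat = (r @ P @ r) / (n - p)
print(beta_hat, sigma2_hat)
```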

Proof: It can be shown that the maximum likelihood estimator of $\sigma^2$

\[\label{eq:resvar-mle} \hat{\sigma}^2_{\mathrm{MLE}} = \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta})\]

is a biased estimator in the sense that

\[\label{eq:resvar-bias} \mathbb{E}\left[ \hat{\sigma}^2_{\mathrm{MLE}} \right] = \frac{n-p}{n} \sigma^2 \; .\]
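
For orientation, here is a compact sketch of why the bias factor equals $(n-p)/n$, assuming $X$ has full column rank (the detailed derivation is the subject of a separate proof). Since $X\hat{\beta}$ is linear in $y$, the residuals satisfy $y - X\hat{\beta} = M\varepsilon$ with the residual-forming matrix $M = I_n - X(X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1}$, which obeys $M^\mathrm{T} V^{-1} M = V^{-1} M$ and $\mathrm{tr}(M) = n - \mathrm{tr}\left( (X^\mathrm{T} V^{-1} X)^{-1} X^\mathrm{T} V^{-1} X \right) = n - p$. Using $\mathbb{E}[\varepsilon^\mathrm{T} A \varepsilon] = \sigma^2 \, \mathrm{tr}(A V)$ for $\varepsilon \sim \mathcal{N}(0, \sigma^2 V)$, it follows that

\[\mathbb{E}\left[ \varepsilon^\mathrm{T} M^\mathrm{T} V^{-1} M \varepsilon \right] = \sigma^2 \, \mathrm{tr}\left( M^\mathrm{T} V^{-1} M V \right) = \sigma^2 \, \mathrm{tr}\left( V^{-1} M V \right) = \sigma^2 \, \mathrm{tr}(M) = (n-p) \, \sigma^2 \; ,\]

and division by $n$ yields \eqref{eq:resvar-bias}.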

From \eqref{eq:resvar-bias}, it follows that

\[\label{eq:resvar-bias-adj} \begin{split} \mathbb{E}\left[ \frac{n}{n-p} \hat{\sigma}^2_{\mathrm{MLE}} \right] &= \frac{n}{n-p} \mathbb{E}\left[ \hat{\sigma}^2_{\mathrm{MLE}} \right] \\ &\overset{\eqref{eq:resvar-bias}}{=} \frac{n}{n-p} \cdot \frac{n-p}{n} \sigma^2 \\ &= \sigma^2 \; , \end{split}\]

such that an unbiased estimator can be constructed as

\[\label{eq:resvar-unb-qed} \begin{split} \hat{\sigma}^2_{\mathrm{unb}} &= \frac{n}{n-p} \hat{\sigma}^2_{\mathrm{MLE}} \\ &\overset{\eqref{eq:resvar-mle}}{=} \frac{n}{n-p} \cdot \frac{1}{n} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \\ &= \frac{1}{n-p} (y-X\hat{\beta})^\mathrm{T} V^{-1} (y-X\hat{\beta}) \; . \end{split}\]
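
Continuing the NumPy sketch above (reusing its illustrative X, V, beta, sigma2, L, P and rng), a small Monte-Carlo experiment makes the bias \eqref{eq:resvar-bias} and the effect of the correction \eqref{eq:resvar-unb-qed} visible:

```python
# Monte-Carlo check, reusing the illustrative X, V, beta, sigma2, L, P
# and rng from the sketch above
n_sims = 10_000
mle = np.empty(n_sims)
for s in range(n_sims):
    y_s = X @ beta + np.sqrt(sigma2) * (L @ rng.standard_normal(n))
    b_s = np.linalg.solve(X.T @ P @ X, X.T @ P @ y_s)
    r_s = y_s - X @ b_s
    mle[s] = (r_s @ P @ r_s) / n              # biased MLE, eq. (resvar-mle)

print(mle.mean(), (n - p) / n * sigma2)       # sample mean vs. biased expectation
print(n / (n - p) * mle.mean(), sigma2)       # corrected mean vs. true sigma^2
```

With the values above, $(n-p)/n \cdot \sigma^2 = 47/50 \cdot 2 = 1.88$, so the first printed pair should agree up to Monte-Carlo error, while the rescaled mean should lie close to $\sigma^2 = 2$.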

Metadata: ID: P439 | shortcut: resvar-unbp | author: JoramSoch | date: 2024-03-08, 10:09.