The Book of Statistical Proofs ▷ Statistical Models ▷ Univariate normal data ▷ Multiple linear regression ▷ Distribution of OLS estimates, signal and residuals

Theorem: Assume a linear regression model with independent observations

$\label{eq:mlr} y = X\beta + \varepsilon, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)$

and consider estimation using ordinary least squares. Then, the estimated parameters, fitted signal and residuals are distributed as

$\label{eq:mlr-dist} \begin{split} \hat{\beta} &\sim \mathcal{N}\left( \beta, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \\ \hat{y} &\sim \mathcal{N}\left( X \beta, \sigma^2 P \right) \\ \hat{\varepsilon} &\sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) \end{split}$

where $P$ is the projection matrix for ordinary least squares (with $X$ assumed to have full column rank, so that $X^\mathrm{T} X$ is invertible)

$\label{eq:mlr-pmat} P = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \; .$
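The key properties of $P$ used throughout the proof can be checked numerically. The following is a minimal sketch using NumPy; the small design matrix (intercept plus one regressor) is an illustrative choice, not part of the theorem.

```python
import numpy as np

# Hypothetical small design matrix: intercept plus one regressor (n = 5, p = 2).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])

# Projection ("hat") matrix from the equation above.
P = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(P, P.T))    # P is symmetric
print(np.allclose(P @ P, P))  # P is idempotent
print(np.allclose(P @ X, X))  # P leaves the column space of X unchanged
```

All three checks print `True`; symmetry and idempotence of $P$ (and hence of $I_n - P$) are exactly what is needed in step 3) below.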

Proof: We will use the linear transformation theorem for the multivariate normal distribution:

$\label{eq:mvn-ltt} x \sim \mathcal{N}(\mu, \Sigma) \quad \Rightarrow \quad y = Ax + b \sim \mathcal{N}(A\mu + b, A \Sigma A^\mathrm{T}) \; .$
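The transformation theorem can be illustrated by Monte Carlo simulation. In this sketch, the parameters $\mu$, $\Sigma$, $A$, $b$ and the sample size are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for x ~ N(mu, Sigma) and the map y = A x + b.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 1.0],
              [0.0, 3.0]])
b = np.array([0.5, -1.0])

# Draw many samples of x and transform them.
x = rng.multivariate_normal(mu, Sigma, size=200_000)
y = x @ A.T + b

print(y.mean(axis=0))  # ≈ A mu + b
print(np.cov(y.T))     # ≈ A Sigma A^T
```

The empirical mean and covariance of $y$ match $A\mu + b$ and $A \Sigma A^\mathrm{T}$ up to sampling error.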

The distributional assumption in \eqref{eq:mlr} is equivalent to:

$\label{eq:mlr-vect} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n) \; .$

Applying \eqref{eq:mvn-ltt} to \eqref{eq:mlr-vect}, the measured data are distributed as

$\label{eq:y-dist} y \sim \mathcal{N}\left( X \beta, \sigma^2 I_n \right) \; .$

1) The parameter estimates from ordinary least squares are given by

$\label{eq:b-est} \hat{\beta} = (X^\mathrm{T} X)^{-1} X^\mathrm{T} y$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:b-est}, they are distributed as

$\label{eq:b-est-dist} \begin{split} \hat{\beta} &\sim \mathcal{N}\left( \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] I_n \left[ X (X^\mathrm{T} X)^{-1} \right] \right) \\ &\sim \mathcal{N}\left( \beta, \, \sigma^2 (X^\mathrm{T} X)^{-1} \right) \; . \end{split}$
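This distribution of $\hat{\beta}$ can be verified by repeated simulation. In the sketch below, the design matrix, true $\beta$, $\sigma^2$ and number of replications are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: fixed design, true beta and sigma^2 (illustrative values).
n, p = 50, 2
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
beta = np.array([2.0, -1.0])
sigma2 = 0.25

XtX_inv = np.linalg.inv(X.T @ X)

# Repeat the experiment many times and collect the OLS estimates.
reps = 20_000
eps = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
y = X @ beta + eps              # one data set per row
beta_hat = y @ X @ XtX_inv      # each row equals (X'X)^{-1} X' y

print(beta_hat.mean(axis=0))    # ≈ beta (unbiasedness)
print(np.cov(beta_hat.T))       # ≈ sigma^2 (X'X)^{-1}
```

Across replications, the empirical mean of $\hat{\beta}$ approaches $\beta$ and its empirical covariance approaches $\sigma^2 (X^\mathrm{T} X)^{-1}$.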

2) The fitted signal in multiple linear regression is given by

$\label{eq:y-est} \hat{y} = X \hat{\beta} = X (X^\mathrm{T} X)^{-1} X^\mathrm{T} y = P y$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:y-est}, it is distributed as

$\label{eq:y-est-dist} \begin{split} \hat{y} &\sim \mathcal{N}\left( X \beta, \, \sigma^2 X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) \\ &\sim \mathcal{N}\left( X \beta, \, \sigma^2 P \right) \; . \end{split}$
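The identity $\hat{y} = X \hat{\beta} = P y$ underlying this step can be confirmed on a single simulated data set. The data-generating values in this sketch are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# One hypothetical data set (illustrative values).
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(0.0, 0.3, size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# The two expressions for the fitted signal coincide.
print(np.allclose(X @ beta_hat, P @ y))  # True
```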

3) The residuals of the linear regression model are given by

$\label{eq:e-est} \hat{\varepsilon} = y - X \hat{\beta} = \left( I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right) y = \left( I_n - P \right) y$

and thus, by applying \eqref{eq:mvn-ltt} to \eqref{eq:e-est}, they are distributed as

$\label{eq:e-est-dist-s1} \begin{split} \hat{\varepsilon} &\sim \mathcal{N}\left( \left[ I_n - X (X^\mathrm{T} X)^{-1} X^\mathrm{T} \right] X \beta, \, \sigma^2 \left[ I_n - P \right] I_n \left[ I_n - P \right]^\mathrm{T} \right) \\ &\sim \mathcal{N}\left( X \beta - X \beta, \, \sigma^2 \left[ I_n - P \right] \left[ I_n - P \right]^\mathrm{T} \right) \; . \end{split}$

Because $P$ is symmetric and idempotent, so is the residual-forming matrix $I_n - P$; hence $\left[ I_n - P \right] \left[ I_n - P \right]^\mathrm{T} = I_n - P$ and this becomes:

$\label{eq:e-est-dist-s2} \hat{\varepsilon} \sim \mathcal{N}\left( 0, \sigma^2 (I_n - P) \right) \; .$
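The residual distribution can likewise be checked by simulation; note that the covariance $\sigma^2 (I_n - P)$ is singular, since $I_n - P$ has rank $n - p$. The setup below uses illustrative values.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup (illustrative values).
n, p = 20, 2
X = np.column_stack([np.ones(n), np.linspace(-1.0, 1.0, n)])
beta = np.array([0.5, 2.0])
sigma2 = 1.0

P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P               # residual-forming matrix (symmetric)

# Simulate many data sets and form the residuals (I_n - P) y row-wise.
reps = 50_000
y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
eps_hat = y @ M

print(np.abs(eps_hat.mean(axis=0)).max())         # ≈ 0 (zero mean)
print(np.allclose(np.cov(eps_hat.T),
                  sigma2 * M, atol=0.05))         # ≈ sigma^2 (I_n - P)
```

The empirical covariance of the residuals matches $\sigma^2 (I_n - P)$ up to sampling error, and its trace is approximately $\sigma^2 (n - p)$, which is why $\hat{\varepsilon}^\mathrm{T} \hat{\varepsilon} / (n - p)$ is the usual unbiased variance estimator.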

Metadata: ID: P400 | shortcut: mlr-olsdist | author: JoramSoch | date: 2022-12-23, 16:36.