Index: The Book of Statistical Proofs ▷ Statistical Models ▷ Univariate normal data ▷ Simple linear regression ▷ F-test for model comparison

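Theorem: Consider a simple linear regression model with independent observations

\label{eq:slr} y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n

and the parameter estimates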

\label{eq:slr-est} \begin{split} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ \hat{\beta}_1 &= \frac{s_{xy}}{s_x^2} \\ \hat{\sigma}^2 &= \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \; . \end{split}

Here, \bar{x} and \bar{y} are the sample means of the x_i and y_i, s_{xy} is the sample covariance of the x_i and y_i, and s_x^2 is the sample variance of the x_i.

Then, the test statistic

\label{eq:slr-f-comp} F = \frac{s_{xy}^2/s_x^2}{\hat{\sigma}^2/(n-1)}

follows an F-distribution

\label{eq:slr-f-comp-dist} F \sim \mathrm{F}(1, n-2)

under the null hypothesis that the slope parameter is zero:

\label{eq:slr-f-comp-h0} H_0: \; \beta_1 = 0 \; .
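
As a minimal numerical sketch of the statement (not part of the original proof), the statistic in \eqref{eq:slr-f-comp} can be computed directly from the sample moments. The function name `slr_f_statistic` and the toy data are illustrative assumptions; only the Python standard library is used.

```python
from statistics import mean

def slr_f_statistic(x, y):
    """F = (s_xy^2 / s_x^2) / (sigma2_hat / (n-1)) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # sample variance and covariance, both with denominator n-1
    s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    # parameter estimates as given in the theorem
    b1 = s_xy / s_x2
    b0 = y_bar - b1 * x_bar
    sigma2_hat = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return (s_xy ** 2 / s_x2) / (sigma2_hat / (n - 1))

# hypothetical toy data, chosen so the moments are easy to check by hand
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
F = slr_f_statistic(x, y)
print(F)
```

For this toy data set, s_{xy} = 2, s_x^2 = 2.5 and \hat{\sigma}^2 = 1.2, so that F = 1.6/0.3 \approx 5.33, which under H_0 would be compared against an \mathrm{F}(1, 3) distribution.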

Proof: In multiple linear regression, the contrast-based F-test is based on the F-statistic

\label{eq:mlr-f} F = \hat{\beta}^\mathrm{T} C \left( \hat{\sigma}^2 C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} C^\mathrm{T} \hat{\beta} / q

which follows an F-distribution under the null hypothesis that the product of the contrast matrix C \in \mathbb{R}^{p \times q} and the regression coefficients is a zero vector:

\label{eq:mlr-f-dist-h0} F \sim \mathrm{F}(q, n-p), \quad \text{if} \quad C^\mathrm{T} \beta = 0_q = \left[ 0, \ldots, 0 \right]^\mathrm{T} \; .

Since simple linear regression is a special case of multiple linear regression, comparing the regression model against a model without the slope parameter corresponds to the following quantities:

\label{eq:slr-mlr} \beta = \left[ \begin{matrix} \beta_0 \\ \beta_1 \end{matrix} \right], \; \hat{\beta} = \left[ \begin{matrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{matrix} \right], \; C = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right], \; X = \left[ \begin{matrix} 1_n & x \end{matrix} \right], \; V = I_n \; .

Thus, we have the null hypothesis

\label{eq:slr-f-comp-h0-qed} H_0: \; C^\mathrm{T} \beta = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right]^\mathrm{T} \left[ \begin{matrix} \beta_0 \\ \beta_1 \end{matrix} \right] = \beta_1 = 0

and the contrast estimate

\label{eq:slr-f-comp-CTb} C^\mathrm{T} \hat{\beta} = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right]^\mathrm{T} \left[ \begin{matrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{matrix} \right] = \hat{\beta}_1 = \frac{s_{xy}}{s_x^2} \; .

Moreover, when deriving the distribution of ordinary least squares parameter estimates for simple linear regression with independent observations, we have identified the matrix (X^\mathrm{T} X)^{-1}, which equals the parameter covariance matrix up to the factor \sigma^2, as

\label{eq:slr-XTX-inv} (X^\mathrm{T} X)^{-1} = \frac{1}{(n-1) \, s_x^2} \cdot \left[ \begin{matrix} x^\mathrm{T}x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \; .
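
The closed form in \eqref{eq:slr-XTX-inv} can be sanity-checked numerically. The following sketch, which assumes a small hypothetical regressor vector `x`, inverts the 2×2 matrix X^T X directly and compares the result with the closed-form expression.

```python
# hypothetical regressor values, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)  # sample variance
xtx = sum(xi * xi for xi in x)                        # x^T x

# X^T X = [[n, n*x_bar], [n*x_bar, x^T x]] for X = [1_n, x]; invert it directly
a, b, d = float(n), n * x_bar, xtx
det = a * d - b * b
inv_direct = [[d / det, -b / det], [-b / det, a / det]]

# closed form: 1/((n-1) s_x^2) * [[x^T x / n, -x_bar], [-x_bar, 1]]
scale = 1.0 / ((n - 1) * s_x2)
inv_closed = [[scale * xtx / n, -scale * x_bar], [-scale * x_bar, scale]]

print(inv_direct)
print(inv_closed)
```

Both matrices agree because the determinant of X^T X equals n (n-1) s_x^2.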

Plugging \eqref{eq:slr-mlr}, \eqref{eq:slr-f-comp-CTb}, \eqref{eq:slr-XTX-inv} and \eqref{eq:slr-est} into \eqref{eq:mlr-f}, the test statistic becomes

\label{eq:slr-f-comp-qed} \begin{split} F &= \hat{\beta}^\mathrm{T} C \left( \hat{\sigma}^2 C^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} C \right)^{-1} C^\mathrm{T} \hat{\beta} / q \\ &= \left( \frac{s_{xy}}{s_x^2} \right) \left( \hat{\sigma}^2 \left[ \begin{matrix} 0 & 1 \end{matrix} \right] \left( \frac{1}{(n-1) \, s_x^2} \cdot \left[ \begin{matrix} x^\mathrm{T}x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \right) \left[ \begin{matrix} 0 & 1 \end{matrix} \right]^\mathrm{T} \right)^{-1} \left( \frac{s_{xy}}{s_x^2} \right) / 1 \\ &= \frac{s_{xy}^2/(s_x^2)^2}{\hat{\sigma}^2/((n-1) \, s_x^2)} \\ &= \frac{s_{xy}^2/s_x^2}{\hat{\sigma}^2/(n-1)} \; . \end{split}
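
The simplification above can also be checked numerically. This sketch evaluates the contrast-based statistic \eqref{eq:mlr-f} with C = [0, 1]^T and V = I_n, then compares it with the closed form; the toy data and all variable names are assumptions made for illustration.

```python
# hypothetical toy data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n, p, q = len(x), 2, 1
c = [0.0, 1.0]  # contrast vector selecting the slope

# entries of X^T X and X^T y for the design matrix X = [1_n, x]
sum_x, sum_xx = sum(x), sum(xi * xi for xi in x)
sum_y, sum_xy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
det = n * sum_xx - sum_x * sum_x
inv = [[sum_xx / det, -sum_x / det], [-sum_x / det, n / det]]  # (X^T X)^{-1}

# ordinary least squares estimates and residual variance with n - p degrees of freedom
beta_hat = [inv[0][0] * sum_y + inv[0][1] * sum_xy,
            inv[1][0] * sum_y + inv[1][1] * sum_xy]
sigma2_hat = sum((yi - beta_hat[0] - beta_hat[1] * xi) ** 2
                 for xi, yi in zip(x, y)) / (n - p)

# general form: (C^T b)^T (sigma2_hat * C^T (X^T X)^{-1} C)^{-1} (C^T b) / q
ctb = sum(ci * bi for ci, bi in zip(c, beta_hat))
ct_inv_c = sum(c[i] * inv[i][j] * c[j] for i in range(2) for j in range(2))
F_general = ctb * (1.0 / (sigma2_hat * ct_inv_c)) * ctb / q

# closed form: (s_xy^2 / s_x^2) / (sigma2_hat / (n - 1))
x_bar, y_bar = sum_x / n, sum_y / n
s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
F_closed = (s_xy ** 2 / s_x2) / (sigma2_hat / (n - 1))

print(F_general, F_closed)
```

Because q = 1, the inner matrix inverse reduces to a scalar reciprocal, which is why no general matrix routine is needed here.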

Finally, because C = \left[ \begin{matrix} 0 & 1 \end{matrix} \right]^\mathrm{T} \in \mathbb{R}^{2 \times 1} and X = \left[ \begin{matrix} 1_n & x \end{matrix} \right] \in \mathbb{R}^{n \times 2}, we have p = 2 and q = 1, such that from \eqref{eq:mlr-f-dist-h0} it follows that

\label{eq:slr-f-comp-dist-qed} \begin{split} F \sim \mathrm{F}(1, n-2), \quad \text{if} \quad \beta_1 = 0 \; . \end{split}
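
A small Monte Carlo sketch can illustrate the distributional claim: if data are simulated with \beta_1 = 0, the statistic should exceed the 95% quantile of \mathrm{F}(1, n-2) in roughly 5% of cases. The sample size, the parameter values, the seed and the hard-coded critical value (an assumed tabulated 95% quantile of \mathrm{F}(1, 10), about 4.9646) are illustrative choices, not part of the proof.

```python
import random

def slr_f_statistic(x, y):
    """F = (s_xy^2 / s_x^2) / (sigma2_hat / (n-1)) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    b1 = s_xy / s_x2
    b0 = y_bar - b1 * x_bar
    sigma2_hat = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return (s_xy ** 2 / s_x2) / (sigma2_hat / (n - 1))

random.seed(0)
n = 12
x = [float(i) for i in range(n)]  # fixed regressor values
F_CRIT = 4.9646                   # assumed tabulated 95% quantile of F(1, 10)
n_sim = 20000

rejections = 0
for _ in range(n_sim):
    # data generated under H_0: beta_0 = 1, beta_1 = 0, sigma = 2
    y = [1.0 + random.gauss(0.0, 2.0) for _ in x]
    if slr_f_statistic(x, y) > F_CRIT:
        rejections += 1

rate = rejections / n_sim
print(rate)  # expected to be close to 0.05 under H_0
```

The rejection rate fluctuates around the nominal level from run to run; with 20000 simulations its standard error is roughly 0.0015.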

Metadata: ID: P453 | shortcut: slr-fcomp | author: JoramSoch | date: 2024-05-24, 13:19.