F-statistic

Definition: Consider two linear regression models with independent observations

\[\label{eq:m0-m1} \begin{split} m_1: \; y &= X\beta + \varepsilon, \; \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \\ m_0: \; y &= X_0\beta_0 + \varepsilon_0, \; \varepsilon_{0i} \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma_0^2) \end{split}\]

operating on identical measured data $y \in \mathbb{R}^{n \times 1}$, but with different design matrices $X \in \mathbb{R}^{n \times p}$ and $X_0 \in \mathbb{R}^{n \times p_0}$ and thus different regression coefficients $\beta \in \mathbb{R}^{p \times 1}$ and $\beta_0 \in \mathbb{R}^{p_0 \times 1}$. Furthermore, let the design matrix of the null model $m_0$ be fully contained in the design matrix of the full model $m_1$, such that $X_1 \in \mathbb{R}^{n \times (p-p_0)}$ holds the additional regressors:

\[\label{eq:X-X0-X1} X = \left[ \begin{array}{cc} X_0 & X_1 \end{array} \right] \; .\]

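Let $\mathrm{RSS}_0$ and $\mathrm{RSS}$ denote the residual sums of squares of the fitted models $m_0$ and $m_1$, respectively. Then, the F-statistic for model comparison is defined as the ratio of two quantities: the reduction in residual sum of squares from the null model to the full model, divided by the number of additional parameters $p-p_0$; and the residual sum of squares of the full model, divided by its residual degrees of freedom $n-p$: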

\[\label{eq:F} F = \frac{(\mathrm{RSS}_0-\mathrm{RSS})/(p-p_0)}{\mathrm{RSS}/(n-p)} \; .\]
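As an illustration, the definition can be evaluated numerically. The following is a minimal sketch, assuming that both models are fitted by ordinary least squares and that NumPy is available; the helper names `rss` and `f_statistic` as well as the toy data are hypothetical and not part of the definition.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares after an ordinary least squares fit of y on X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)

def f_statistic(y, X0, X1):
    """F-statistic comparing the null model (X0) with the full model [X0, X1]."""
    X = np.hstack([X0, X1])   # full design matrix X = [X0 X1]
    n, p = X.shape            # number of observations, full-model parameters
    p0 = X0.shape[1]          # null-model parameters
    rss0 = rss(y, X0)         # RSS of the null model
    rss1 = rss(y, X)          # RSS of the full model
    return ((rss0 - rss1) / (p - p0)) / (rss1 / (n - p))

# Toy data (hypothetical): intercept-only null model vs. intercept plus one regressor.
y = np.array([0.0, 2.0, 3.0, 5.0])
X0 = np.ones((4, 1))
X1 = np.array([[-1.0], [-1.0], [1.0], [1.0]])

# Here RSS_0 = 13 and RSS = 4, so F = (9/1) / (4/2) = 4.5 up to rounding.
F = f_statistic(y, X0, X1)
print(F)
```

The sketch requires $n > p$ and $p > p_0$, since otherwise one of the two divisors is zero or negative.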

Sources:

Wikipedia (2024): "F-test"; in: Wikipedia, the free encyclopedia, retrieved on 2024-03-15; URL: https://en.wikipedia.org/wiki/F-test#Regression_problems.

Metadata: ID: D196 | shortcut: fstat | author: JoramSoch | date: 2024-03-15, 11:31.