Proof: Variance of the coefficient of determination under the null hypothesis
Index: The Book of Statistical Proofs ▷ Model Selection ▷ Goodness-of-fit measures ▷ R-squared ▷ Variance under null hypothesis
Metadata: ID: P509 | shortcut: rsq-var | author: JoramSoch | date: 2025-07-04, 13:12.
Theorem: Consider a linear regression model with known design matrix $X = \left[ 1_n, \; X_1 \right] \in \mathbb{R}^{n \times p}$, known covariance structure $V$, unknown regression parameters $\beta$ and unknown noise variance $\sigma^2$:
\[\label{eq:mlr} y = X\beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \sigma^2 V) \; .\]Then, under the null hypothesis that the true coefficient of determination is zero, i.e. $H_0: \; R^2 = 0$, the variance of the coefficient of determination is
\[\label{eq:rsq-var} \mathrm{Var}(R^2) = 2 \cdot \frac{(p-1) \cdot (n-p)}{(n+1) \cdot (n-1)^2} \; .\]Proof: We know that R-squared follows a beta distribution under $H_0$:
\[\label{eq:rsq-dist} R^2 \sim \mathrm{Bet}\left( \frac{p-1}{2}, \frac{n-p}{2} \right) \; .\]Using the variance of the beta distribution
\[\label{eq:beta-var} X \sim \mathrm{Bet}(\alpha, \beta) \quad \Rightarrow \quad \mathrm{Var}(X) = \frac{\alpha \beta}{(\alpha+\beta+1) \cdot (\alpha+\beta)^2} \; ,\]we have:
\[\label{eq:rsq-var-qed} \begin{split} \mathrm{Var}(R^2) &= \frac{\frac{p-1}{2} \cdot \frac{n-p}{2}}{\left( \frac{p-1}{2} + \frac{n-p}{2} + 1 \right) \cdot \left( \frac{p-1}{2} + \frac{n-p}{2} \right)^2} \\ &= \frac{\frac{(p-1) \cdot (n-p)}{4}}{\frac{1}{2} \cdot (n+1) \cdot \frac{(n-1)^2}{4}} \\ &= 2 \cdot \frac{(p-1) \cdot (n-p)}{(n+1) \cdot (n-1)^2} \; . \end{split}\]This completes the proof.
∎
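The closed-form variance can be checked by Monte Carlo simulation. The sketch below is illustrative, not part of the original proof: it assumes $V = I_n$, a hypothetical design with $n = 20$ observations and $p = 4$ predictors (intercept plus three standard-normal regressors), and generates data under $H_0$ (all slope coefficients zero), where $R^2$ is invariant to the intercept, so pure noise suffices.

```python
import numpy as np

# Numerical check of Var(R^2) = 2 (p-1)(n-p) / ((n+1)(n-1)^2) under H0.
# Assumed settings (not from the source): n = 20, p = 4, V = I.
rng = np.random.default_rng(42)
n, p, n_sims = 20, 4, 200_000

# Fixed design: intercept column plus p-1 random regressors
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
H = X @ np.linalg.solve(X.T @ X, X.T)  # hat (projection) matrix

r2 = np.empty(n_sims)
for i in range(n_sims):
    y = rng.standard_normal(n)          # H0: slopes are zero, y is pure noise
    rss = np.sum((y - H @ y) ** 2)      # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)   # total sum of squares
    r2[i] = 1 - rss / tss

var_theory = 2 * (p - 1) * (n - p) / ((n + 1) * (n - 1) ** 2)
print(var_theory)  # ≈ 0.012663
print(r2.var())    # empirical variance, close to the theoretical value
```

The empirical mean of `r2` should likewise be near the beta-distribution mean $(p-1)/(n-1) = 3/19$, consistent with the distributional result used in the proof.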