Residual variance in terms of sample variance

Theorem: Assume a simple linear regression model with independent observations

\[\label{eq:slr} y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n\]

and consider estimation using ordinary least squares. Then, the residual variance and the sample variance are related via the sample correlation coefficient:

\[\label{eq:slr-vars} \hat{\sigma}^2 = \left( 1 - r_{xy}^2 \right) s_y^2 \; .\]
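
Equivalently, this identity can be rearranged into a decomposition of the total sample variance into an explained and a residual part (a remark that is immediate from \eqref{eq:slr-vars}, though not part of the original statement):

\[s_y^2 = r_{xy}^2 \, s_y^2 + \hat{\sigma}^2 \; ,\]

which is one way to see that the squared sample correlation plays the role of the coefficient of determination in simple linear regression.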

Proof: The residual variance can be expressed in terms of the residual sum of squares, here normalized by n-1 to match the convention used for the sample variances:

\[\label{eq:slr-res} \hat{\sigma}^2 = \frac{1}{n-1} \, \mathrm{RSS}(\hat{\beta}_0,\hat{\beta}_1)\]

and the residual sum of squares for simple linear regression is

\[\label{eq:slr-rss} \mathrm{RSS}(\hat{\beta}_0,\hat{\beta}_1) = (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \; .\]
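
For context, this expression (derived in a separate proof in the book) follows from plugging the ordinary least squares estimates into the definition of the residual sum of squares,

\[\mathrm{RSS}(\hat{\beta}_0,\hat{\beta}_1) = \sum_{i=1}^n \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2 \quad \text{with} \quad \hat{\beta}_1 = \frac{s_{xy}}{s_x^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \; ,\]

and simplifying.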

Combining \eqref{eq:slr-res} and \eqref{eq:slr-rss}, we obtain:

\[\label{eq:slr-vars-s1} \begin{split} \hat{\sigma}^2 &= \frac{1}{n-1} \cdot (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \\ &= s_y^2 - \frac{s_{xy}^2}{s_x^2} \\ &= \left( 1 - \frac{s_{xy}^2}{s_x^2 s_y^2} \right) s_y^2 \\ &= \left( 1 - \left( \frac{s_{xy}}{s_x \, s_y} \right)^2 \right) s_y^2 \; . \end{split}\]

Using the relationship between correlation, covariance and standard deviation

\[\label{eq:corr-cov-std} \mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)} \sqrt{\mathrm{Var}(Y)}}\]

which also holds for sample correlation, sample covariance and sample standard deviation

\[\label{eq:corr-cov-std-samp} r_{xy} = \frac{s_{xy}}{s_x \, s_y} \; ,\]

we get the final result:

\[\label{eq:slr-vars-s2} \hat{\sigma}^2 = \left( 1 - r_{xy}^2 \right) s_y^2 \; .\]
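
As a quick numerical illustration (a minimal sketch, not part of the original proof; it assumes NumPy, and all variable names are illustrative), simulated data from the model \eqref{eq:slr} confirm the identity when the residual variance uses the 1/(n-1) convention of \eqref{eq:slr-res}:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=n)   # simulate the model

s_xy = np.cov(x, y, ddof=1)[0, 1]        # sample covariance
s_x2 = np.var(x, ddof=1)                 # sample variance of x
s_y2 = np.var(y, ddof=1)                 # sample variance of y

b1 = s_xy / s_x2                         # OLS slope estimate
b0 = y.mean() - b1 * x.mean()            # OLS intercept estimate
rss = np.sum((y - (b0 + b1 * x)) ** 2)   # residual sum of squares
sigma_hat2 = rss / (n - 1)               # residual variance, 1/(n-1) convention

r_xy = s_xy / np.sqrt(s_x2 * s_y2)       # sample correlation coefficient
assert np.isclose(r_xy, np.corrcoef(x, y)[0, 1])     # matches NumPy's corrcoef
print(np.isclose(sigma_hat2, (1 - r_xy**2) * s_y2))  # True
```

Because the identity is algebraic rather than statistical, it holds exactly (up to floating-point error) for any data set, not just in expectation.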
Sources:

Metadata: ID: P278 | shortcut: slr-resvar | author: JoramSoch | date: 2021-10-27, 14:37.