Index: The Book of Statistical Proofs ▷ Statistical Models ▷ Univariate normal data ▷ Simple linear regression ▷ Sums of squares

Theorem: Under ordinary least squares for simple linear regression, total, explained and residual sums of squares are given by

\[\label{eq:slr-sss} \begin{split} \mathrm{TSS} &= (n-1) \, s_y^2 \\ \mathrm{ESS} &= (n-1) \, \frac{s_{xy}^2}{s_x^2} \\ \mathrm{RSS} &= (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \end{split}\]

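where $s_x^2$ and $s_y^2$ are the sample variances of $x$ and $y$ and $s_{xy}$ is the sample covariance between $x$ and $y$, each computed with the denominator $n-1$:

\[\label{eq:slr-svar-scov} s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2, \quad s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2, \quad s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) (y_i - \bar{y}) \; .\]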

Proof: The ordinary least squares parameter estimates are given by

\[\label{eq:slr-ols} \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \quad \text{and} \quad \hat{\beta}_1 = \frac{s_{xy}}{s_x^2} \; .\]
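As an illustration only (not part of the proof), the estimates above can be computed directly from the sample covariance and sample variance. The following Python sketch assumes a small hypothetical data set; the values of `x` and `y` and all variable names are chosen purely for demonstration.

```python
# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample variance of x and sample covariance of x and y (denominator n-1).
s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# OLS estimates for simple linear regression.
beta1_hat = s_xy / s_x2
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)
```

For these data, the slope is about 0.6 and the intercept about 2.2, up to floating-point rounding.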


1) The total sum of squares is defined as

\[\label{eq:TSS} \mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2\]

which can be reformulated as follows:

\[\label{eq:TSS-qed} \begin{split} \mathrm{TSS} &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\ &= (n-1) \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 \\ &= (n-1) s_y^2 \; . \end{split}\]
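The identity for the total sum of squares can be checked numerically. The sketch below assumes hypothetical toy values for `y` and relies on `statistics.variance` from the Python standard library, which computes the sample variance with denominator $n-1$.

```python
from statistics import mean, variance

# Hypothetical toy data, chosen only for illustration.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)
y_bar = mean(y)

# TSS from its definition: sum of squared deviations from the mean.
tss_direct = sum((yi - y_bar) ** 2 for yi in y)

# TSS from the theorem: (n-1) times the sample variance of y.
tss_formula = (n - 1) * variance(y)

print(tss_direct, tss_formula)
```

Both quantities should agree up to floating-point rounding.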


2) The explained sum of squares is defined as

\[\label{eq:ESS} \mathrm{ESS} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 \quad \text{where} \quad \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\]

which, with the OLS parameter estimates, becomes:

\[\label{eq:ESS-qed} \begin{split} \mathrm{ESS} &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 \\ &= \sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1 x_i - \bar{y})^2 \\ &\overset{\eqref{eq:slr-ols}}{=} \sum_{i=1}^n (\bar{y} - \hat{\beta}_1 \bar{x} + \hat{\beta}_1 x_i - \bar{y})^2 \\ &= \sum_{i=1}^n \left( \hat{\beta}_1 (x_i - \bar{x}) \right)^2 \\ &\overset{\eqref{eq:slr-ols}}{=} \sum_{i=1}^n \left( \frac{s_{xy}}{s_x^2} (x_i - \bar{x}) \right)^2 \\ &= \left( \frac{s_{xy}}{s_x^2} \right)^2 \sum_{i=1}^n (x_i - \bar{x})^2 \\ &= \left( \frac{s_{xy}}{s_x^2} \right)^2 (n-1) s_x^2 \\ &= (n-1) \, \frac{s_{xy}^2}{s_x^2} \; . \end{split}\]
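The result for the explained sum of squares can be checked in the same way. This is a minimal sketch on hypothetical toy data: it computes the fitted values from the OLS estimates, sums their squared deviations from $\bar{y}$, and compares the result with $(n-1) \, s_{xy}^2 / s_x^2$.

```python
import math

# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# OLS estimates and fitted values.
beta1_hat = s_xy / s_x2
beta0_hat = y_bar - beta1_hat * x_bar
y_hat = [beta0_hat + beta1_hat * xi for xi in x]

# ESS from its definition versus the closed form from the theorem.
ess_direct = sum((yh - y_bar) ** 2 for yh in y_hat)
ess_formula = (n - 1) * s_xy ** 2 / s_x2

assert math.isclose(ess_direct, ess_formula)
```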


3) The residual sum of squares is defined as

\[\label{eq:RSS} \mathrm{RSS} = \sum_{i=1}^n (y_i - \hat{y}_i)^2 \quad \text{where} \quad \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\]

which, with the OLS parameter estimates, becomes:

\[\label{eq:RSS-qed} \begin{split} \mathrm{RSS} &= \sum_{i=1}^n (y_i - \hat{y}_i)^2 \\ &= \sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \\ &\overset{\eqref{eq:slr-ols}}{=} \sum_{i=1}^n (y_i - \bar{y} + \hat{\beta}_1 \bar{x} - \hat{\beta}_1 x_i)^2 \\ &= \sum_{i=1}^n \left( (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x}) \right)^2 \\ &= \sum_{i=1}^n \left( (y_i - \bar{y})^2 - 2 \hat{\beta}_1 (x_i - \bar{x}) (y_i - \bar{y}) + \hat{\beta}_1^2 (x_i - \bar{x})^2 \right) \\ &= \sum_{i=1}^n (y_i - \bar{y})^2 - 2 \hat{\beta}_1 \sum_{i=1}^n (x_i - \bar{x}) (y_i - \bar{y}) + \hat{\beta}_1^2 \sum_{i=1}^n (x_i - \bar{x})^2 \\ &= (n-1) \, s_y^2 - 2 (n-1) \, \hat{\beta}_1 \, s_{xy} + (n-1) \, \hat{\beta}_1^2 \, s_x^2 \\ &\overset{\eqref{eq:slr-ols}}{=} (n-1) \, s_y^2 - 2 (n-1) \left( \frac{s_{xy}}{s_x^2} \right) s_{xy} + (n-1) \left( \frac{s_{xy}}{s_x^2} \right)^2 s_x^2 \\ &= (n-1) \, s_y^2 - (n-1) \, \frac{s_{xy}^2}{s_x^2} \\ &= (n-1) \left( s_y^2 - \frac{s_{xy}^2}{s_x^2} \right) \; . \end{split}\]
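
A final numerical check, again on hypothetical toy data, compares the residual sum of squares computed from its definition with the closed form. It also illustrates that the three closed-form expressions satisfy $\mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS}$, which can be read off the theorem by adding the expressions for ESS and RSS.

```python
import math

# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_y2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# OLS estimates.
beta1_hat = s_xy / s_x2
beta0_hat = y_bar - beta1_hat * x_bar

# RSS from its definition: sum of squared residuals.
rss_direct = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))

# Closed forms from the theorem.
tss = (n - 1) * s_y2
ess = (n - 1) * s_xy ** 2 / s_x2
rss_formula = (n - 1) * (s_y2 - s_xy ** 2 / s_x2)

assert math.isclose(rss_direct, rss_formula)
assert math.isclose(tss, ess + rss_formula)
```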

Metadata: ID: P284 | shortcut: slr-sss | author: JoramSoch | date: 2021-11-09, 11:34.