Index: The Book of Statistical Proofs ▷ Statistical Models ▷ Univariate normal data ▷ Simple linear regression ▷ t-test for slope parameter

Theorem: Consider a simple linear regression model with independent observations

\[\label{eq:slr} y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \; \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \; i = 1,\ldots,n \; ,\]

and the parameter estimates

\[\label{eq:slr-est} \begin{split} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x} \\ \hat{\beta}_1 &= \frac{s_{xy}}{s_x^2} \\ \hat{\sigma}^2 &= \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 \end{split}\]

where $\bar{x}$ and $\bar{y}$ are the sample means of the $x_i$ and $y_i$, $s_{xy}$ is the sample covariance of the $x_i$ and $y_i$, and $s_x^2$ is the sample variance of the $x_i$.
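As a numerical illustration (not part of the theorem itself), these estimates might be computed in Python as follows; the simulated data and all variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(0, 2, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)   # beta_0 = 2, beta_1 = 0.5, sigma^2 = 1

x_bar, y_bar = x.mean(), y.mean()
s_xy = np.cov(x, y)[0, 1]                 # sample covariance (n-1 convention)
s_x2 = x.var(ddof=1)                      # sample variance (n-1 convention)

# parameter estimates as given in the theorem
beta1_hat = s_xy / s_x2
beta0_hat = y_bar - beta1_hat * x_bar
sigma2_hat = np.sum((y - beta0_hat - beta1_hat * x) ** 2) / (n - 2)
```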

Then, the test statistic

\[\label{eq:slr-t-slo} t_1 = \frac{s_{xy}/s_x^2}{\sqrt{\hat{\sigma}^2 \; \sigma_1}}\]

with $\sigma_1$ equal to the second diagonal element of the parameter covariance matrix

\[\label{eq:slr-t-slo-sig} \sigma_1 = \frac{1}{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}\]

follows a t-distribution

\[\label{eq:slr-t-slo-dist} t_1 \sim \mathrm{t}(n-2)\]

under the null hypothesis that the slope parameter is zero:

\[\label{eq:slr-t-slo-h0} H_0: \; \beta_1 = 0 \; .\]
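The claimed null distribution can be checked by simulation. The following Python sketch (our construction, with illustrative parameter choices) repeatedly generates data under $H_0$, computes $t_1$ from the formulas above, and compares the resulting statistics against $\mathrm{t}(n-2)$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 20, 10000
t1 = np.empty(reps)

for r in range(reps):
    x = rng.normal(0, 2, n)
    y = 1.0 + rng.normal(0, 1, n)             # data generated under H0: beta_1 = 0
    b1 = np.cov(x, y)[0, 1] / x.var(ddof=1)   # slope estimate s_xy / s_x^2
    b0 = y.mean() - b1 * x.mean()
    sig2 = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    sig1 = 1.0 / np.sum((x - x.mean()) ** 2)
    t1[r] = b1 / np.sqrt(sig2 * sig1)

# Kolmogorov-Smirnov test against t(n-2); a large p-value indicates no misfit
print(stats.kstest(t1, 't', args=(n - 2,)))
```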

Proof: In multiple linear regression, the contrast-based t-test uses the t-statistic

\[\label{eq:mlr-t} t = \frac{c^\mathrm{T} \hat{\beta}}{\sqrt{\hat{\sigma}^2 c^\mathrm{T} (X^\mathrm{T} V^{-1} X)^{-1} c}}\]

which follows a t-distribution under the null hypothesis that the scalar product of the contrast vector and the regression coefficients is zero:

\[\label{eq:mlr-t-dist-h0} t \sim \mathrm{t}(n-p), \quad \text{if} \quad c^\mathrm{T} \beta = 0 \; .\]
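For concreteness, a minimal Python sketch of this contrast-based t-statistic might look as follows, assuming the covariance structure $V$ is known; the function name `contrast_t` is ours, not part of the cited result:

```python
import numpy as np

def contrast_t(y, X, c, V=None):
    """t-statistic for H0: c^T beta = 0 in y = X beta + eps, eps ~ N(0, sigma^2 V)."""
    n, p = X.shape
    V = np.eye(n) if V is None else V
    Vinv = np.linalg.inv(V)
    covb = np.linalg.inv(X.T @ Vinv @ X)       # (X^T V^-1 X)^-1
    b = covb @ X.T @ Vinv @ y                  # weighted least squares estimate
    resid = y - X @ b
    sig2 = (resid @ Vinv @ resid) / (n - p)    # sigma^2 estimate with n-p df
    t = (c @ b) / np.sqrt(sig2 * (c @ covb @ c))
    return t, n - p                            # statistic and degrees of freedom
```

Under the null hypothesis, the returned statistic is a draw from $\mathrm{t}(n-p)$.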

Since simple linear regression is a special case of multiple linear regression, in the present case we have the following quantities:

\[\label{eq:slr-mlr} \beta = \left[ \begin{matrix} \beta_0 \\ \beta_1 \end{matrix} \right], \; \hat{\beta} = \left[ \begin{matrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{matrix} \right], \; c_1 = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right], \; X = \left[ \begin{matrix} 1_n & x \end{matrix} \right], \; V = I_n \; .\]

Thus, we have the null hypothesis

\[\label{eq:slr-t-slo-h0-qed} H_0: \; c_1^\mathrm{T} \beta = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right]^\mathrm{T} \left[ \begin{matrix} \beta_0 \\ \beta_1 \end{matrix} \right] = \beta_1 = 0\]

and the contrast estimate

\[\label{eq:slr-t-slo-cTb} c_1^\mathrm{T} \hat{\beta} = \left[ \begin{matrix} 0 \\ 1 \end{matrix} \right]^\mathrm{T} \left[ \begin{matrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{matrix} \right] = \hat{\beta}_1 = \frac{s_{xy}}{s_x^2} \; .\]
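This identity between the OLS slope estimate and the ratio $s_{xy}/s_x^2$ can be confirmed numerically, e.g. with the following sketch (simulated data, names ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 25
x = rng.normal(0, 2, n)
y = 1.0 + 0.3 * x + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS estimate of (beta_0, beta_1)
print(np.isclose(beta_hat[1], np.cov(x, y)[0, 1] / x.var(ddof=1)))  # True
```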

Moreover, when deriving the distribution of ordinary least squares parameter estimates for simple linear regression with independent observations, we have identified the parameter covariance matrix as

\[\label{eq:slr-XTX-inv} (X^\mathrm{T} X)^{-1} = \frac{1}{(n-1) \, s_x^2} \cdot \left[ \begin{matrix} x^\mathrm{T}x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \; .\]
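This closed form is easily verified against a direct matrix inversion; the following sketch (our illustration) compares the two on simulated predictor values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
x = rng.normal(0, 2, n)
X = np.column_stack([np.ones(n), x])

direct = np.linalg.inv(X.T @ X)                   # numerical inverse of X^T X
closed = np.array([[x @ x / n, -x.mean()],        # closed form from the equation
                   [-x.mean(), 1.0]]) / ((n - 1) * x.var(ddof=1))

print(np.allclose(direct, closed))  # True
```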

Plugging \eqref{eq:slr-mlr}, \eqref{eq:slr-t-slo-cTb}, \eqref{eq:slr-XTX-inv} and \eqref{eq:slr-est} into \eqref{eq:mlr-t}, the test statistic becomes

\[\label{eq:slr-t-slo-qed} \begin{split} t_1 &= \frac{c_1^\mathrm{T} \hat{\beta}}{\sqrt{\hat{\sigma}^2 \; c_1^\mathrm{T} (X^\mathrm{T} X)^{-1} c_1}} \\ &= \frac{\left[ \begin{matrix} 0 & 1 \end{matrix} \right] \left[ \begin{matrix} \hat{\beta}_0 & \hat{\beta}_1 \end{matrix} \right]^\mathrm{T}}{\sqrt{\hat{\sigma}^2 \left[ \begin{matrix} 0 & 1 \end{matrix} \right] (X^\mathrm{T} X)^{-1} \left[ \begin{matrix} 0 & 1 \end{matrix} \right]^\mathrm{T}}} \\ &= \frac{\left[ \begin{matrix} 0 & 1 \end{matrix} \right] \left[ \begin{matrix} \hat{\beta}_0 & \hat{\beta}_1 \end{matrix} \right]^\mathrm{T}}{\sqrt{\hat{\sigma}^2 \left[ \begin{matrix} 0 & 1 \end{matrix} \right] \left( \frac{1}{(n-1) \, s_x^2} \cdot \left[ \begin{matrix} x^\mathrm{T}x/n & -\bar{x} \\ -\bar{x} & 1 \end{matrix} \right] \right) \left[ \begin{matrix} 0 & 1 \end{matrix} \right]^\mathrm{T}}} \\ &= \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 \left( \frac{1}{(n-1) \, s_x^2} \right)}} \\ &= \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}} \\ &= \frac{s_{xy}/s_x^2}{\sqrt{\hat{\sigma}^2 \; \sigma_1}} \; . \end{split}\]
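Putting the pieces together, one can check numerically that the contrast-based statistic from \eqref{eq:mlr-t} coincides with the closed form \eqref{eq:slr-t-slo} and with the slope t-value implied by `scipy.stats.linregress` (a sketch with simulated data; variable names are ours):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 30
x = rng.normal(0, 2, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1, n)

# contrast-based statistic with c_1 = (0, 1)^T
X = np.column_stack([np.ones(n), x])
c1 = np.array([0.0, 1.0])
covb = np.linalg.inv(X.T @ X)
b = covb @ X.T @ y
sig2 = np.sum((y - X @ b) ** 2) / (n - 2)
t_contrast = (c1 @ b) / np.sqrt(sig2 * (c1 @ covb @ c1))

# closed-form version from the theorem
s_xy = np.cov(x, y)[0, 1]
sig1 = 1.0 / np.sum((x - x.mean()) ** 2)
t_closed = (s_xy / x.var(ddof=1)) / np.sqrt(sig2 * sig1)

res = stats.linregress(x, y)
print(t_contrast, t_closed, res.slope / res.stderr)  # all three agree
```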

Finally, because $X = \left[ \begin{matrix} 1_n & x \end{matrix} \right]$ is an $n \times 2$ matrix, we have $p = 2$, such that from \eqref{eq:mlr-t-dist-h0} it follows that

\[\label{eq:slr-t-slo-dist-qed} t_1 \sim \mathrm{t}(n-2), \quad \text{if} \quad \beta_1 = 0 \; .\]

Metadata: ID: P452 | shortcut: slr-tslo | author: JoramSoch | date: 2024-05-17, 12:33.