Index: The Book of Statistical Proofs ▷ Probability Distributions ▷ Univariate continuous distributions ▷ Normal distribution ▷ Relationship to chi-squared distribution

Theorem: Let X_1, \ldots, X_n be independent random variables, each following a normal distribution with mean \mu and variance \sigma^2:

\label{eq:norm} X_i \sim \mathcal{N}(\mu, \sigma^2) \quad \text{for} \quad i = 1, \ldots, n \; .

Define the sample mean

\label{eq:mean-samp} \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i

and the unbiased sample variance

\label{eq:var-samp} s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 \; .

Then, the sample variance, scaled by (n-1)/\sigma^2, follows a chi-squared distribution with n-1 degrees of freedom:

\label{eq:norm-chi2} V = (n-1) \, \frac{s^2}{\sigma^2} \sim \chi^2(n-1) \; .
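
The statement can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the proof: the values of n, \mu and \sigma, the seed and the helper name `simulate_v` are illustrative assumptions. It uses the fact that a chi-squared variable with k degrees of freedom has mean k and variance 2k.

```python
import random
import statistics

def simulate_v(n, mu, sigma, rng):
    """Draw one normal sample of size n and return V = (n - 1) * s^2 / sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(xs)  # unbiased sample variance (divides by n - 1)
    return (n - 1) * s2 / sigma**2

rng = random.Random(0)      # fixed seed (arbitrary choice) for reproducibility
n, mu, sigma = 6, 2.0, 3.0  # illustrative values, not taken from the source
vs = [simulate_v(n, mu, sigma, rng) for _ in range(20000)]

v_mean = statistics.fmean(vs)
v_var = statistics.variance(vs)

# For chi^2(n - 1), the mean is n - 1 and the variance is 2 (n - 1).
print("empirical mean of V:", v_mean, "theoretical:", n - 1)
print("empirical variance of V:", v_var, "theoretical:", 2 * (n - 1))
```

Matching the first two moments is only a plausibility check; it does not by itself establish the full distribution.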

Proof: Consider the random variable U_i defined as

\label{eq:Ui} U_i = \frac{X_i - \mu}{\sigma}

which follows a standard normal distribution

\label{eq:norm-snorm} U_i \sim \mathcal{N}(0,1) \; .
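
As a quick illustration of the standardization step, the sketch below draws normal samples, applies U_i = (X_i - \mu)/\sigma and inspects the empirical mean and variance. The parameter values and the seed are assumptions made for this example only.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed (arbitrary choice) for reproducibility
mu, sigma = 2.0, 3.0    # illustrative values, not taken from the source

xs = [rng.gauss(mu, sigma) for _ in range(50000)]
us = [(x - mu) / sigma for x in xs]  # U_i = (X_i - mu) / sigma

u_mean = statistics.fmean(us)
u_var = statistics.pvariance(us)

print("mean of U:", u_mean)     # should be close to 0
print("variance of U:", u_var)  # should be close to 1
```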

Then, the sum of squared random variables U_i can be rewritten as

\label{eq:sum-Ui2-s1} \begin{split} \sum_{i=1}^{n} U_i^2 &= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \\ &= \sum_{i=1}^{n} \left( \frac{(X_i - \bar{X}) + (\bar{X} - \mu)}{\sigma} \right)^2 \\ &= \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} + \sum_{i=1}^{n} \frac{(\bar{X} - \mu)^2}{\sigma^2} + 2 \sum_{i=1}^{n} \frac{(X_i - \bar{X})(\bar{X} - \mu)}{\sigma^2} \\ &= \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 + 2\frac{(\bar{X} - \mu)}{\sigma^2} \sum_{i=1}^{n} (X_i - \bar{X}) \; . \end{split}

Because the following sum is zero

\label{eq:Xi-Xb} \begin{split} \sum_{i=1}^{n} (X_i - \bar{X}) &= \sum_{i=1}^{n} X_i - n \bar{X} \\ &= \sum_{i=1}^{n} X_i - n \cdot \frac{1}{n} \sum_{i=1}^{n} X_i \\ &= \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} X_i \\ &= 0 \; , \end{split}

the third term disappears, i.e.

\label{eq:sum-Ui2-s2} \sum_{i=1}^{n} U_i^2 = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 \; .
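
The decomposition holds for any data, not only for normal samples, so it can be verified with exact arithmetic. The sketch below uses arbitrary illustrative numbers and Python's `Fraction` type to avoid rounding; the helper name `decomposition_terms` is hypothetical.

```python
from fractions import Fraction

def decomposition_terms(xs, mu, sigma):
    """Return (lhs, q1, q2) for the identity sum_i U_i^2 = q1 + q2."""
    n = len(xs)
    xbar = sum(xs) / n
    lhs = sum(((x - mu) / sigma) ** 2 for x in xs)   # sum of squared U_i
    q1 = sum(((x - xbar) / sigma) ** 2 for x in xs)  # first term
    q2 = n * ((xbar - mu) / sigma) ** 2              # second term: n identical summands
    return lhs, q1, q2

# Arbitrary illustrative data; Fractions keep every step exact.
xs = [Fraction(v) for v in (3, -1, 4, 1, 5)]
mu, sigma = Fraction(2), Fraction(3)

xbar = sum(xs) / len(xs)
deviation_sum = sum(x - xbar for x in xs)  # factor in the vanishing cross term
lhs, q1, q2 = decomposition_terms(xs, mu, sigma)

print("sum of deviations:", deviation_sum)
print("lhs:", lhs, "q1 + q2:", q1 + q2)
```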

Cochran’s theorem states that, if a sum of squared standard normal random variables can be written as a sum of quadratic forms

\label{eq:cochran-p1} \begin{split} \sum_{i=1}^{n} U_i^2 = \sum_{j=1}^{m} Q_j \quad &\text{where} \quad Q_j = \sum_{k=1}^{n} \sum_{l=1}^{n} U_k B^{(j)}_{kl} U_l \\ &\text{with} \quad \sum_{j=1}^{m} B^{(j)} = I_n \\ &\text{and} \quad r_j = \mathrm{rank}(B^{(j)}) \; , \end{split}

then, provided that the ranks add up to n, i.e. r_1 + \ldots + r_m = n, the terms Q_j are independent and each term Q_j follows a chi-squared distribution with r_j degrees of freedom:

\label{eq:cochran-p2} Q_j \sim \chi^2(r_j) \; .
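
As a small worked illustration (an added example, not part of the original proof), consider the case n = 2 and m = 2:

```latex
\begin{split}
U_1^2 + U_2^2 &= \underbrace{\frac{(U_1 - U_2)^2}{2}}_{Q_1} + \underbrace{\frac{(U_1 + U_2)^2}{2}}_{Q_2} \\
B^{(1)} &= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} , \quad
B^{(2)} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} , \quad
B^{(1)} + B^{(2)} = I_2 \; .
\end{split}
```

Both matrices have rank 1 and the ranks add up to 2, so the theorem gives that Q_1 and Q_2 are independent and each follows \chi^2(1). This agrees with the direct observation that (U_1 - U_2)/\sqrt{2} and (U_1 + U_2)/\sqrt{2} are independent standard normal random variables.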

We observe that \eqref{eq:sum-Ui2-s2} can be represented as

\label{eq:sum-Ui2-s3} \begin{split} \sum_{i=1}^{n} U_i^2 &= \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 \\ = Q_1 + Q_2 &= \sum_{i=1}^{n} \left( U_i - \frac{1}{n} \sum_{j=1}^n U_j \right)^2 + \frac{1}{n} \left( \sum_{i=1}^{n} U_i \right)^2 \end{split}

where, with the n \times n matrix of ones J_n, the matrices B^{(j)} are

\label{eq:sum-Ui2-s3-Bj} B^{(1)} = I_n - \frac{J_n}{n} \quad \text{and} \quad B^{(2)} = \frac{J_n}{n} \; .
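
The properties of these two matrices can be checked for a concrete n. The sketch below is an illustration under assumed values (n = 4 and an arbitrary vector u); the helpers `rank` and `quad_form` are written for this example and use exact fractions so that the rank computation is not affected by rounding.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix of Fractions via Gaussian elimination."""
    rows = [row[:] for row in matrix]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def quad_form(u, b):
    """Quadratic form sum_k sum_l u_k * b_kl * u_l."""
    return sum(u[k] * b[k][l] * u[l] for k in range(len(u)) for l in range(len(u)))

n = 4  # illustrative size, not taken from the source
identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
b2 = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]             # J_n / n
b1 = [[identity[i][j] - b2[i][j] for j in range(n)] for i in range(n)]  # I_n - J_n / n
b_sum = [[b1[i][j] + b2[i][j] for j in range(n)] for i in range(n)]

u = [Fraction(v) for v in (1, -2, 3, 0)]  # arbitrary test vector
ubar = sum(u) / n

print("B1 + B2 == I_n:", b_sum == identity)
print("rank(B1):", rank(b1), "rank(B2):", rank(b2))
print("Q1:", quad_form(u, b1), "direct:", sum((x - ubar) ** 2 for x in u))
print("Q2:", quad_form(u, b2), "direct:", sum(u) ** 2 / n)
```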

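Because all columns of B^{(2)} are identical and non-zero, it has rank r_2 = 1. Because the n columns of B^{(1)} add up to zero while any n-1 of them are linearly independent, it has rank r_1 = n-1. Both matrices are symmetric and r_1 + r_2 = n. Thus, the conditions of Cochran’s theorem are met and the quadratic form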

\label{eq:Q1} Q_1 = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 = (n-1) \, \frac{1}{\sigma^2} \, \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 = (n-1) \, \frac{s^2}{\sigma^2}

follows a chi-squared distribution with n-1 degrees of freedom:

\label{eq:norm-chi2-qed} (n-1) \, \frac{s^2}{\sigma^2} \sim \chi^2(n-1) \; .
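
Cochran's theorem also implies that Q_1 and Q_2 are independent and that Q_2 follows \chi^2(1). The simulation sketch below probes these consequences under assumed parameter values; the helper names `q_terms` and `pearson` are hypothetical. A sample correlation near zero is consistent with independence but does not prove it.

```python
import math
import random
import statistics

def q_terms(n, mu, sigma, rng):
    """Return (Q_1, Q_2) for one simulated normal sample of size n."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(xs)
    q1 = sum(((x - xbar) / sigma) ** 2 for x in xs)
    q2 = n * ((xbar - mu) / sigma) ** 2
    return q1, q2

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equally long lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - ma) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (norm_a * norm_b)

rng = random.Random(2)      # fixed seed (arbitrary choice) for reproducibility
n, mu, sigma = 6, 2.0, 3.0  # illustrative values, not taken from the source
pairs = [q_terms(n, mu, sigma, rng) for _ in range(20000)]
q1s = [p[0] for p in pairs]
q2s = [p[1] for p in pairs]

mean_q1 = statistics.fmean(q1s)
mean_q2 = statistics.fmean(q2s)
corr = pearson(q1s, q2s)

print("mean of Q1:", mean_q1, "theoretical:", n - 1)
print("mean of Q2:", mean_q2, "theoretical:", 1)
print("correlation of Q1 and Q2:", corr)  # should be close to 0
```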

Metadata: ID: P233 | shortcut: norm-chi2 | author: JoramSoch | date: 2021-05-20, 10:18.