
Theorem: Let there be a bivariate normally distributed data set $y = \left\lbrace y_1, \ldots, y_n \right\rbrace$:

\[\label{eq:bvn-data} y_i = \left[ \begin{matrix} y_{i1} \\ y_{i2} \end{matrix} \right] \sim \mathcal{N}\left( \left[ \begin{matrix} \mu_1 \\ \mu_2 \end{matrix} \right], \left[ \begin{matrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{matrix} \right] \right), \quad i = 1, \ldots, n \; .\]

Then, the maximum likelihood estimates of $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$ and $\rho$ are given by

\[\label{eq:bvn-mle} \begin{split} \hat{\mu}_1 &= \frac{1}{n} \sum_{i=1}^n y_{i1} \\ \hat{\mu}_2 &= \frac{1}{n} \sum_{i=1}^n y_{i2} \\ \hat{\sigma}_1^2 &= \frac{1}{n} \sum_{i=1}^n (y_{i1} - \hat{\mu}_1)^2 \\ \hat{\sigma}_2^2 &= \frac{1}{n} \sum_{i=1}^n (y_{i2} - \hat{\mu}_2)^2 \\ \hat{\rho} &= \frac{\frac{1}{n} \sum_{i=1}^n (y_{i1} - \hat{\mu}_1) (y_{i2} - \hat{\mu}_2)}{\sqrt{\hat{\sigma}_1^2} \sqrt{\hat{\sigma}_2^2}} \; . \end{split}\]
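As a numerical illustration (not part of the original theorem), the following Python/NumPy sketch simulates bivariate normal data with hypothetical parameter values and evaluates the five estimators above; note that all of them use the $1/n$ normalization, i.e. `ddof=0` in NumPy terms.

```python
import numpy as np

# hypothetical example parameters (chosen here for illustration only)
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6 * np.sqrt(2.0 * 3.0)],
                  [0.6 * np.sqrt(2.0 * 3.0), 3.0]])
n = 10000
y = rng.multivariate_normal(mu, Sigma, size=n)   # shape (n, 2)

# ML estimates as stated in the theorem (1/n normalization throughout)
mu1_hat, mu2_hat = y.mean(axis=0)
s1_sq_hat = np.mean((y[:, 0] - mu1_hat) ** 2)
s2_sq_hat = np.mean((y[:, 1] - mu2_hat) ** 2)
rho_hat = np.mean((y[:, 0] - mu1_hat) * (y[:, 1] - mu2_hat)) \
          / (np.sqrt(s1_sq_hat) * np.sqrt(s2_sq_hat))

print(mu1_hat, mu2_hat, s1_sq_hat, s2_sq_hat, rho_hat)
```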

Proof: This is a special case of maximum likelihood estimation for multivariate normally distributed data

\[\label{eq:mvn-data} y_i = \left[ \begin{matrix} y_{i1} \\ \vdots \\ y_{ip} \end{matrix} \right] \sim \mathcal{N}\left( \mu, \Sigma \right), \quad i = 1, \ldots, n\]

for which maximum likelihood estimates are given by

\[\label{eq:mvn-mle} \begin{split} \hat{\mu} &= \frac{1}{n} \sum_{i=1}^n y_i \\ \hat{\Sigma} &= \frac{1}{n} \sum_{i=1}^n (y_i - \hat{\mu}) (y_i - \hat{\mu})^\mathrm{T} \; . \end{split}\]
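A minimal sketch of this general case, assuming the data are arranged as an $n \times p$ NumPy array: $\hat{\mu}$ is the sample mean and $\hat{\Sigma}$ is the divide-by-$n$ sample covariance, which `np.cov` reproduces when called with `bias=True`.

```python
import numpy as np

def mvn_mle(Y):
    """ML estimates for multivariate normal data Y of shape (n, p)."""
    mu_hat = Y.mean(axis=0)
    R = Y - mu_hat                       # centered data, shape (n, p)
    Sigma_hat = (R.T @ R) / Y.shape[0]   # note: 1/n, not 1/(n-1)
    return mu_hat, Sigma_hat

# equivalently, np.cov(Y, rowvar=False, bias=True) yields the same Sigma_hat
```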

Applying this to \eqref{eq:bvn-data} by setting $p=2$, we obtain:

\[\label{eq:bvn-mle-s1} \begin{split} \hat{\mu} &= \left[ \begin{matrix} \frac{1}{n} \sum_{i=1}^n y_{i1} \\ \frac{1}{n} \sum_{i=1}^n y_{i2} \end{matrix} \right] \\ \hat{\Sigma} &= \left[ \begin{matrix} \frac{1}{n} \sum_{i=1}^n (y_{i1}-\hat{\mu}_1) (y_{i1}-\hat{\mu}_1) & \frac{1}{n} \sum_{i=1}^n (y_{i1}-\hat{\mu}_1) (y_{i2}-\hat{\mu}_2) \\ \frac{1}{n} \sum_{i=1}^n (y_{i2}-\hat{\mu}_2) (y_{i1}-\hat{\mu}_1) & \frac{1}{n} \sum_{i=1}^n (y_{i2}-\hat{\mu}_2) (y_{i2}-\hat{\mu}_2) \end{matrix} \right] \; . \end{split}\]

Equating this with the estimates expressed in terms of the parameters of the bivariate normal distribution,

\[\label{eq:bvn-mle-s2} \hat{\mu} = \left[ \begin{matrix} \hat{\mu}_1 \\ \hat{\mu}_2 \end{matrix} \right] \quad \text{and} \quad \hat{\Sigma} = \left[ \begin{matrix} \hat{\sigma}_1^2 & \hat{\rho} \hat{\sigma}_1 \hat{\sigma}_2 \\ \hat{\rho} \hat{\sigma}_1 \hat{\sigma}_2 & \hat{\sigma}_2^2 \end{matrix} \right] \; ,\]

we obtain the ML estimates for the means as

\[\label{eq:bvn-mle-mean} \begin{split} \hat{\mu}_1 &= \frac{1}{n} \sum_{i=1}^n y_{i1} \\ \hat{\mu}_2 &= \frac{1}{n} \sum_{i=1}^n y_{i2} \; , \end{split}\]

the ML estimates for the variances as

\[\label{eq:bvn-mle-var} \begin{split} \hat{\sigma}_1^2 &= \frac{1}{n} \sum_{i=1}^n (y_{i1} - \hat{\mu}_1)^2 \\ \hat{\sigma}_2^2 &= \frac{1}{n} \sum_{i=1}^n (y_{i2} - \hat{\mu}_2)^2 \end{split}\]

and the ML estimate for the correlation as:

\[\label{eq:bvn-mle-corr} \begin{split} \hat{\rho} \hat{\sigma}_1 \hat{\sigma}_2 &= \frac{1}{n} \sum_{i=1}^n (y_{i1}-\hat{\mu}_1) (y_{i2}-\hat{\mu}_2) \\ \hat{\rho} &= \frac{\frac{1}{n} \sum_{i=1}^n (y_{i1}-\hat{\mu}_1) (y_{i2}-\hat{\mu}_2)}{\hat{\sigma}_1 \hat{\sigma}_2} \\ \hat{\rho} &= \frac{\frac{1}{n} \sum_{i=1}^n (y_{i1} - \hat{\mu}_1) (y_{i2} - \hat{\mu}_2)}{\sqrt{\hat{\sigma}_1^2} \sqrt{\hat{\sigma}_2^2}} \\ \hat{\rho} &= \frac{\frac{1}{n} \sum_{i=1}^n (y_{i1}-\hat{\mu}_1) (y_{i2}-\hat{\mu}_2)}{\sqrt{\frac{1}{n} \sum_{i=1}^n (y_{i1} - \hat{\mu}_1)^2} \sqrt{\frac{1}{n} \sum_{i=1}^n (y_{i2} - \hat{\mu}_2)^2}} \; . \end{split}\]
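Because the $1/n$ factors in the numerator and denominator of the last line cancel, $\hat{\rho}$ is exactly the Pearson sample correlation coefficient. A quick numerical check of this equivalence, with hypothetical data and variable names chosen here for illustration:

```python
import numpy as np

# simulate some bivariate normal data (parameter values are arbitrary)
y = np.random.default_rng(1).multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], 500)
d1 = y[:, 0] - y[:, 0].mean()
d2 = y[:, 1] - y[:, 1].mean()

# ML estimate of rho; the 1/n factors cancel between numerator and denominator
rho_hat = (d1 @ d2) / np.sqrt((d1 @ d1) * (d2 @ d2))

# agrees with the Pearson sample correlation coefficient
assert np.isclose(rho_hat, np.corrcoef(y[:, 0], y[:, 1])[0, 1])
```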

Together, \eqref{eq:bvn-mle-mean}, \eqref{eq:bvn-mle-var} and \eqref{eq:bvn-mle-corr} constitute the maximum likelihood estimates for bivariate normally distributed data.

Sources:

Metadata: ID: P503 | shortcut: bvn-mle | author: JoramSoch | date: 2025-06-20, 12:16.