The Book of Statistical Proofs ▷ General Theorems ▷ Estimation theory ▷ Interval estimates ▷ Construction of confidence intervals using Wilks' theorem

Theorem: Let $m$ be a generative model for measured data $y$ with model parameters $\theta \in \Theta$, consisting of a parameter of interest $\phi \in \Phi$ and nuisance parameters $\lambda \in \Lambda$:

$\label{eq:mod-par} m: p(y|\theta) = \mathcal{D}(y; \theta), \quad \theta = \left\lbrace \phi, \lambda \right\rbrace \; .$

Further, let $\hat{\theta}$ be an estimate of $\theta$, obtained using maximum likelihood estimation:

$\label{eq:theta-mle} \hat{\theta} = \operatorname*{arg\,max}_{\theta} \log p(y|\theta), \quad \hat{\theta} = \left\lbrace \hat{\phi}, \hat{\lambda} \right\rbrace \; .$

Then, an asymptotic confidence interval for $\phi$ is given by

$\label{eq:ci-wilks} \mathrm{CI}_{1-\alpha}(\hat{\phi}) = \left\lbrace \phi \, \vert \, \log p(y|\phi,\hat{\lambda}) \geq \log p(y|\hat{\phi},\hat{\lambda}) - \frac{1}{2} \chi^2_{1,1-\alpha} \right\rbrace$

where $1-\alpha$ is the confidence level and $\chi^2_{1,1-\alpha}$ is the $(1-\alpha)$-quantile of the chi-squared distribution with 1 degree of freedom.
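Numerically, the cutoff in the theorem is just a chi-squared quantile. As a minimal sketch (assuming SciPy is available), for the common choice $\alpha = 0.05$:

```python
from scipy.stats import chi2

# (1 - alpha)-quantile of the chi-squared distribution with 1 degree
# of freedom, i.e. chi^2_{1,1-alpha} from the theorem
alpha = 0.05
cutoff = chi2.ppf(1 - alpha, df=1)
print(cutoff)       # ~3.8415 for alpha = 0.05

# the log-likelihood may drop by at most half this value below its maximum
print(cutoff / 2)   # ~1.9207
```

Any $\phi$ whose log-likelihood lies within about $1.92$ of the maximum log-likelihood thus belongs to the 95% confidence interval.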

Proof: The confidence interval is defined as the interval that, under infinitely repeated random experiments, contains the true parameter value with a certain probability.

Let us define the likelihood ratio

$\label{eq:lr} \Lambda(\phi) = \frac{p(y|\phi,\hat{\lambda})}{p(y|\hat{\phi},\hat{\lambda})} \quad \text{for all} \quad \phi \in \Phi$

and compute the log-likelihood ratio

$\label{eq:llr} \log \Lambda(\phi) = \log p(y|\phi,\hat{\lambda}) - \log p(y|\hat{\phi},\hat{\lambda}) \; .$
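To make the log-likelihood ratio concrete, here is a sketch under an assumed (hypothetical, not from the source) normal model, with the mean as parameter of interest $\phi$ and the standard deviation as nuisance parameter $\lambda$, the latter fixed at its maximum likelihood estimate as in the equation above:

```python
import numpy as np
from scipy.stats import norm

# hypothetical data from a normal distribution (illustration only)
rng = np.random.default_rng(42)
y = rng.normal(loc=2.0, scale=1.5, size=100)

# maximum likelihood estimates: sample mean and ML standard deviation
phi_hat = y.mean()
sigma_hat = y.std()  # ddof=0, i.e. the ML estimate

def log_lik(phi):
    """Log-likelihood with the nuisance parameter fixed at its MLE."""
    return norm.logpdf(y, loc=phi, scale=sigma_hat).sum()

def log_lr(phi):
    """log Lambda(phi) = log p(y|phi, lambda_hat) - log p(y|phi_hat, lambda_hat)."""
    return log_lik(phi) - log_lik(phi_hat)

print(log_lr(phi_hat))        # 0.0 at the MLE
print(log_lr(phi_hat + 0.5))  # negative away from the MLE
```

By construction, $\log \Lambda(\phi)$ is zero at $\phi = \hat{\phi}$ and negative everywhere else, since $\hat{\phi}$ maximizes the likelihood.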

Wilks' theorem states that, when comparing two nested statistical models with parameter spaces $\Theta_0 \subset \Theta_1$, the quantity $-2$ times the log-ratio of maximized likelihoods asymptotically follows a chi-squared distribution as the sample size approaches infinity, provided the null hypothesis is true:

$\label{eq:wilks} H_0: \theta \in \Theta_0 \quad \Rightarrow \quad -2 \log \frac{\operatorname*{max}_{\theta \in \Theta_0} p(y|\theta)}{\operatorname*{max}_{\theta \in \Theta_1} p(y|\theta)} \sim \chi^2_{\Delta k} \quad \text{as} \quad n \rightarrow \infty$

where $\Delta k$ is the difference in dimensionality between $\Theta_0$ and $\Theta_1$. Applied to our example in \eqref{eq:llr}, the null hypothesis fixes the parameter of interest at a given value $\phi$, while under the alternative it is free to vary over $\Phi$, such that $\Delta k = 1$ and Wilks' theorem implies:

$\label{eq:llr-wilks} -2 \log \Lambda(\phi) \sim \chi^2_1 \; .$
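This asymptotic distribution can be checked by simulation. In the following sketch (assuming a normal model with known unit variance, where $-2 \log \Lambda(\phi)$ at the true mean reduces to $n(\bar{y}-\phi)^2$; the setup is hypothetical), repeated experiments yield a statistic whose mean and variance match those of $\chi^2_1$ (mean $1$, variance $2$):

```python
import numpy as np

# Monte Carlo check of Wilks' theorem for a normal model with known
# variance 1: -2 log Lambda(phi_true) = n * (ybar - phi_true)^2
rng = np.random.default_rng(0)
n, phi_true, n_sim = 50, 1.0, 20000
y = rng.normal(loc=phi_true, scale=1.0, size=(n_sim, n))
stat = n * (y.mean(axis=1) - phi_true) ** 2

# chi^2_1 has mean 1 and variance 2
print(stat.mean(), stat.var())
```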

Using the quantile function $\chi^2_{k,p}$ of the chi-squared distribution, a $(1-\alpha)$-confidence interval is therefore given by all values $\phi$ that satisfy

$\label{eq:llr-chi2} -2 \log \Lambda(\phi) \leq \chi^2_{1,1-\alpha} \; .$

Plugging in \eqref{eq:llr} and rearranging, we obtain

$\label{eq:llr-chi2-dev} \begin{split} -2 \left[ \log p(y|\phi,\hat{\lambda}) - \log p(y|\hat{\phi},\hat{\lambda}) \right] &\leq \chi^2_{1,1-\alpha} \\ \log p(y|\phi,\hat{\lambda}) - \log p(y|\hat{\phi},\hat{\lambda}) &\geq -\frac{1}{2} \chi^2_{1,1-\alpha} \\ \log p(y|\phi,\hat{\lambda}) &\geq \log p(y|\hat{\phi},\hat{\lambda}) - \frac{1}{2} \chi^2_{1,1-\alpha} \end{split}$

which is equivalent to the confidence interval given by \eqref{eq:ci-wilks}.
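The whole construction can be carried out numerically by scanning the parameter of interest and keeping all values whose log-likelihood stays above the cutoff. A sketch under the same assumed normal model as before (hypothetical data; mean as parameter of interest, standard deviation as nuisance parameter fixed at its MLE):

```python
import numpy as np
from scipy.stats import chi2, norm

# hypothetical data from a normal distribution (illustration only)
rng = np.random.default_rng(1)
y = rng.normal(loc=2.0, scale=1.5, size=100)
phi_hat, sigma_hat = y.mean(), y.std()

def log_lik(phi):
    """Log-likelihood with the nuisance parameter fixed at its MLE."""
    return norm.logpdf(y, loc=phi, scale=sigma_hat).sum()

# cutoff from the theorem: maximum log-likelihood minus chi^2 quantile / 2
alpha = 0.05
cutoff = log_lik(phi_hat) - chi2.ppf(1 - alpha, df=1) / 2

# grid scan: keep all phi whose log-likelihood stays above the cutoff
grid = np.linspace(phi_hat - 2.0, phi_hat + 2.0, 20001)
inside = grid[np.array([log_lik(p) for p in grid]) >= cutoff]
ci_lower, ci_upper = inside.min(), inside.max()
print(ci_lower, ci_upper)
```

For this particular model the resulting interval coincides with the familiar $\hat{\phi} \pm z_{1-\alpha/2} \, \hat{\sigma} / \sqrt{n}$, since $\chi^2_{1,1-\alpha} = z_{1-\alpha/2}^2$; in general the Wilks interval need not be symmetric around $\hat{\phi}$.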


Metadata: ID: P56 | shortcut: ci-wilks | author: JoramSoch | date: 2020-02-19, 17:15.