Index: The Book of Statistical Proofs ▷ Statistical Models ▷ Categorical data ▷ Logistic regression ▷ Probability and log-odds

Theorem: Assume a logistic regression model

\[\label{eq:logreg} l_i = x_i \beta + \varepsilon_i, \; i = 1,\ldots,n\]

where $x_i$ is the vector of predictors corresponding to the $i$-th observation $y_i$ and $l_i$ are the log-odds that $y_i = 1$.

Then, the log-odds in favor of $y_i = 1$ against $y_i = 0$ can also be expressed as

\[\label{eq:lodds} l_i = \log_b \frac{p(x_i|y_i=1) \, p(y_i=1)}{p(x_i|y_i=0) \, p(y_i=0)}\]

where $p(x_i \vert y_i)$ is a likelihood function consistent with \eqref{eq:logreg}, $p(y_i=1)$ and $p(y_i=0)$ are the prior probabilities of the two outcomes, and $b$ is the base used to form the log-odds $l_i$.
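
For illustration with hypothetical numbers (these values are assumptions, not from the source): let the priors be $p(y_i=1) = 0.3$ and $p(y_i=0) = 0.7$, let the likelihoods at the observed $x_i$ be $p(x_i \vert y_i=1) = 0.8$ and $p(x_i \vert y_i=0) = 0.2$, and take base $b = e$. Then

\[l_i = \ln \frac{0.8 \cdot 0.3}{0.2 \cdot 0.7} = \ln \frac{0.24}{0.14} \approx 0.539 \; ,\]

i.e. the observation tips the log-odds in favor of $y_i = 1$ even though this outcome has the smaller prior probability.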

Proof: Using Bayes’ theorem and the law of marginal probability, the posterior probabilities for $y_i = 1$ and $y_i = 0$ are given by

\[\label{eq:prob} \begin{split} p(y_i=1|x_i) &= \frac{p(x_i|y_i=1) \, p(y_i=1)}{p(x_i|y_i=1) \, p(y_i=1) + p(x_i|y_i=0) \, p(y_i=0)} \\ p(y_i=0|x_i) &= \frac{p(x_i|y_i=0) \, p(y_i=0)}{p(x_i|y_i=1) \, p(y_i=1) + p(x_i|y_i=0) \, p(y_i=0)} \; . \end{split}\]
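
Note that the two fractions in \eqref{eq:prob} share the same denominator which, by the law of marginal probability, is just the marginal likelihood

\[p(x_i) = p(x_i|y_i=1) \, p(y_i=1) + p(x_i|y_i=0) \, p(y_i=0) \; ,\]

so this common term cancels as soon as the ratio of the two posterior probabilities is taken.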

Calculating the log-odds from these posterior probabilities, the common denominator $p(x_i)$ cancels and we have

\[\label{eq:lodds-qed} \begin{split} l_i &= \log_b \frac{p(y_i=1|x_i)}{p(y_i=0|x_i)} \\ &= \log_b \frac{p(x_i|y_i=1) \, p(y_i=1)}{p(x_i|y_i=0) \, p(y_i=0)} \; . \end{split}\]
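
As a minimal numerical sanity check, the following Python sketch computes the log-odds both ways: once via the posterior probabilities from \eqref{eq:prob} and once via the right-hand side of \eqref{eq:lodds}. All numbers and variable names are illustrative assumptions, not part of the source.

```python
import math

# Hypothetical values for illustration (not from the source).
prior = {0: 0.7, 1: 0.3}   # prior probabilities p(y_i = 0), p(y_i = 1)
lik   = {0: 0.2, 1: 0.8}   # likelihoods p(x_i | y_i = 0), p(x_i | y_i = 1)

# Route 1: form the posteriors via Bayes' theorem, then take their log-ratio.
evidence = lik[1] * prior[1] + lik[0] * prior[0]   # marginal likelihood p(x_i)
post1 = lik[1] * prior[1] / evidence               # p(y_i = 1 | x_i)
post0 = lik[0] * prior[0] / evidence               # p(y_i = 0 | x_i)
log_odds_from_posteriors = math.log(post1 / post0)

# Route 2: the theorem's right-hand side, which never touches the evidence.
log_odds_direct = math.log((lik[1] * prior[1]) / (lik[0] * prior[0]))

print(log_odds_from_posteriors, log_odds_direct)   # both ~ 0.5390
assert math.isclose(log_odds_from_posteriors, log_odds_direct)
```

Both routes agree because the marginal likelihood $p(x_i)$ appears in both posteriors and drops out of their ratio, exactly as in \eqref{eq:lodds-qed}.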
Sources:

Metadata: ID: P105 | shortcut: logreg-pnlo | author: JoramSoch | date: 2020-05-19, 05:08.