Index: The Book of Statistical ProofsStatistical Models ▷ Categorical data ▷ Logistic regression ▷ Definition

Definition: A logistic regression model is given by a set of binary observations $y_i \in \left\lbrace 0, 1 \right\rbrace, i = 1,\ldots,n$, a set of predictors $x_j \in \mathbb{R}^n, j = 1,\ldots,p$, a base $b$ and the assumption that the log-odds are a linear combination of the predictors:

\[\label{eq:logreg} l_i = x_i \beta + \varepsilon_i, \; i = 1,\ldots,n\]

where $l_i$ are the log-odds that $y_i = 1$

\[\label{eq:logodds} l_i = \log_b \frac{\mathrm{Pr}(y_i = 1)}{\mathrm{Pr}(y_i = 0)}\]

and $x_i$ is the $i$-th row of the $n \times p$ matrix

\[\label{eq:X} X = \left[ x_1, \ldots, x_p \right] \; .\]

Within this model,

  • $y$ are called “categorical observations” or “dependent variable”;

  • $X$ is called “design matrix” or “set of independent variables”;

  • $\beta$ are called “regression coefficients” or “weights”;

  • $\varepsilon_i$ is called “noise” or “error term”;

  • $n$ is the number of observations;

  • $p$ is the number of predictors.


Metadata: ID: D76 | shortcut: logreg | author: JoramSoch | date: 2020-06-28, 20:51.