Index: The Book of Statistical Proofs ▷ General Theorems ▷ Machine learning ▷ Scoring rules ▷ Log probability scoring rule

Definition: A log (probability) scoring rule $S(q, y)$ is a scoring rule that measures the quality of a probabilistic forecast in decision theory. Formally, it can be defined in discrete or continuous form as follows:

1) Log scoring rule for binary classification:

$\label{eq:binary-lpsr-cases} S(q, y) = \left\{ \begin{array}{rl} \log q \; , & \text{if} \; y = 1 \\ \log(1-q) \; , & \text{if} \; y = 0 \end{array} \right.$

which can be expressed as

$\label{eq:binary-lpsr} S(q, y) = y \log q + (1-y) \log (1-q)$

Note that the expressions given above have slightly different domains. For the first equation, the domain is $D_1 = ([0,1) \times \left\lbrace 0 \right\rbrace) \cup ((0, 1] \times \left\lbrace 1 \right\rbrace)$, while for the second equation, the domain is $D_2 = (0,1) \times \left\lbrace 0,1 \right\rbrace$.
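The binary case can be sketched in a few lines of Python; the function name `log_score_binary` is illustrative, not part of the source:

```python
import math

def log_score_binary(q: float, y: int) -> float:
    """Log probability score for a binary forecast.

    q: predicted probability that y = 1 (must lie in the valid domain)
    y: observed outcome, 0 or 1

    Higher is better; a perfectly confident correct forecast scores 0.
    """
    # Case form of the definition; on (0,1) it agrees with
    # y*log(q) + (1-y)*log(1-q).
    return math.log(q) if y == 1 else math.log(1.0 - q)
```

Note that the case form is well-defined on all of $D_1$, whereas evaluating the product form at $q \in \{0, 1\}$ would require $0 \cdot \log 0$, which is why its domain is restricted to $D_2$.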

2) Log scoring rule for multiclass classification:

$\label{eq:multiclass-lpsr} S(q, y) = \sum_k y_k \log q_k(x) = \log q_{y^*}(x)$

where $y^*$ is the true class, $x$ are the input features and $q$ is the predicted probability distribution over the classes, such that $q_k(x)$ is the predicted probability of class $k$. Here, $y_k = 1$ if the true class is $k$ and $y_k = 0$ otherwise, so only the term belonging to the true class survives the sum.
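A minimal sketch of the multiclass case, with the predicted distribution represented as a list of class probabilities (the function name and representation are assumptions for illustration):

```python
import math

def log_score_multiclass(q: list[float], y_star: int) -> float:
    """Log probability score for a categorical forecast.

    q: predicted class probabilities q_k(x), summing to 1
    y_star: index of the true class
    """
    # The sum over one-hot indicators y_k collapses to a single term,
    # so the score is simply the log probability assigned to the true class.
    return math.log(q[y_star])
```

For example, with $q = (0.2, 0.5, 0.3)$ and true class $y^* = 1$, the score is $\log 0.5$; writing out the full sum $\sum_k y_k \log q_k$ with $y = (0, 1, 0)$ gives the same value.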

3) Log scoring rule for regression (continuous case):

$\label{eq:regression-lpsr} S(q, y) = \log q(y)$

where $q$ is the predicted probability distribution over the continuous space and $y$ is the true value.
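In the continuous case, $S(q, y)$ is the log predictive density evaluated at the observed value. As an illustration (the choice of a Gaussian predictive distribution is an assumption, not part of the definition):

```python
import math

def gaussian_log_score(mu: float, sigma: float, y: float) -> float:
    """Log probability score log q(y) when the predictive
    distribution q is assumed to be N(mu, sigma^2)."""
    # Log of the normal probability density function at y.
    return -0.5 * math.log(2 * math.pi * sigma ** 2) \
           - (y - mu) ** 2 / (2 * sigma ** 2)

# A forecast N(0, 1) evaluated at y = 0 scores -0.5*log(2*pi) ~ -0.919.
```

Unlike the discrete cases, $q(y)$ is a density and may exceed 1, so the continuous log score can be positive.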

Sources:

Metadata: ID: D195 | shortcut: lpsr | author: KarahanS | date: 2024-02-28, 20:50.