Proof: Entropy of the categorical distribution
Metadata: ID: P336 | shortcut: cat-ent | author: JoramSoch | date: 2022-09-09, 15:41.
Theorem: Let $X$ be a random vector following a categorical distribution:
\[\label{eq:cat} X \sim \mathrm{Cat}(p) \; .\]

Then, the (Shannon) entropy of $X$ is
\[\label{eq:cat-ent} \mathrm{H}(X) = - \sum_{i=1}^{k} p_i \cdot \log p_i \; .\]

Proof: The entropy is defined as the negative probability-weighted average of the logarithms of the probabilities over all possible values:
\[\label{eq:ent} \mathrm{H}(X) = - \sum_{x \in \mathcal{X}} p(x) \cdot \log_b p(x) \; .\]

Since a categorical random vector $X$ can take $k$ possible values, namely the standard basis vectors $e_1, \ldots, e_k$, whose probabilities $\mathrm{Pr}(X = e_i) = p_i$ are given by the entries of the $1 \times k$ vector $p$, we have:
\[\label{eq:cat-ent-qed} \begin{split} \mathrm{H}(X) &= - \mathrm{Pr}(X = e_1) \cdot \log \mathrm{Pr}(X = e_1) - \ldots - \mathrm{Pr}(X = e_k) \cdot \log \mathrm{Pr}(X = e_k) \\ &= - \sum_{i=1}^{k} \mathrm{Pr}(X = e_i) \cdot \log \mathrm{Pr}(X = e_i) \\ &= - \sum_{i=1}^{k} p_i \cdot \log p_i \; . \end{split}\]

∎
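As an illustrative numerical example (not part of the original proof; logarithm base $b = 2$ is assumed, so entropy is measured in bits), take $k = 3$ and $p = \left( \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4} \right)$:

\[ \mathrm{H}(X) = - \left( \tfrac{1}{2} \log_2 \tfrac{1}{2} + \tfrac{1}{4} \log_2 \tfrac{1}{4} + \tfrac{1}{4} \log_2 \tfrac{1}{4} \right) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 1.5 \; \text{bits} \; . \]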