Proof: Entropy of the categorical distribution
Metadata: ID: P336 | shortcut: cat-ent | author: JoramSoch | date: 2022-09-09, 15:41.
Theorem: Let $X$ be a random vector following a categorical distribution:
\[\label{eq:cat} X \sim \mathrm{Cat}(p) \; .\]

Then, the (Shannon) entropy of $X$ is
\[\label{eq:cat-ent} \mathrm{H}(X) = - \sum_{i=1}^{k} p_i \cdot \log p_i \; .\]

Proof: The entropy is defined as the negative probability-weighted average of the logarithms of the probabilities over all possible values:
\[\label{eq:ent} \mathrm{H}(X) = - \sum_{x \in \mathcal{X}} p(x) \cdot \log_b p(x) \; .\]

Since a categorical random vector $X$ can take $k$ possible values, namely the standard basis vectors $e_1, \ldots, e_k$, whose probabilities $\mathrm{Pr}(X = e_i) = p_i$ are given by the entries of the $1 \times k$ vector $p$, we have:
\[\label{eq:cat-ent-qed} \begin{split} \mathrm{H}(X) &= - \mathrm{Pr}(X = e_1) \cdot \log \mathrm{Pr}(X = e_1) - \ldots - \mathrm{Pr}(X = e_k) \cdot \log \mathrm{Pr}(X = e_k) \\ &= - \sum_{i=1}^{k} \mathrm{Pr}(X = e_i) \cdot \log \mathrm{Pr}(X = e_i) \\ &= - \sum_{i=1}^{k} p_i \cdot \log p_i \; . \end{split}\]

∎
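As an illustrative numerical example (not part of the original proof; logarithm base $b = 2$ is assumed, so entropy is measured in bits), take $k = 3$ and $p = \left( \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4} \right)$:

\[ \mathrm{H}(X) = - \left( \tfrac{1}{2} \log_2 \tfrac{1}{2} + \tfrac{1}{4} \log_2 \tfrac{1}{4} + \tfrac{1}{4} \log_2 \tfrac{1}{4} \right) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 1.5 \; \text{bits} \; . \]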