Proof: Relation of discrete Kullback-Leibler divergence to Shannon entropy
Metadata: ID: P113 | shortcut: kl-ent | author: JoramSoch | date: 2020-05-27, 23:20.
Theorem: Let $X$ be a discrete random variable with possible outcomes $\mathcal{X}$ and let $P$ and $Q$ be two probability distributions on $X$. Then, the Kullback-Leibler divergence of $P$ from $Q$ can be expressed as
\[\label{eq:kl-ent} \mathrm{KL}[P||Q] = \mathrm{H}(P,Q) - \mathrm{H}(P)\]where $\mathrm{H}(P,Q)$ is the cross-entropy of $P$ and $Q$ and $\mathrm{H}(P)$ is the marginal entropy of $P$.
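As a small illustrative check (the two-outcome distributions and the base-2 logarithm are assumptions chosen here for the example, not part of the original statement), take $\mathcal{X} = \{1, 2\}$ with $p = (1/2, 1/2)$ and $q = (1/4, 3/4)$. Then
\[\begin{split} \mathrm{H}(P) &= - \tfrac{1}{2} \log_2 \tfrac{1}{2} - \tfrac{1}{2} \log_2 \tfrac{1}{2} = 1 \\ \mathrm{H}(P,Q) &= - \tfrac{1}{2} \log_2 \tfrac{1}{4} - \tfrac{1}{2} \log_2 \tfrac{3}{4} = 2 - \tfrac{1}{2} \log_2 3 \\ \mathrm{KL}[P||Q] &= \tfrac{1}{2} \log_2 \frac{1/2}{1/4} + \tfrac{1}{2} \log_2 \frac{1/2}{3/4} = 1 - \tfrac{1}{2} \log_2 3 \approx 0.2075 \; , \end{split}\]so that $\mathrm{H}(P,Q) - \mathrm{H}(P) = 1 - \tfrac{1}{2} \log_2 3$ agrees with $\mathrm{KL}[P||Q]$ in this case.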
Proof: The discrete Kullback-Leibler divergence is defined as
\[\label{eq:KL} \mathrm{KL}[P||Q] = \sum_{x \in \mathcal{X}} p(x) \cdot \log \frac{p(x)}{q(x)}\]where $p(x)$ and $q(x)$ are the probability mass functions of $P$ and $Q$.
Separating the logarithm via $\log \frac{a}{b} = \log a - \log b$, we have:
\[\label{eq:KL-dev} \mathrm{KL}[P||Q] = - \sum_{x \in \mathcal{X}} p(x) \, \log q(x) + \sum_{x \in \mathcal{X}} p(x) \, \log p(x) \; .\]Now considering the definitions of marginal entropy and cross-entropy
\[\label{eq:ME-CE} \begin{split} \mathrm{H}(P) &= - \sum_{x \in \mathcal{X}} p(x) \, \log p(x) \\ \mathrm{H}(P,Q) &= - \sum_{x \in \mathcal{X}} p(x) \, \log q(x) \; , \end{split}\]we can finally show:
\[\label{eq:KL-qed} \mathrm{KL}[P||Q] = \mathrm{H}(P,Q) - \mathrm{H}(P) \; .\]∎
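The identity can also be checked numerically. The following is a minimal sketch, not part of the original proof: the function names and the example distributions `p` and `q` are hypothetical, the natural logarithm is used, and it assumes the usual convention $0 \cdot \log 0 = 0$ as well as $q(x) > 0$ wherever $p(x) > 0$.

```python
import math

def entropy(p):
    """Marginal entropy H(P) = -sum_x p(x) log p(x), skipping p(x) = 0."""
    return -sum(px * math.log(px) for px in p if px > 0)

def cross_entropy(p, q):
    """Cross-entropy H(P,Q) = -sum_x p(x) log q(x), skipping p(x) = 0."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """KL[P||Q] = sum_x p(x) log(p(x)/q(x)), skipping p(x) = 0."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical example distributions on three outcomes (assumed for illustration).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

lhs = kl_divergence(p, q)               # left-hand side of the theorem
rhs = cross_entropy(p, q) - entropy(p)  # right-hand side of the theorem

print(lhs, rhs)
assert math.isclose(lhs, rhs)
```

Because both sides are computed from the same sums, they should agree up to floating-point rounding for any valid pair of probability mass functions; changing the logarithm base rescales both sides by the same factor.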
Sources: - Wikipedia (2020): "Kullback-Leibler divergence"; in: Wikipedia, the free encyclopedia, retrieved on 2020-05-27; URL: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Motivation.