Proof: Non-symmetry of the Kullback-Leibler divergence
Index:
The Book of Statistical Proofs ▷
General Theorems ▷
Information theory ▷
Kullback-Leibler divergence ▷
Non-symmetry
Metadata: ID: P147 | shortcut: kl-nonsymm | author: JoramSoch | date: 2020-08-11, 06:57.
Theorem: The Kullback-Leibler divergence is non-symmetric, i.e.
\[\label{eq:KL-nonsymm} \mathrm{KL}[P||Q] \neq \mathrm{KL}[Q||P]\]for some probability distributions $P$ and $Q$.
Proof: Let $X \in \mathcal{X} = \left\lbrace 0, 1, 2 \right\rbrace$ be a discrete random variable and consider the two probability distributions
\[\label{eq:P-Q} \begin{split} P : \, X &\sim \mathrm{Bin}(2, 0.5) \\ Q : \, X &\sim \mathcal{U}(0, 2) \end{split}\]where $\mathrm{Bin}(n, p)$ indicates a binomial distribution and $\mathcal{U}(a, b)$ indicates a discrete uniform distribution.
Then, the probability mass function of the binomial distribution entails that
\[\label{eq:p(x)} p(x) = \left\{ \begin{array}{rl} 1/4 \; , & \text{if} \; x = 0 \\ 1/2 \; , & \text{if} \; x = 1 \\ 1/4 \; , & \text{if} \; x = 2 \end{array} \right.\]and the probability mass function of the discrete uniform distribution entails that
\[\label{eq:q(x)} q(x) = \frac{1}{3} \; ,\]such that the Kullback-Leibler divergence of $P$ from $Q$ is
\[\label{eq:KL-P-Q} \begin{split} \mathrm{KL}[P||Q] &= \sum_{x \in \mathcal{X}} p(x) \cdot \log \frac{p(x)}{q(x)} \\ &= \frac{1}{4} \log \frac{3}{4} + \frac{1}{2} \log \frac{3}{2} + \frac{1}{4} \log \frac{3}{4} \\ &= \frac{1}{2} \log \frac{3}{4} + \frac{1}{2} \log \frac{3}{2} \\ &= \frac{1}{2} \left( \log \frac{3}{4} + \log \frac{3}{2} \right) \\ &= \frac{1}{2} \log \left( \frac{3}{4} \cdot \frac{3}{2} \right) \\ &= \frac{1}{2} \log \frac{9}{8} = 0.0589 \end{split}\]and the Kullback-Leibler divergence of $Q$ from $P$ is
\[\label{eq:KL-Q-P} \begin{split} \mathrm{KL}[Q||P] &= \sum_{x \in \mathcal{X}} q(x) \cdot \log \frac{q(x)}{p(x)} \\ &= \frac{1}{3} \log \frac{4}{3} + \frac{1}{3} \log \frac{2}{3} + \frac{1}{3} \log \frac{4}{3} \\ &= \frac{1}{3} \left( \log \frac{4}{3} + \log \frac{2}{3} + \log \frac{4}{3} \right) \\ &= \frac{1}{3} \log \left( \frac{4}{3} \cdot \frac{2}{3} \cdot \frac{4}{3} \right) \\ &= \frac{1}{3} \log \frac{32}{27} = 0.0566 \end{split}\]which provides an example for
\[\label{eq:KL-nonsymm-qed} \mathrm{KL}[P||Q] \neq \mathrm{KL}[Q||P]\]and thus proves the theorem.
∎
Sources: - Kullback, Solomon (1959): "Divergence"; in: Information Theory and Statistics, ch. 1.3, pp. 6ff.; URL: http://index-of.co.uk/Information-Theory/Information%20theory%20and%20statistics%20-%20Solomon%20Kullback.pdf.
- Wikipedia (2020): "Kullback-Leibler divergence"; in: Wikipedia, the free encyclopedia, retrieved on 2020-08-11; URL: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Basic_example.
Metadata: ID: P147 | shortcut: kl-nonsymm | author: JoramSoch | date: 2020-08-11, 06:57.