Index: The Book of Statistical ProofsGeneral TheoremsInformation theoryShannon entropy ▷ Gibbs' inequality

Theorem: Let be a discrete random variable and consider two probability distributions with probability mass functions p(x) and q(x). Then, Gibbs’ inequality states that the entropy of X according to P is smaller than or equal to the cross-entropy of P and Q:

\label{eq:Gibbs-ineq} - \sum_{x \in \mathcal{X}} p(x) \, \log_b p(x) \leq - \sum_{x \in \mathcal{X}} p(x) \, \log_b q(x) \; .

Proof: Without loss of generality, we will use the natural logarithm, because a change in the base of the logarithm only implies multiplication by a constant:

\label{eq:log-ln} \log_b a = \frac{\ln a}{\ln b} \; .

Let I be the set of all x for which p(x) is non-zero. Then, proving \eqref{eq:Gibbs-ineq} requires to show that

\label{eq:Gibbs-ineq-s1} \sum_{x \in I} p(x) \, \ln \frac{p(x)}{q(x)} \geq 0 \; .

For all x > 0, it holds that \ln x \leq x - 1, with equality only if x = 1. Multiplying this with -1, we have \ln \frac{1}{x} \geq 1 - x. Applying this to \eqref{eq:Gibbs-ineq-s1}, we can say about the left-hand side that

\label{eq:Gibbs-ineq-s2} \begin{split} \sum_{x \in I} p(x) \, \ln \frac{p(x)}{q(x)} &\geq \sum_{x \in I} p(x) \left( 1 - \frac{q(x)}{p(x)} \right) \\ &= \sum_{x \in I} p(x) - \sum_{x \in I} q(x) \; . \end{split}

Finally, since p(x) and q(x) are probability mass functions, we have

\label{eq:p-q-pmf} \begin{split} 0 \leq p(x) \leq 1, \quad \sum_{x \in I} p(x) &= 1 \quad \text{and} \\ 0 \leq q(x) \leq 1, \quad \sum_{x \in I} q(x) &\leq 1 \; , \end{split}

such that it follows from \eqref{eq:Gibbs-ineq-s2} that

\label{eq:Gibbs-ineq-s3} \begin{split} \sum_{x \in I} p(x) \, \ln \frac{p(x)}{q(x)} &\geq \sum_{x \in I} p(x) - \sum_{x \in I} q(x) \\ &= 1 - \sum_{x \in I} q(x) \geq 0 \; . \end{split}
Sources:

Metadata: ID: P164 | shortcut: gibbs-ineq | author: JoramSoch | date: 2020-09-09, 02:18.