Proof: Convexity of the cross-entropy
Index:
The Book of Statistical Proofs ▷
General Theorems ▷
Information theory ▷
Shannon entropy ▷
Convexity of cross-entropy
Metadata: ID: P150 | shortcut: entcross-conv | author: JoramSoch | date: 2020-08-11, 09:16.
Theorem: The cross-entropy is convex in the probability distribution $q$, i.e.
\[\label{eq:ent-cross-conv} \mathrm{H}[p,\lambda q_1 + (1-\lambda) q_2] \leq \lambda \mathrm{H}[p,q_1] + (1-\lambda) \mathrm{H}[p,q_2]\]where $p$ is a fixed and $q_1$ and $q_2$ are any two probability distributions and $0 \leq \lambda \leq 1$.
Proof: The relationship between Kullback-Leibler divergence, entropy and cross-entropy is:
\[\label{eq:kl-ent} \mathrm{KL}[P||Q] = \mathrm{H}(P,Q) - \mathrm{H}(P) \; .\]Note that the KL divergence is convex in the pair of probability distributions $(p,q)$:
\[\label{eq:kl-conv} \mathrm{KL}[\lambda p_1 + (1-\lambda) p_2||\lambda q_1 + (1-\lambda) q_2] \leq \lambda \mathrm{KL}[p_1||q_1] + (1-\lambda) \mathrm{KL}[p_2||q_2]\]A special case of this is given by
\[\label{eq:kl-conv-p} \begin{split} \mathrm{KL}[\lambda p + (1-\lambda) p||\lambda q_1 + (1-\lambda) q_2] &\leq \lambda \mathrm{KL}[p||q_1] + (1-\lambda) \mathrm{KL}[p||q_2] \\ \mathrm{KL}[p||\lambda q_1 + (1-\lambda) q_2] &\leq \lambda \mathrm{KL}[p||q_1] + (1-\lambda) \mathrm{KL}[p||q_2] \end{split}\]and applying equation \eqref{eq:kl-ent}, we have
\[\label{eq:ent-cross-conv-qed} \begin{split} \mathrm{H}[p,\lambda q_1 + (1-\lambda) q_2] - \mathrm{H}[p] &\leq \lambda \left( \mathrm{H}[p,q_1] - \mathrm{H}[p] \right) + (1-\lambda) \left( \mathrm{H}[p,q_2] - \mathrm{H}[p] \right) \\ \mathrm{H}[p,\lambda q_1 + (1-\lambda) q_2] - \mathrm{H}[p] &\leq \lambda \mathrm{H}[p,q_1] + (1-\lambda) \mathrm{H}[p,q_2] - \mathrm{H}[p] \\ \mathrm{H}[p,\lambda q_1 + (1-\lambda) q_2] &\leq \lambda \mathrm{H}[p,q_1] + (1-\lambda) \mathrm{H}[p,q_2] \end{split}\]which is equivalent to \eqref{eq:ent-cross-conv}.
∎
Sources: - Wikipedia (2020): "Cross entropy"; in: Wikipedia, the free encyclopedia, retrieved on 2020-08-11; URL: https://en.wikipedia.org/wiki/Cross_entropy#Definition.
- gunes (2019): "Convexity of cross entropy"; in: StackExchange CrossValidated, retrieved on 2020-11-08; URL: https://stats.stackexchange.com/questions/394463/convexity-of-cross-entropy.
Metadata: ID: P150 | shortcut: entcross-conv | author: JoramSoch | date: 2020-08-11, 09:16.