
Theorem: Let $y$ be the number of successes resulting from $n$ independent trials with unknown success probability $p$, such that $y$ follows a binomial distribution:

\[\label{eq:Bin} y \sim \mathrm{Bin}(n,p) \; .\]

Moreover, assume two statistical models, one assuming that $p$ is 0.5 (null model), the other imposing a beta distribution as the prior distribution on the model parameter $p$ (alternative):

\[\label{eq:Bin-m01} \begin{split} m_0&: \; y \sim \mathrm{Bin}(n,p), \; p = 0.5 \\ m_1&: \; y \sim \mathrm{Bin}(n,p), \; p \sim \mathrm{Bet}(\alpha_0, \beta_0) \; . \end{split}\]

Then, the log Bayes factor in favor of $m_1$ against $m_0$ is

\[\label{eq:Bin-LBF} \mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)\]

where $B(x,y)$ is the beta function and $\alpha_n$ and $\beta_n$ are the posterior hyperparameters for binomial observations, which are functions of the number of trials $n$ and the number of successes $y$.
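
For illustration, the following is a minimal numerical sketch of this result in Python, assuming example values $y = 7$, $n = 10$ and a uniform $\mathrm{Bet}(1,1)$ prior, and using the posterior update rule derived in the proof below:

```python
# Minimal sketch (assumed example values): evaluate the closed-form
# log Bayes factor stated in the theorem.
import numpy as np
from scipy.special import betaln   # betaln(a, b) = log B(a, b)

y, n = 7, 10                 # assumed: 7 successes in 10 trials
alpha_0, beta_0 = 1.0, 1.0   # assumed: uniform Bet(1, 1) prior under m_1

# posterior hyperparameters (update rule given in the proof below)
alpha_n = alpha_0 + y
beta_n = beta_0 + (n - y)

lbf_10 = betaln(alpha_n, beta_n) - betaln(alpha_0, beta_0) - n * np.log(0.5)
print(lbf_10)   # > 0 favors m_1, < 0 favors m_0
```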

Proof: The log Bayes factor is equal to the difference of the log model evidences of the two models:

\[\label{eq:LBF-LME} \mathrm{LBF}_{10} = \mathrm{LME}(m_1) - \mathrm{LME}(m_0) \; .\]

The LME of the alternative $m_1$ is equal to the log model evidence for binomial observations:

\[\label{eq:Bin-LME-m1} \mathrm{LME}(m_1) = \log p(y|m_1) = \log {n \choose y} + \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) \; .\]

Because the null model $m_0$ has no free parameter, its log model evidence (the logarithm of the marginal likelihood) is equal to the log-likelihood function for binomial observations evaluated at $p = 0.5$:

\[\label{eq:Bin-LME-m0} \begin{split} \mathrm{LME}(m_0) = \log p(y|p=0.5) &= \log {n \choose y} + y \log(0.5) + (n-y) \log (1-0.5) \\ &= \log {n \choose y} + n \log \left( \frac{1}{2} \right) \; . \end{split}\]
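
The following sketch, under the same assumed example values as above, evaluates both log model evidences and cross-checks $\mathrm{LME}(m_1)$ by numerically integrating the binomial likelihood against the $\mathrm{Bet}(\alpha_0, \beta_0)$ prior:

```python
# Sketch (assumed example values): the two log model evidences entering the
# Bayes factor. LME(m_1) is cross-checked by numerically integrating the
# binomial likelihood against the Bet(alpha_0, beta_0) prior on p.
import numpy as np
from scipy.special import gammaln, betaln
from scipy.stats import binom, beta
from scipy.integrate import quad

y, n = 7, 10
alpha_0, beta_0 = 1.0, 1.0
alpha_n, beta_n = alpha_0 + y, beta_0 + (n - y)

log_binom = gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)  # log C(n, y)

lme_m1 = log_binom + betaln(alpha_n, beta_n) - betaln(alpha_0, beta_0)
lme_m0 = log_binom + n * np.log(0.5)

# cross-check: p(y|m_1) is the integral of Bin(y; n, p) * Bet(p; alpha_0, beta_0) over p
marg, _ = quad(lambda p: binom.pmf(y, n, p) * beta.pdf(p, alpha_0, beta_0), 0, 1)
assert np.isclose(lme_m1, np.log(marg))
```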

Subtracting the LME of the null model $m_0$ from the LME of the alternative $m_1$, the LBF emerges as

\[\label{eq:Bin-LBF-m10} \mathrm{LBF}_{10} = \log B(\alpha_n,\beta_n) - \log B(\alpha_0,\beta_0) - n \log \left( \frac{1}{2} \right)\]

where the posterior hyperparameters are given by

\[\label{eq:Bin-post-par} \begin{split} \alpha_n &= \alpha_0 + y \\ \beta_n &= \beta_0 + (n-y) \end{split}\]

with the number of trials $n$ and the number of successes $y$.
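
A self-contained sanity check, again with assumed example values, confirms that the difference of the two log model evidences reproduces the closed-form expression for $\mathrm{LBF}_{10}$; exponentiating gives the (non-log) Bayes factor $\mathrm{BF}_{10}$:

```python
# Self-contained check (assumed example values): LME(m_1) - LME(m_0) equals
# the closed-form log Bayes factor from the theorem.
import numpy as np
from scipy.special import gammaln, betaln

y, n = 7, 10                 # assumed data
alpha_0, beta_0 = 1.0, 1.0   # assumed prior hyperparameters
alpha_n, beta_n = alpha_0 + y, beta_0 + (n - y)

log_binom = gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
lme_m1 = log_binom + betaln(alpha_n, beta_n) - betaln(alpha_0, beta_0)
lme_m0 = log_binom + n * np.log(0.5)

lbf_10 = betaln(alpha_n, beta_n) - betaln(alpha_0, beta_0) - n * np.log(0.5)
assert np.isclose(lme_m1 - lme_m0, lbf_10)

bf_10 = np.exp(lbf_10)       # the (non-log) Bayes factor in favor of m_1
```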

Sources:

Metadata: ID: P383 | shortcut: bin-lbf | author: JoramSoch | date: 2022-11-25, 14:40.