I am well aware that nobody asked for this, but here is the proof that the posterior is $\mathrm{Beta}(\alpha + z,\ \beta - z + 1)$ for the Beta-Bernoulli model.
We start with Bayes' theorem:

$$p(\theta \mid z) = \frac{p(z \mid \theta)\, p(\theta)}{p(z)}$$
Then we plug in the definitions of the Bernoulli likelihood and the Beta prior:

$$p(\theta \mid z) = \theta^{z} (1-\theta)^{1-z} \times \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha,\beta)} \times \frac{1}{p(z)}$$
Let's collect the powers in the numerator, and the things that do not depend on $\theta$ in the denominator:

$$p(\theta \mid z) = \frac{\theta^{\alpha+z-1}(1-\theta)^{\beta-z}}{B(\alpha,\beta)\, p(z)}$$
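If you do not feel like trusting my exponent bookkeeping, a couple of lines of Python will check it numerically (the particular values of $\theta$, $z$, $\alpha$, and $\beta$ below are arbitrary picks, purely for illustration):

```python
# Sanity check: collecting the powers is just exponent arithmetic.
# The values are arbitrary; z must be 0 or 1 for a Bernoulli outcome.
import numpy as np

theta, z, alpha, beta = 0.3, 1, 2.5, 4.0

likelihood_times_prior_kernel = (
    theta**z * (1 - theta)**(1 - z)                  # Bernoulli likelihood
    * theta**(alpha - 1) * (1 - theta)**(beta - 1)   # Beta prior kernel
)
collected = theta**(alpha + z - 1) * (1 - theta)**(beta - z)

assert np.isclose(likelihood_times_prior_kernel, collected)
```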
Here come the conjugacy shenanigans. If you squint, the numerator looks like the numerator of a Beta distribution:

$$\alpha' = \alpha + z, \qquad \beta' = \beta - z + 1$$

$$p(\theta \mid z) = \frac{\theta^{\alpha'-1}(1-\theta)^{\beta'-1}}{B(\alpha,\beta)\, p(z)}$$
Let's continue the shenanigans: since the numerator is exactly the numerator of a Beta distribution, and the posterior has to integrate to 1 over $\theta$, the denominator $B(\alpha,\beta)\,p(z)$ must be exactly the constant that normalizes a $\theta^{\alpha'-1}(1-\theta)^{\beta'-1}$ kernel, namely $B(\alpha',\beta')$. So we can swap the denominator:
$$p(\theta \mid z) = \frac{\theta^{\alpha'-1}(1-\theta)^{\beta'-1}}{B(\alpha',\beta')} = \frac{\theta^{\alpha+z-1}(1-\theta)^{\beta-z}}{B(\alpha+z,\ \beta-z+1)}$$

This is exactly the density of a $\mathrm{Beta}(\alpha+z,\ \beta-z+1)$ distribution, which is what we set out to prove.
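Since nobody asked for a numerical check either, here is one anyway. It is a minimal sketch assuming SciPy and NumPy are available; the prior parameters and the single observation `z` are arbitrary illustrative values. It normalizes the likelihood-times-prior numerator by numerical integration and compares the result against the $\mathrm{Beta}(\alpha+z,\ \beta-z+1)$ density, and also confirms that $p(z) = B(\alpha+z,\ \beta-z+1)/B(\alpha,\beta)$, which is what the denominator swap above implies:

```python
# Numerical check of the Beta-Bernoulli posterior derivation.
# Parameter values are arbitrary choices for illustration only.
import numpy as np
from scipy.integrate import quad
from scipy.special import beta as beta_fn
from scipy.stats import beta as beta_dist

alpha, beta_, z = 2.5, 4.0, 1  # prior parameters and a single Bernoulli observation

def unnormalized_posterior(theta):
    # Bernoulli likelihood times Beta prior density: the numerator of Bayes' theorem
    likelihood = theta**z * (1 - theta)**(1 - z)
    prior = theta**(alpha - 1) * (1 - theta)**(beta_ - 1) / beta_fn(alpha, beta_)
    return likelihood * prior

# p(z) is the normalizing constant: integrate the numerator over theta in [0, 1]
p_z, _ = quad(unnormalized_posterior, 0, 1)

# The normalized posterior should match the claimed Beta(alpha + z, beta - z + 1) density
thetas = np.linspace(0.01, 0.99, 50)
numerical = np.array([unnormalized_posterior(t) for t in thetas]) / p_z
closed_form = beta_dist.pdf(thetas, alpha + z, beta_ - z + 1)
assert np.allclose(numerical, closed_form)

# The marginal likelihood should equal B(alpha + z, beta - z + 1) / B(alpha, beta)
assert np.isclose(p_z, beta_fn(alpha + z, beta_ - z + 1) / beta_fn(alpha, beta_))
print("Beta(alpha + z, beta - z + 1) posterior confirmed numerically")
```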