I am well aware that nobody asked for this, but here is the proof that the posterior is $\mathrm{Beta}(\alpha + z,\ \beta - z + 1)$ for the Beta-Bernoulli model.
We start with Bayes' theorem:

$$p(\theta \mid z) = \frac{p(z \mid \theta)\, p(\theta)}{p(z)}$$
Then we plug in the definitions of the Bernoulli likelihood and the Beta prior:

$$p(\theta \mid z) = \theta^{z} (1-\theta)^{1-z} \times \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha,\beta)} \times \frac{1}{p(z)}$$
Let's collect the powers in the numerator, and the things that do not depend on $\theta$ in the denominator:

$$p(\theta \mid z) = \frac{\theta^{\alpha+z-1}(1-\theta)^{\beta-z}}{B(\alpha,\beta)\, p(z)}$$
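If you do not feel like trusting my exponent bookkeeping, a couple of lines of Python will check it numerically (the particular values of $\theta$, $z$, $\alpha$, and $\beta$ below are arbitrary picks, purely for illustration):

```python
# Sanity check: collecting the powers is just exponent arithmetic.
# The values are arbitrary; z must be 0 or 1 for a Bernoulli outcome.
import numpy as np

theta, z, alpha, beta = 0.3, 1, 2.5, 4.0

likelihood_times_prior_kernel = (
    theta**z * (1 - theta)**(1 - z)                  # Bernoulli likelihood
    * theta**(alpha - 1) * (1 - theta)**(beta - 1)   # Beta prior kernel
)
collected = theta**(alpha + z - 1) * (1 - theta)**(beta - z)

assert np.isclose(likelihood_times_prior_kernel, collected)
```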
Here come the conjugacy shenanigans. If you squint, the numerator looks like the numerator of a Beta distribution:

$$\alpha' = \alpha + z, \qquad \beta' = \beta - z + 1$$

$$p(\theta \mid z) = \frac{\theta^{\alpha'-1}(1-\theta)^{\beta'-1}}{B(\alpha,\beta)\, p(z)}$$
Let's continue the shenanigans: since the numerator is exactly the numerator of a Beta distribution, and the posterior has to integrate to 1 over $\theta$, the denominator $B(\alpha,\beta)\,p(z)$ must be exactly the constant that normalizes a $\theta^{\alpha'-1}(1-\theta)^{\beta'-1}$ kernel, namely $B(\alpha',\beta')$. So we can swap the denominator:
$$p(\theta \mid z) = \frac{\theta^{\alpha'-1}(1-\theta)^{\beta'-1}}{B(\alpha',\beta')} = \frac{\theta^{\alpha+z-1}(1-\theta)^{\beta-z}}{B(\alpha+z,\ \beta-z+1)}$$

This is exactly the density of a $\mathrm{Beta}(\alpha+z,\ \beta-z+1)$ distribution, which is what we set out to prove.
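Since nobody asked for a numerical check either, here is one anyway. It is a minimal sketch assuming SciPy and NumPy are available; the prior parameters and the single observation `z` are arbitrary illustrative values. It normalizes the likelihood-times-prior numerator by numerical integration and compares the result against the $\mathrm{Beta}(\alpha+z,\ \beta-z+1)$ density, and also confirms that $p(z) = B(\alpha+z,\ \beta-z+1)/B(\alpha,\beta)$, which is what the denominator swap above implies:

```python
# Numerical check of the Beta-Bernoulli posterior derivation.
# Parameter values are arbitrary choices for illustration only.
import numpy as np
from scipy.integrate import quad
from scipy.special import beta as beta_fn
from scipy.stats import beta as beta_dist

alpha, beta_, z = 2.5, 4.0, 1  # prior parameters and a single Bernoulli observation

def unnormalized_posterior(theta):
    # Bernoulli likelihood times Beta prior density: the numerator of Bayes' theorem
    likelihood = theta**z * (1 - theta)**(1 - z)
    prior = theta**(alpha - 1) * (1 - theta)**(beta_ - 1) / beta_fn(alpha, beta_)
    return likelihood * prior

# p(z) is the normalizing constant: integrate the numerator over theta in [0, 1]
p_z, _ = quad(unnormalized_posterior, 0, 1)

# The normalized posterior should match the claimed Beta(alpha + z, beta - z + 1) density
thetas = np.linspace(0.01, 0.99, 50)
numerical = np.array([unnormalized_posterior(t) for t in thetas]) / p_z
closed_form = beta_dist.pdf(thetas, alpha + z, beta_ - z + 1)
assert np.allclose(numerical, closed_form)

# The marginal likelihood should equal B(alpha + z, beta - z + 1) / B(alpha, beta)
assert np.isclose(p_z, beta_fn(alpha + z, beta_ - z + 1) / beta_fn(alpha, beta_))
print("Beta(alpha + z, beta - z + 1) posterior confirmed numerically")
```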