The key idea behind Church and similar languages is that they allow us to express and formally reason about a large class of probabilistic models, many of which cannot be formalized in any concise way as Bayes nets.
Bayes nets express generative models, i.e. processes that generate data. To infer the states of hidden variables from observations, you condition the Bayes net and compute a distribution on the hidden variable settings using Bayesian inference or some approximation thereof. A particularly popular class of approximations is the class of sampling algorithms, e.g. Markov Chain Monte Carlo methods (MCMC) and importance sampling.
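To make the conditioning step concrete, here is a minimal sketch (in Python, with made-up probabilities) of conditioning a two-node Bayes net by rejection sampling, the simplest member of the sampling family:

```python
import random

# Toy Bayes net: rain -> wet_grass. We observe wet grass, infer rain.
# The probabilities are hypothetical, chosen only for illustration.
def sample_joint():
    rain = random.random() < 0.2
    wet = random.random() < (0.9 if rain else 0.1)
    return rain, wet

def posterior_rain_given_wet(n=100_000):
    # Rejection sampling: draw from the generative process, keep only
    # the samples consistent with the observation (wet grass).
    accepted = [rain for rain, wet in (sample_joint() for _ in range(n)) if wet]
    return sum(accepted) / len(accepted)

# Exact answer for comparison:
# P(rain | wet) = 0.2*0.9 / (0.2*0.9 + 0.8*0.1) = 0.18/0.26 ≈ 0.692
print(posterior_rain_given_wet())
```

Rejection sampling is hopeless for unlikely observations; MCMC and importance sampling exist precisely to do better than this baseline.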
Probabilistic programs express a larger class of models, but very similar approximate inference algorithms can be used to condition a program on observations and to infer the states of hidden variables. In both machine learning and cognitive science, when you are doing Bayesian inference with some model that expresses your prior belief, you usually code both model and inference algorithm and make use of problem-specific approximations to Bayesian inference. Probabilistic programs separate model from inference by using universal inference algorithms.
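As an illustration of both points (a sketch in Python, not Church itself; the names `geometric` and `query` stand in for Church's sampling semantics and its generic query operator): the model below is a recursively defined distribution with unbounded support, so it has no finite Bayes-net rendering, yet one generic inference routine conditions it without knowing anything about its internals:

```python
import random

def geometric(p=0.5):
    # A recursively defined distribution: its support is unbounded,
    # so no fixed, finite Bayes net expresses it.
    return 1 if random.random() < p else 1 + geometric(p)

def query(model, condition, n=50_000):
    # "Universal" inference by rejection: works for any model/condition
    # pair, with no knowledge of the model's internal structure.
    return [x for x in (model() for _ in range(n)) if condition(x)]

# Posterior over a geometric draw, conditioned on the draw being even.
posterior_samples = query(geometric, lambda k: k % 2 == 0)
```

The separation is the point: `geometric` could be swapped for any other sampling procedure and `query` would still apply unchanged.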
If you are interested in this set of ideas in the context of cognitive science, I recommend this interactive tutorial.
Church is based on Lisp. At the lowest level, it replaces Boolean gates with stochastic digital circuits. These circuits are wired together to form Markov chains (the probabilistic counterpart of finite state machines). At the top, it's possible to define probabilistic procedures for generating samples from recursively defined distributions.
This confuses Church as a language for expressing generative models with ideas on how to implement such a language in hardware. There are three different ideas here:
Church as a language for generative models with sampling-based semantics
MCMC as an approximate inference method for such models (that can be implemented on traditional von Neumann architectures)
Machine architectures that are well-suited for MCMC
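For the middle point, a minimal Metropolis-Hastings sketch in plain Python (i.e., running on a conventional von Neumann machine); the standard-normal target is only an illustration:

```python
import math
import random

def metropolis_hastings(log_p, x0, steps=20_000, scale=1.0):
    # Generic MCMC: needs only an unnormalized log-density,
    # not the structure of the model that produced it.
    x, chain = x0, []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, scale)
        # Accept with probability min(1, p(proposal)/p(x)).
        if random.random() < math.exp(min(0.0, log_p(proposal) - log_p(x))):
            x = proposal
        chain.append(x)
    return chain

# Target: a standard normal, known only up to a normalizing constant.
chain = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0)
```

The third idea is then the observation that this inner loop (propose, compare, accept) is simple and parallel enough to be a candidate for direct hardware support.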
So this is an open call for volunteers—any brave Bayesians want to blog about a brand new computer language?
I’ll write an exposition within the next few weeks if people are interested.
Thank you so much.
I’m sorry I was confused, and I’m glad someone is around who knows more.
And thank you so much for the tutorial! This is proving to be much clearer than the papers. You are a prince.
you usually code both model and inference algorithm and make use of problem-specific approximations to Bayesian inference. Probabilistic programs separate model from inference by using universal inference algorithms.
Naively, separation of model from inference algorithm seems like a terrible idea. People use problem-specific approximation algorithms because if they don’t exploit specific features of the model, inference will be completely intractable. Often this consideration is so important that people will use obviously sub-optimal models, if those models support fast inference.
I’ll write an exposition within the next weeks if people are interested.
Do eet.
Naively, separation of model from inference algorithm seems like a terrible idea. People use problem-specific approximation algorithms because if they don’t exploit specific features of the model, inference will be completely intractable. Often this consideration is so important that people will use obviously sub-optimal models, if those models support fast inference.
Yes, deriving mechanisms that take complex models and turn them into something tractable is mostly an open problem.
Please do.