Are the inference methods really advanced enough to make this kind of thing practical? I know that in statistics, MCMC methods have long been specialized enough that you have to know what you’re doing to use them even for relatively simple problems (let’s say a 20-parameter linear model with non-normal errors). I have been interested in this field because recently there have been promising algorithms that make inference much easier for problems with continuous variables (in particular, gradient/Hessian-based algorithms analogous to Newton methods in optimization). Probabilistic programs seem like a much more difficult category to work with, likely to get very slow very fast. Am I missing something? What do people hope to accomplish with such a language?
Current universal inference methods are very limited, so the main advantages of using probabilistic programming languages are (1) the conceptual clarity you get by separating generative model and inference and (2) the ability to write down complex nonparametric models and immediately be able to do inference, even if it’s inefficient. Writing a full model+inference implementation in Matlab, say, takes you much longer, is more confusing and less flexible.
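To make (2) concrete, here is a rough Python sketch (a toy of my own, not any particular language’s actual API): the model is just a function that samples its latent variables from their priors, and one generic likelihood-weighting routine does the (inefficient) inference for any model written in that style.

```python
import math
import random

# Toy "probabilistic program": a function that samples its latent variables
# from their priors. (Hypothetical sketch, not a specific PPL's API.)
def model():
    mu = random.gauss(0.0, 10.0)       # prior on the mean
    sigma = random.expovariate(1.0)    # prior on the noise scale
    return {"mu": mu, "sigma": sigma}

def log_likelihood(params, data):
    # iid Gaussian likelihood of the observed data under the sampled parameters
    mu, sigma = params["mu"], max(params["sigma"], 1e-6)
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) for x in data)

def posterior_mean_mu(data, n_samples=20000):
    # Generic likelihood-weighted ("importance sampling") inference: it only
    # needs to run the model forward and score the data, so the same routine
    # works unchanged for any model written in this style. That is the
    # "inference for free", even though it can be extremely inefficient.
    samples = [model() for _ in range(n_samples)]
    weights = [math.exp(log_likelihood(s, data)) for s in samples]
    total = sum(weights) or 1.0
    return sum(w * s["mu"] for w, s in zip(weights, samples)) / total

print(posterior_mean_mu([1.2, 0.8, 1.1, 0.9]))
```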
That said, some techniques that were developed for particular classes of problems have a useful analog in the setting of programs. The gradient-based methods you mention have been generalized to work on any probabilistic program with continuous parameters.
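As a rough illustration of what that generalization looks like (again just a sketch of my own, not a particular system’s implementation): treat the program’s log posterior over its continuous parameters as a black-box function, get its gradient numerically where a real system would differentiate the program automatically, and use a Langevin-style drift-plus-noise proposal with a Metropolis-Hastings correction. All names and the toy data below are made up for illustration.

```python
import math
import random

# Toy sketch: the log posterior over the program's continuous parameters
# theta = (mu, log_sigma) is treated as a black box; the gradient is taken by
# finite differences here, standing in for automatic differentiation.
DATA = [1.2, 0.8, 1.1, 0.9]

def log_post(theta):
    mu, log_sigma = theta
    sigma = math.exp(log_sigma)
    log_prior = -0.5 * (mu / 10.0) ** 2 - 0.5 * log_sigma ** 2  # N(0,10) on mu, N(0,1) on log_sigma
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 - log_sigma for x in DATA)
    return log_prior + log_lik

def grad(f, theta, eps=1e-5):
    # central finite differences standing in for automatic differentiation
    g = []
    for i in range(len(theta)):
        hi, lo = list(theta), list(theta)
        hi[i] += eps
        lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g

def mala_step(theta, step=0.05):
    # Langevin proposal: drift along the gradient, add noise, then apply a
    # Metropolis-Hastings correction for the proposal's asymmetry.
    g = grad(log_post, theta)
    prop = [t + 0.5 * step * gi + math.sqrt(step) * random.gauss(0, 1)
            for t, gi in zip(theta, g)]
    g_prop = grad(log_post, prop)

    def log_q(to, frm, g_frm):   # log density of proposing `to` from `frm`
        return sum(-((t - f - 0.5 * step * gf) ** 2) / (2 * step)
                   for t, f, gf in zip(to, frm, g_frm))

    log_alpha = (log_post(prop) + log_q(theta, prop, g_prop)
                 - log_post(theta) - log_q(prop, theta, g))
    return prop if math.log(random.random()) < log_alpha else theta

theta = [0.0, 0.0]
for _ in range(2000):
    theta = mala_step(theta)
print(theta)
```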
Interesting, I suppose that does seem somewhat useful, for discussion purposes at the very least. I am curious about how a gradient-based method can work without continuous parameters: that is counterintuitive to me. Can you throw out some keywords? Keywords for what I was talking about: Metropolis-adjusted Langevin algorithm (MALA), stochastic Newton, any MCMC with ‘Hessian’ in the name.
They don’t work without continuous parameters. If you have a probabilistic program that includes both discrete and continuous parameters, you can use gradient methods to generate Metropolis-Hastings (MH) proposals for your continuous parameters. I don’t think there are any publications that discuss this yet.
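A minimal sketch of that kind of blocking (my own toy construction, not taken from any publication): a discrete switch z and a continuous parameter mu are updated in alternation, with a plain MH flip for the discrete block and a gradient-informed (Langevin-style) proposal for the continuous block.

```python
import math
import random

# Toy blocked sampler: a discrete switch z chooses the noise scale, mu is
# continuous. The discrete block gets a plain MH flip; the continuous block
# gets a gradient-informed (Langevin) proposal with an MH correction.
DATA = [1.2, 0.8, 3.1, 0.9]
SCALES = [0.5, 2.0]                      # noise scale selected by z

def log_post(z, mu):
    log_prior = math.log(0.5) - 0.5 * (mu / 10.0) ** 2    # uniform z, N(0, 10) mu
    sigma = SCALES[z]
    return log_prior + sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)
                           for x in DATA)

def dlogpost_dmu(z, mu):
    sigma = SCALES[z]
    return -mu / 100.0 + sum((x - mu) / sigma ** 2 for x in DATA)

def step(z, mu, eps=0.05):
    # Discrete block: propose flipping z, accept with the usual MH ratio.
    z_new = 1 - z
    if math.log(random.random()) < log_post(z_new, mu) - log_post(z, mu):
        z = z_new
    # Continuous block: Langevin proposal for mu, given the current z.
    g = dlogpost_dmu(z, mu)
    mu_new = mu + 0.5 * eps * g + math.sqrt(eps) * random.gauss(0, 1)
    g_new = dlogpost_dmu(z, mu_new)
    log_q_fwd = -((mu_new - mu - 0.5 * eps * g) ** 2) / (2 * eps)
    log_q_bwd = -((mu - mu_new - 0.5 * eps * g_new) ** 2) / (2 * eps)
    log_alpha = log_post(z, mu_new) + log_q_bwd - log_post(z, mu) - log_q_fwd
    if math.log(random.random()) < log_alpha:
        mu = mu_new
    return z, mu

z, mu = 0, 0.0
for _ in range(2000):
    z, mu = step(z, mu)
print(z, mu)
```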
Oh, ok, that makes perfect sense. Breaking inference problems into subproblems and using different methods on the subproblems seems like a common technique.
Writing a full model+inference implementation in Matlab, say, takes you much longer, is more confusing and less flexible.
Defining and implementing a whole programming language is way more work than writing a library in an existing language. A library, after all, is a language, but one you don’t have to write a parser, interpreter, or compiler for.
I was comparing the two choices people face who want to do inference in nontrivial models. You can either write the model in an existing probabilistic programming language and get inefficient inference for free, or you can write model+inference in something like Matlab. Here, you may be able to use libraries if your model is similar enough to existing models, but for many interesting models this is not the case.
Ok, I was comparing the effort at the meta-level: providing tools for computing in a given subject area either by making a new language, or by making a new library in an existing language.