There is the probabilistic programming community, which uses clean tools (programming languages) to hand-construct models with many unknown parameters. They use approximate Bayesian methods for inference, and they are slowly improving the efficiency/scalability of those techniques.
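To make that workflow concrete, here is a minimal sketch: a hand-specified prior and likelihood for a toy coin-bias model, with approximate Bayesian inference done by a random-walk Metropolis sampler. It is written in plain NumPy rather than a dedicated PPL like Stan or PyMC just to keep it self-contained; the model, data, and tuning constants are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
flips = rng.binomial(1, 0.7, size=100)          # synthetic data, true bias 0.7

def log_posterior(theta):
    """Log of (uniform prior on [0, 1]) times the Bernoulli likelihood."""
    if not 0.0 < theta < 1.0:
        return -np.inf                           # outside the prior's support
    return np.sum(flips * np.log(theta) + (1 - flips) * np.log(1 - theta))

samples, theta = [], 0.5                         # start at the prior mean
for _ in range(5000):
    proposal = theta + rng.normal(0.0, 0.05)     # random-walk proposal
    # Metropolis rule: accept with probability min(1, posterior ratio)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)

print("posterior mean of the bias:", np.mean(samples[1000:]))  # drop burn-in
```

A real PPL automates exactly the part done by hand here: you write down the model, and the system supplies the (approximate) inference.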
Then there is the neural net & optimization community, which uses general, automated models. It is more ‘frequentist’ (or perhaps just ad hoc), but there are now some Bayesian inroads there too. That community has the most efficient/scalable learning methods, but it isn’t always clear what tradeoffs those methods are making.
And even in the ANN world, you sometimes see Bayesian statistics brought in to justify regularizers or to derive training objectives, as in variational methods. But then, for the actual learning, they take gradients and use SGD, with the understanding that SGD is somehow approximating the Bayesian inference step, or at least doing something close enough.
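Here is a minimal sketch of that pattern, assuming a toy one-parameter Bayesian linear regression: a Gaussian approximate posterior q(w) is fit by running plain SGD on a single-sample negative-ELBO estimate via the reparameterization trick. The model, prior, and learning rate are illustrative, not any particular paper's recipe.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 100)
y = 2.0 * x + 0.5 * torch.randn(100)               # synthetic data, true slope = 2

# Variational parameters of q(w) = Normal(mu, sigma^2)
mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.full((1,), -2.0, requires_grad=True)
opt = torch.optim.SGD([mu, log_sigma], lr=0.01)

noise_var = 1.0                                    # noise variance assumed by the model
for step in range(3000):
    opt.zero_grad()
    sigma = log_sigma.exp()
    w = mu + sigma * torch.randn(1)                # reparameterized sample w ~ q(w)
    nll = 0.5 * ((y - w * x) ** 2).sum() / noise_var
    # KL(q(w) || N(0, 1)); a Gaussian prior like this is what justifies an L2
    # weight penalty in the MAP / point-estimate view
    kl = 0.5 * (sigma ** 2 + mu ** 2) - log_sigma - 0.5
    loss = nll + kl                                # single-sample negative ELBO
    loss.backward()
    opt.step()

print(f"q(w) ~= Normal({mu.item():.2f}, {log_sigma.exp().item():.2f}^2)")
```

The Bayesian content lives entirely in the objective (likelihood plus KL to the prior); the learning step itself is ordinary SGD on that objective, which is the division of labor the paragraph above describes.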