I only barely mentioned it in my post, but there are ways of approximating Bayesian inference, such as MCMC. In fact, some of these methods can take advantage of stochastic gradient information, which should make them roughly as efficient as SGD.
There is also a recent paper from DeepMind, "Weight Uncertainty in Neural Networks".
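
To make the stochastic gradient idea concrete, here is a minimal sketch of one such method, stochastic gradient Langevin dynamics (SGLD). I didn't name a specific method above, so treat SGLD as my choice of example. The toy model (Gaussian data with unit variance, a Gaussian prior on the mean), the function name `sgld_gaussian_mean`, and all the hyperparameters are assumptions made for illustration. Each step looks like a minibatch SGD step on the log posterior, plus Gaussian noise whose variance equals the step size.

```python
import random


def sgld_gaussian_mean(data, n_steps=20000, batch_size=50, step_size=1e-5,
                       prior_var=100.0, burn_in=2000, seed=0):
    """Draw approximate posterior samples of the mean of unit-variance Gaussian data.

    Hypothetical toy setup: x_i ~ N(theta, 1), prior theta ~ N(0, prior_var).
    """
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for step in range(n_steps):
        # Minibatch drawn without replacement, exactly as in SGD.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior with respect to theta.
        grad_log_prior = -theta / prior_var
        # Minibatch gradient of the log likelihood, rescaled to the full data set.
        grad_log_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        # Langevin step: half a step along the gradient plus N(0, step_size) noise.
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_log_prior + grad_log_lik) + noise
        if step >= burn_in:
            samples.append(theta)
    return samples


# Synthetic data: 1000 draws from N(2, 1).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_gaussian_mean(data)
sgld_mean = sum(samples) / len(samples)

# For this conjugate model the exact posterior mean is known in closed form,
# which gives something to compare the SGLD estimate against.
exact_mean = sum(data) / (len(data) + 1.0 / 100.0)
print(sgld_mean, exact_mean)
```

The per-step cost is one minibatch gradient, the same as SGD, which is where the efficiency claim comes from. With a constant step size the samples are only approximately from the posterior, so the step size here is kept small relative to the posterior variance.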