As I understand it, the big difference between Bayesian and frequentist methods is in what they output. A frequentist method gives you a single prediction $z_t$, while a Bayesian method gives you a probability distribution over predictions, $p(z_t)$. If your immediate goal is to minimize a known (or approximable) loss function, then frequentist methods work great. If you want to combine the predictions with other things as part of a larger whole, then you really need to know the uncertainty of your prediction, and ideally you need the entire distribution.
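To spell out why a point prediction suffices in the first case (this is standard decision theory, nothing specific to the discussion above): given the posterior $p(z_t)$, the loss-minimizing prediction is

$$\hat{z}_t = \arg\min_{z} \, \mathbb{E}_{p(z_t)}\big[L(z, z_t)\big],$$

so for a known loss such as squared error this collapses to a single number (the posterior mean), and the rest of the distribution only matters once you want to do something other than minimize $L$.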
For example, when doing OCR, you have a language model of likely words in a text and a detector that tells you which character is present in an image. To combine the two, you multiply the probability of the image containing a certain character by the probability of that character appearing at this point in an English sentence; this is just Bayes' rule, with the detector supplying the likelihood and the language model supplying the prior. Note that I am not saying that you need a fully Bayesian model to detect characters, just that you somehow need to estimate your uncertainty and be able to offer alternative hypotheses.
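Here is a minimal sketch of that kind of fusion. All names and numbers are hypothetical: I am assuming the detector outputs a normalized score per character and the language model gives a context-conditional prior over the same characters.

```python
# Hypothetical per-character scores from the two models
# (illustrative values, not from any particular library).
detector_likelihood = {"c": 0.55, "e": 0.35, "o": 0.10}  # p(image | char)
language_prior      = {"c": 0.05, "e": 0.80, "o": 0.15}  # p(char | context)

# Bayes' rule up to a constant: posterior ∝ likelihood × prior.
unnormalized = {ch: detector_likelihood[ch] * language_prior[ch]
                for ch in detector_likelihood}
total = sum(unnormalized.values())
posterior = {ch: p / total for ch, p in unnormalized.items()}

print(posterior)  # "e" wins despite the detector preferring "c"
```

Even though the detector favors "c", the language-model prior pulls the posterior toward "e", and crucially you still have a full distribution to pass on to the next stage (say, a word-level decoder) rather than a single hard decision.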
In summary, combining multiple models is where Bayesian reasoning shines: you can easily paste multiple models together and expect a sensible result. On the other hand, for getting the best result efficiently, state-of-the-art frequentist methods are hard to beat. And as always, the best approach is to combine the two as appropriate.