See paulfchristiano’s examples elsewhere in this thread.
Another example would be support vector machines, which work really well in practice but aren’t Bayesian (although it’s possible that they are actually Bayesian and I just can’t figure out what prior they correspond to).
There are also neural networks, which are sort of Bayesian but (I think?) not really. I’m not actually that familiar with neural nets (or SVMs for that matter) so I could just be wrong.
ETA: It is the case that every non-dominated decision procedure is either a Bayesian procedure or a limit of Bayesian procedures (which I think could alternately be thought of as a Bayesian procedure with a potentially improper prior). So in that sense, for any frequentist procedure that is not Bayesian even in this limiting sense, there is another procedure that gets at least as much expected utility in every possible world and strictly more in at least one, and is therefore strictly better. The only problem is that this is again an abstract statement about decision procedures, and it doesn’t take into account the computational difficulty of actually finding the better procedure.
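To make the dominance claim concrete, here is a minimal sketch of a toy problem with two possible worlds and three procedures. The procedure names and utility numbers are invented for illustration; they do not come from the thread or from any real estimator.

```python
# Hypothetical expected utilities of three procedures in two possible worlds.
# All names and numbers are invented for illustration.
UTILITIES = {
    "A": (0.9, 0.2),
    "B": (0.3, 0.8),
    "C": (0.25, 0.15),
}


def dominates(u, v):
    """u dominates v: at least as good in every world, strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def non_dominated(utilities):
    """Names of procedures that no other procedure dominates."""
    return [
        name
        for name, u in utilities.items()
        if not any(dominates(v, u) for other, v in utilities.items() if other != name)
    ]


def bayes_choice(utilities, p_world1):
    """Procedure with the highest expected utility under prior P(world 1) = p_world1."""
    def expected(u):
        return p_world1 * u[0] + (1 - p_world1) * u[1]
    return max(utilities, key=lambda name: expected(utilities[name]))


print(non_dominated(UTILITIES))      # procedures that survive the dominance check
print(bayes_choice(UTILITIES, 0.8))  # best procedure if world 1 is likely
print(bayes_choice(UTILITIES, 0.2))  # best procedure if world 2 is likely
```

In this toy problem the dominated procedure C is never the Bayes choice for any prior, while each of the two non-dominated procedures is the Bayes choice for some prior. Checking dominance is cheap here only because the list of procedures is tiny; in a realistic problem the space of procedures is enormous, which is the computational caveat above.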
This paper is the closest I’ve ever seen to a fully Bayesian interpretation of SVMs; mind you, the authors still use “pseudo-likelihood” to describe the data-dependent part of the optimization criterion.
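One way to see why "pseudo-likelihood" is the natural word here: if the soft-margin SVM objective (a regularizer plus a sum of hinge losses) is read as a negative log-posterior, the hinge term would have to act as a negative log-likelihood. The sketch below checks whether exp(-hinge) behaves like a probability over the two labels. This is a rough illustration under that assumed reading, not necessarily the construction the paper uses.

```python
import math


def hinge(margin):
    """Hinge loss on the margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def pseudo_likelihood(y, f):
    """exp(-hinge): the term that would play the role of P(y | f) in a MAP reading."""
    return math.exp(-hinge(y * f))


def total_mass(f):
    """Sum over both labels; a true likelihood would give exactly 1 for every f."""
    return pseudo_likelihood(+1, f) + pseudo_likelihood(-1, f)


for f in (0.0, 1.0, 2.0):
    print(f, round(total_mass(f), 4))
```

The total over the two labels changes with the classifier output f, so no single constant rescales exp(-hinge) into a proper likelihood.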
Neural networks are just a kind of non-linear model. You can do Bayesian inference on them if you want.
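As a rough illustration of treating a neural network as a non-linear model with a prior on its weights, the sketch below computes a posterior over the single weight of a one-unit tanh "network" on a grid. Everything in it is an assumption made for the example: the model, the standard normal prior, the noise level `SIGMA`, and the synthetic noise-free data. Networks of realistic size need approximate methods such as MCMC or variational inference rather than a grid.

```python
import math

SIGMA = 0.1   # assumed observation noise (hypothetical)
W_TRUE = 1.5  # weight used to generate the synthetic data (hypothetical)


def model(w, x):
    """A one-unit 'network': y = tanh(w * x)."""
    return math.tanh(w * x)


# Synthetic, noise-free observations so the example is deterministic.
XS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
YS = [model(W_TRUE, x) for x in XS]


def log_prior(w):
    """Standard normal prior on the weight, up to an additive constant."""
    return -0.5 * w * w


def log_likelihood(w):
    """Gaussian likelihood with standard deviation SIGMA, up to a constant."""
    return sum(-0.5 * ((y - model(w, x)) / SIGMA) ** 2 for x, y in zip(XS, YS))


# Grid approximation: evaluate the unnormalized posterior, then normalize.
GRID = [i * 0.01 for i in range(-300, 301)]
log_post = [log_prior(w) + log_likelihood(w) for w in GRID]
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]  # subtract peak for stability
total = sum(weights)
posterior = [wt / total for wt in weights]

post_mean = sum(w * p for w, p in zip(GRID, posterior))
post_mode = GRID[log_post.index(peak)]
print("posterior mean:", round(post_mean, 3))
print("posterior mode:", round(post_mode, 3))
```

The result is a full distribution over the weight rather than a single fitted value, with the prior pulling the estimate slightly toward zero relative to the weight that generated the data.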