What do you think about Wald’s complete class theorems, and similar decision-theoretic results, which say that in a fixed frequentist setting the set of admissible procedures coincides (barring messes with infinities) with the set of Bayesian procedures as one ranges over all possible priors? In other words, if you think it makes sense to strive for the “best” procedure in a given context, for some fixed even if unspecified definition of what’s best, and you have a frequentist procedure you think is statistically good, then there must be a corresponding Bayesian prior under which it is (essentially) a Bayes rule.
(This is an argument I’d always like to see addressed as a basic disclaimer in frequentist vs. Bayesian discussions; I think it helps a lot to pin down the framework people are reasoning under, e.g., whether their concern is more practical or more theoretical.)
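To make the statement concrete (my loose paraphrase, glossing over the regularity conditions): take a model X ~ p(x | θ), a loss L(θ, a), and for a decision rule δ define the frequentist risk R(θ, δ) = E_θ[L(θ, δ(X))] and, for a prior π, the Bayes risk r(π, δ) = ∫ R(θ, δ) dπ(θ). A rule δ is admissible if no other rule has risk no larger for every θ and strictly smaller for some θ. The complete class theorems then say, roughly, that under suitable conditions every admissible rule is a Bayes rule (a minimizer of r(π, ·) for some prior π), or at least a limit of Bayes rules / a generalized Bayes rule for an improper prior; those caveats are the “messes with infinities” above. Conversely, Bayes rules for priors with full support are admissible under mild conditions.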
My own opinion on the topic (I’m pro-Bayes):
Many standard frequentist methods can be derived as easily, or more easily, in a Bayesian way; that they are conventionally considered frequentist is an irrelevant accident of history.
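One standard example, just to fix ideas: ridge regression. The penalized least-squares estimate argmin_β { ‖y − Xβ‖² + λ‖β‖² } is exactly the posterior mode of β under a Gaussian likelihood with noise variance σ² and a Gaussian prior β ~ N(0, τ²I), with λ = σ²/τ². The “penalty” derivation and the “prior” derivation are the same calculation, and the Bayesian reading additionally tells you how to interpret λ and how to get uncertainty on β.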
With tree methods, the frequentist version came first, but the Bayesian version, when it arrived, turned out to be better, and is usable in practice.
Practically all real Bayesian methods are not purely Bayesian; there are plenty of ad hockeries. The point is to use Bayes as a guide. Even for an algorithm pulled out of a hat, it’s useful to know whether it has a Bayesian interpretation, because that interpretation makes the algorithm clearer.
ML is frequentist only in the sense of trying out algorithms without fixed rules, and I don’t think that should be counted as a frequentist success! It’s too generic. I have the impression that the mindset of the people in ML who know their shit is closer to Bayesian, but I’m not confident in this, since it’s an indirect impression. Example: the information-theoretic stuff is more natural with Bayes.
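A minimal illustration of what I mean by that last point (my example, nothing deep): the usual cross-entropy objective −Σ_i log q(y_i | x_i) is just a negative log-likelihood, so minimizing it is maximum likelihood; adding weight decay turns it into a log posterior under a Gaussian prior; and the same quantity read as a code length (−log₂ probability = bits needed to encode the outcome) is what an MDL or Bayesian-predictive analysis works with directly. That chain of readings is what I mean by “more natural.”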