Agreed about chaos, missing data, time series, and noise, but I think the next is off the mark:
Model/abstraction error? See everything under the heading of ‘model checking’ and things like model-averaging; local favorite Bayesian statistician Andrew Gelman is very active in this area, no doubt he would be quite surprised to learn that he is misapplying Bayesian methods in that area.
He might be surprised to be described as applying Bayesian methods at all in that area. Model checking, in his view, is an essential part of “Bayesian data analysis”, but it is not itself carried out by Bayesian methods. The strictly Bayesian part—that is, the application of Bayes’ theorem—ends with the computation of the posterior distribution of the model parameters given the priors and the data. Model-checking must (he says) be undertaken by other means because the truth may not be in the support of the prior, a situation in which the strict Bayesian is lost. From “Philosophy and the practice of Bayesian statistics”, by Gelman and Shalizi (my emphasis):
In contrast, Bayesian statistics or “inverse probability”—starting with a prior distribution, getting data, and moving to the posterior distribution—is associated with an inductive approach of learning about the general from particulars. Rather than testing and attempted falsification, learning proceeds more smoothly: an accretion of evidence is summarized by a posterior distribution, and scientific process is associated with the rise and fall in the posterior probabilities of various models …. We think most of this received view of Bayesian inference is wrong.
...
To reiterate, it is hard to claim that the prior distributions used in applied work represent statisticians’ states of knowledge and belief before examining their data, if only because most statisticians do not believe their models are true, so their prior degree of belief in all of Θ is not 1 but 0.
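To pin down where the “strictly Bayesian part” ends on this account, a toy conjugate update may help (my own illustration, not from the paper): prior and data go in, a posterior over the model’s parameters comes out, and everything after that, they hold, must be done by other means.

```python
# Toy illustration (mine, not from the paper) of the "strictly Bayesian part"
# of an analysis: prior + data -> posterior over a model's parameters.
# Model: coin with unknown bias theta, Beta(2, 2) prior, Binomial data.

a, b = 2.0, 2.0        # Beta(2, 2) prior on theta
heads, flips = 7, 10   # observed data

# Conjugate update: Beta prior + Binomial likelihood -> Beta posterior.
a_post, b_post = a + heads, b + (flips - heads)

print(f"posterior over theta: Beta({a_post}, {b_post})")
print(f"posterior mean: {a_post / (a_post + b_post):.3f}")

# Bayes' theorem has now done all it can *within* this model.  Whether the
# Binomial model itself is adequate for the data is, on Gelman and Shalizi's
# account, a separate question that this update does not answer.
```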
If anyone’s itching to say “what about universal priors?”, Gelman and Shalizi say that in practice there is no such thing. The idealised picture of Bayesian practice, in which the prior density is non-zero everywhere and successive models come into favour or pass out of favour by nothing more than updating from data by Bayes’ theorem, is, they say, unworkable.
The main point where we disagree with many Bayesians is that we do not see Bayesian methods as generally useful for giving the posterior probability that a model is true, or the probability for preferring model A over model B, or whatever.
They liken the process to Kuhnian paradigm-shifting:
In some way, Kuhn’s distinction between normal and revolutionary science is analogous to the distinction between learning within a Bayesian model, and checking the model as preparation to discard or expand it.
but find Popperian hypothetico-deductivism a closer fit:
In our hypothetico-deductive view of data analysis, we build a statistical model out of available parts and drive it as far as it can take us, and then a little farther. When the model breaks down, we dissect it and figure out what went wrong. For Bayesian models, the most useful way of figuring out how the model breaks down is through posterior predictive checks, creating simulations of the data and comparing them to the actual data. The comparison can often be done visually; see Gelman et al. (2003, ch. 6) for a range of examples. Once we have an idea about where the problem lies, we can tinker with the model, or perhaps try a radically new design. Either way, we are using deductive reasoning as a tool to get the most out of a model, and we test the model—it is falsifiable, and when it is consequentially falsified, we alter or abandon it.
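For concreteness, the kind of posterior predictive check they describe might look like the following toy sketch (the Normal model, the simulated heavy-tailed “data”, and the max-absolute-value test statistic are my own illustrative choices, not taken from the paper):

```python
# Minimal sketch (my own toy setup, not from the paper) of a posterior
# predictive check: simulate replicated datasets from the fitted model and
# compare a test statistic on them against the observed data.

import numpy as np

rng = np.random.default_rng(0)

# "Observed" data: heavy-tailed, but we will (wrongly) fit a Normal model.
y = rng.standard_t(df=2, size=100)

# Posterior for the Normal model's mean with a flat prior, treating sigma
# as known (a simplification for the sketch).
n, sigma = len(y), y.std(ddof=1)
post_mean, post_sd = y.mean(), sigma / np.sqrt(n)

def T(data):
    """Test statistic: the most extreme observation, which a Normal fit
    tends to under-predict for heavy-tailed data."""
    return np.max(np.abs(data))

t_obs = T(y)

# Draw replicated datasets y_rep from the posterior predictive distribution.
t_rep = []
for _ in range(4000):
    mu = rng.normal(post_mean, post_sd)      # draw mu from its posterior
    y_rep = rng.normal(mu, sigma, size=n)    # simulate a dataset given mu
    t_rep.append(T(y_rep))

# Posterior predictive p-value: fraction of replications at least as extreme
# as the observed statistic.  Values near 0 or 1 flag misfit.
p = np.mean(np.array(t_rep) >= t_obs)
print(f"observed T = {t_obs:.2f}, posterior predictive p = {p:.3f}")
```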
For Gelman and Shalizi, model checking is an essential part of Bayesian practice, not because it is a Bayesian process but because it is a necessarily non-Bayesian supplement to the strictly Bayesian part: Bayesian data analysis cannot proceed by Bayes alone. Bayes proposes; model-checking disposes.
I’m not a statistician and do not wish to take a view on this. But I believe I have accurately stated their view. The paper contains some references to other statisticians who, they say, are more in favour of universal Bayesianism, but I have not read them.
Model-checking must (he says) be undertaken by other means because the truth may not be in the support of the prior, a situation in which the strict Bayesian is lost.
Loath as I am to disagree with Gelman & Shalizi, I’m not convinced that the sort of model-checking they advocate, such as posterior predictive p-values, is non-Bayesian in any fundamental, in-principle way rather than merely for practical reasons. I mostly agree with “Posterior predictive checks can and should be Bayesian: Comment on Gelman and Shalizi, ‘Philosophy and the practice of Bayesian statistics’”, Kruschke 2013: I don’t see why that sort of procedure cannot be subsumed into an ensemble of more flexible and general models, with poor fits of particular parametric models detected automatically and the posterior shifted onto more complex but better-fitting models.

If we fit one model and find that it is a bad model, then the root problem was that we were only looking at one model when we knew there were many other models, but out of laziness or limited computation we discarded them all. You might say that when we do an informal posterior predictive check, what we are doing is a Bayesian model comparison of one or two explicit models against the models generated by a large multi-layer network of sigmoids (specifically <80 billion of them)… If you’re running into problems because your model-space is too narrow, expand it! Models should be able to grow (this is a common feature of Bayesian nonparametrics).
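As a toy numerical sketch of what I mean by letting the posterior shift between models rather than informally checking a single one (my own illustrative example; a real version would use a much richer, ideally nonparametric, ensemble):

```python
# Toy sketch (mine): put more than one model into the comparison and let
# Bayes' theorem move posterior weight between them, instead of fitting a
# single model and eyeballing its misfit.

from math import comb

heads, flips = 9, 10   # data that a fair-coin model explains poorly

# Model A: the coin is exactly fair (theta = 0.5).
# Marginal likelihood = Binomial(heads | flips, 0.5).
ml_A = comb(flips, heads) * 0.5**flips

# Model B: theta unknown, with a Uniform(0, 1) prior.
# The Beta-Binomial marginal likelihood under a uniform prior is 1 / (flips + 1).
ml_B = 1.0 / (flips + 1)

# Equal prior weight on the two models; Bayes' theorem over the ensemble.
prior_A = prior_B = 0.5
post_A = prior_A * ml_A / (prior_A * ml_A + prior_B * ml_B)
print(f"P(fair coin | data) = {post_A:.3f}")  # ~0.10: weight shifts to the richer model
```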
This may be hard in practice, but then it’s just another example of how we must compromise our ideals because of our limits, not a fundamental limitation on a theory or paradigm.