To be a Bayesian in the purest sense is very demanding. You must articulate not only a basic model for the structure of the data and the distribution of the errors around it (as in a regression model), but also all of your further uncertainty about each of those parts. If you have some sliver of doubt that the errors might have a slight serial correlation, that has to be expressed as part of your prior before you look at any data. If you think the model for the structure might not be a line, but might be better expressed as an ordinary differential equation with a somewhat exotic expression for dy/dx, then that had better be built in with appropriate prior mass too. And you'd better not do this just for the three or four leading possible modifications, but for every one to which you assign any prior mass, and don't forget uncertainty about that uncertainty, on up the hierarchy. Only then can the posterior computation, which is by now rather computationally demanding, deliver your true posterior.
Since this is so difficult, practitioners often fall short somewhere. Maybe they compute the posterior from the simple form of their prior, then build in one complication, compute a posterior for that, compare the two, and, if they look similar enough, conclude that building in further complications is unnecessary. Or maybe… gasp… they look at residuals. Such behavior is often a violation of the (full) likelihood principle, because the principle demands that all the probability densities be laid out explicitly and that we obtain information only from ratios of those densities.
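To make the first shortcut concrete, here is a minimal sketch of my own (not from the original discussion): fit a simple line with a conjugate normal prior, fit one elaborated version with a quadratic term added, and eyeball whether the posterior for the quantity of interest moves. The data, the N(0, 100 I) prior, and the assumption of a known noise variance are all illustrative simplifications.

```python
# Sketch of the "build in one complication and compare" shortcut.
# All settings here (simulated data, vague prior, known sigma^2) are assumptions
# chosen to keep the conjugate algebra simple.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = np.linspace(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
sigma2 = 0.25  # treated as known for simplicity

def conjugate_posterior(X, y, sigma2, prior_var=100.0):
    """Posterior mean and covariance of the coefficients under a N(0, prior_var * I) prior."""
    prior_prec = np.eye(X.shape[1]) / prior_var
    cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
    mean = cov @ (X.T @ y / sigma2)
    return mean, cov

# Simple model: straight line.
X_simple = np.column_stack([np.ones(n), x])
m1, C1 = conjugate_posterior(X_simple, y, sigma2)

# One complication: add a quadratic term.
X_quad = np.column_stack([np.ones(n), x, x**2])
m2, C2 = conjugate_posterior(X_quad, y, sigma2)

# If the slope posterior barely moves and the extra coefficient hugs zero,
# the practitioner concludes (informally) that further elaboration is unneeded.
print("slope, simple model:   %.3f +/- %.3f" % (m1[1], np.sqrt(C1[1, 1])))
print("slope, with quadratic: %.3f +/- %.3f" % (m2[1], np.sqrt(C2[1, 1])))
print("quadratic coefficient: %.3f +/- %.3f" % (m2[2], np.sqrt(C2[2, 2])))
```

Nothing in this comparison is itself a prior probability statement about model elaborations, which is exactly why a purist would object.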
So pragmatic Bayesians will still look at the residuals (Box, 1980).
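And here is a correspondingly minimal sketch of what "looking at the residuals" might mean in practice, again my own illustration rather than Box's procedure: fit the simple iid-error regression, then check the residuals around the posterior-mean fit for the serial correlation that the prior never anticipated.

```python
# Sketch of the pragmatic residual check. The simulated AR(1) errors, the vague
# N(0, 100 I) prior, and the known sigma^2 are all illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: a line plus errors that are in fact mildly autocorrelated.
n = 200
x = np.linspace(0.0, 10.0, n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.4 * eps[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + eps

# Conjugate normal posterior for the coefficients of the simple iid-error model.
X = np.column_stack([np.ones(n), x])
sigma2 = 0.25
prior_prec = np.eye(2) / 100.0
post_cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
post_mean = post_cov @ (X.T @ y / sigma2)

# The post-hoc look: lag-1 autocorrelation of residuals around the posterior-mean fit.
resid = y - X @ post_mean
lag1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(f"lag-1 residual autocorrelation: {lag1:.2f}")
# A value well away from zero suggests the iid-error assumption was wrong,
# even though nothing in the prior anticipated serial correlation.
```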