The upshot is that there are two things going on here that interact to produce the shattering phenomenon. First, the notion of closeness permits some very pathological models to be considered close to sensible models. Second, the optimization to find the worst-case model close to the assumed model is done in a post-data way, not in prior expectation. So what you get is this: for any possible observed data and any model, there is a model “close” to the assumed one that predicts absolute disaster (or any result) just for that specific data set, and is otherwise well-behaved.
As the authors themselves put it:
The mechanism causing this “brittleness” has its origin in the fact that, in classical Bayesian Sensitivity Analysis, optimal bounds on posterior values are computed after the observation of the specific value of the data, and that the probability of observing the data under some feasible prior may be arbitrarily small… This data dependence of worst priors is inherent to this classical framework and the resulting brittleness under finite-information can be seen as an extreme occurrence of the dilation phenomenon (the fact that optimal bounds on prior values may become less precise after conditioning) observed in classical robust Bayesian inference.
I like it when I can just point folks to something I’ve already written.
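To see the mechanism in miniature, here is a toy numerical sketch (my own construction, not the paper's setup): the nominal model is X ~ Normal(θ, 1) with prior θ ~ Normal(0, 1), and the adversary, after seeing the data, mixes in prior weight `eps` on a component that asserts a disastrous value `q_disaster` for the quantity of interest while generating data concentrated almost exactly at the observed set. All names and numbers (`eps`, `q_disaster`, the `1e-3` concentration) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.3, 1.0, size=10)  # the one specific observed data set

def npdf(v, mu, sigma):
    """Normal density, vectorized."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Monte Carlo posterior mean of Q = theta under the nominal model.
theta = rng.normal(0.0, 1.0, size=200_000)
lik = np.prod(npdf(x[None, :], theta[:, None], 1.0), axis=1)
q_nominal = np.sum(theta * lik) / np.sum(lik)

# Adversarial component, built AFTER seeing x: it claims Q = q_disaster
# but generates data piled up almost exactly on the observed x, so its
# likelihood for THIS data set is astronomically large.  Mixing it in
# with weight eps moves the prior predictive by at most eps in total
# variation: "close" before the data arrive, hijacked after.
eps, q_disaster = 1e-6, 1e6
lik_bad = np.prod(npdf(x, x, 1e-3))  # ~1e26 for these ten points

num = (1 - eps) * np.mean(theta * lik) + eps * q_disaster * lik_bad
den = (1 - eps) * np.mean(lik) + eps * lik_bad
print(f"nominal posterior E[Q]   = {q_nominal:.3f}")     # ~0.3
print(f"perturbed posterior E[Q] = {num / den:.3g}")     # ~1e6
```

For any other data set the bad component's likelihood is essentially zero, so the perturbed posterior agrees with the nominal one: the perturbation is pathological only at the observed data, which is exactly the post-data optimization the quote describes.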