My immediate reaction to the notched distributions that they use to exemplify their results is that it’s cheating, as indeed everyone says, including the authors themselves. The priors giving pathological posteriors are chosen in response to the data. Any measure of closeness that puts these distributions close to the un-notched distribution is a silly measure of their suitability as priors. However, I don’t have a mathematical expression of what the right measure would be, and no one I’ve seen commenting has explicitly set out a reason for dismissing these “posterior priors”, although Entsophy of course does dismiss them on the blog page Cyan linked. (Whatever happened to Entsophy, BTW?) In fact, the authors defend these priors against the charge of disreputability by arguing that varying the prior in response to the data is exactly what is done in Bayesian sensitivity analysis.
If, instead of an example illustrating their theorems, I imagine a real-world scenario of someone actually using one of these notched distributions, I get something like this:
I pray to God to show himself by a miracle, then use a quantum mechanical device to generate a string of one million random digits R. I look at these digits and construct a prior such that P(God|R) is high, while P(God|R’) is low for all other R’. This prior is very close by various measures to the one that assigns uniformly low probability to the existence of God whatever string of digits I got.
Something is going wrong here, but I don’t think it’s Bayesian inference. Russell’s teapot seems relevant.
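To make the mechanics concrete, here is a minimal numerical sketch of the thought experiment above. The string is shortened from a million digits to twenty so the arithmetic stays finite, and the prior mass `eps` on the miracle hypothesis, like every other number here, is my own illustrative choice, not anything from the paper:

```python
from math import log10

# Toy version of the miracle scenario above. All names and numbers are
# my own illustrative choices, not the authors' construction.
#
# Hypotheses: G = 1 ("God exists") with tiny prior mass eps, else G = 0.
# Data: a string R of n random digits, observed as R_obs.
#
# Honest prior: R is uniform over digit strings under both hypotheses,
# so observing R teaches nothing and P(G=1 | R) = eps for every R.
#
# Rigged prior, built after seeing R_obs: under G = 1 the miracle is
# defined to be exactly R_obs, i.e. P(R_obs | G=1) = 1.

eps = 1e-6         # prior probability of G = 1
n = 20             # digits in the string (a million in the text)
log_p_string = -n  # log10 of P(any fixed string) = 10**(-n) under uniformity

# Posterior odds of G = 1 given R_obs under the rigged prior:
#   [P(R_obs | G=1) * eps] / [P(R_obs | G=0) * (1 - eps)]
log_odds = (0.0 + log10(eps)) - (log_p_string + log10(1 - eps))
posterior_rigged = 1.0 / (1.0 + 10 ** (-log_odds))

# The two priors over (G, R) agree everywhere except on the G = 1 slice,
# which carries mass eps, so their total-variation distance is at most eps.
print(f"rigged posterior P(G=1 | R_obs): {posterior_rigged:.15f}")
print(f"honest posterior P(G=1 | R_obs): {eps}")
print(f"TV distance between priors:     <= {eps}")
```

The rigged prior moves the posterior probability of G = 1 from eps to essentially 1, while sitting within total-variation distance eps of the honest prior: the rigging is invisible to the closeness measure precisely because it lives entirely on a slice of tiny prior mass.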
For those wanting to see the proofs of the authors’ theorems, they are in this other paper of theirs.
The authors write here:
but go on to argue that Bayesian reasoning should always be subservient to non-Bayesian reasoning.