Can someone help me understand the point being made in this response? http://normaldeviate.wordpress.com/2012/11/09/anti-xkcd/

The point depends on differences between confidence intervals and credible intervals.
Roughly, frequentist confidence intervals, but not Bayesian credible intervals, have the following coverage guarantee: if you repeat the sampling and analysis procedure over and over, in the long-run, the confidence intervals produced cover the truth some percentage of the time corresponding to the confidence level. If I set a 95% confidence level, then in the limit, 95% of the intervals I generate will cover the truth.
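As an illustration (nothing from Wasserman’s post; just a minimal simulation sketch with made-up parameters), here is what that long-run guarantee looks like for the textbook interval for a normal mean with known standard deviation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up setup: normal population with known sigma; 95% interval for the mean.
true_mean, sigma, n, trials = 10.0, 2.0, 25, 100_000
z = 1.96  # two-sided 95% critical value of the standard normal

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sigma, size=n)
    half_width = z * sigma / np.sqrt(n)          # known-sigma interval
    lo, hi = sample.mean() - half_width, sample.mean() + half_width
    covered += (lo <= true_mean <= hi)

print(covered / trials)  # hovers around 0.95 as trials grows
```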
Bayesian credible intervals, on the other hand, tell us what we believe (or should believe) the truth is given the data. A 95% credible interval contains 95% of the probability in the posterior distribution (and usually is centered around a point estimate). As Gelman points out, Bayesians can also get a kind of frequentist-style coverage by averaging over the prior. But in Wasserman’s cartoon, the target is a hard-core personalist who thinks that probabilities just are degrees of belief. No averaging is done, because the credible intervals are just supposed to represent the beliefs of that particular individual. In such a case, we have no guarantee that the credible interval covers the truth even occasionally, even in the long-run.
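For contrast, a minimal sketch (again with invented numbers, assuming a flat Beta(1, 1) prior and SciPy available) of what a central 95% credible interval is mechanically: the middle 95% of the posterior for a coin’s bias after some hypothetical flips.

```python
from scipy.stats import beta

# Hypothetical data: 7 heads in 20 flips; flat Beta(1, 1) prior on the bias.
heads, flips = 7, 20
posterior = beta(1 + heads, 1 + flips - heads)

# Central 95% credible interval: the 2.5th and 97.5th posterior percentiles.
lo, hi = posterior.ppf([0.025, 0.975])
print(lo, hi)  # 95% of the posterior probability sits between lo and hi
```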
Take a look here for several good explanations of the difference between confidence intervals and credible intervals that are much more detailed than my comment here.
Roughly, frequentist confidence intervals, but not Bayesian credible intervals, have the following coverage guarantee: if you repeat the sampling and analysis procedure over and over, in the long-run, the confidence intervals produced cover the truth some percentage of the time corresponding to the confidence level. If I set a 95% confidence level, then in the limit, 95% of the intervals I generate will cover the truth.
Right. This is what my comment there was pointing out: in his very own example, physics, 95% CIs do not get you 95% coverage; when we look at particle physics’s 95% CIs, they turn out to be too narrow. Just like his Bayesian’s 95% credible intervals. So what’s the point?
I suspect you’re talking past one another, but maybe I’m missing something. I skimmed the paper you linked and intend to come back to it in a few weeks, when I am less busy, but based on skimming, I would expect the frequentist to say something like, “You’re showing me a finite collection of 95% confidence intervals for which it is not the case that 95% of them cover the truth, but the claim is that in the long run, 95% of them will cover the truth. And the claim about the long run is a mathematical fact.”
I can see having worries that this doesn’t tell us anything about how confidence intervals perform in the short run. But that doesn’t invalidate the point Wasserman is making, does it? (Serious question: I’m not sure I understand your point, but I would like to.)
Well, I’ll put it this way: if we take as our null hypothesis ‘these 95% CIs really did have 95% coverage’, would the observed coverage rate have p<0.05? If it did, would you or he resort to ‘No True Scotsman’ again?
(A hint as to the answer: just a few non-coverages drive the null down to extremely low levels—think about multiplying 0.05 by 0.05...)
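To make that hint concrete, here is a rough sketch of the calculation with invented counts, assuming the intervals miss independently so that the number of misses is binomial under the null of true 95% coverage:

```python
from scipy.stats import binom

# Invented record: 20 published "95%" intervals, 4 of which missed the
# later-accepted value. Under the null (true coverage 0.95, independent
# misses), how surprising is at least that many misses?
n_intervals, misses = 20, 4
p_value = binom.sf(misses - 1, n_intervals, 0.05)  # P(#misses >= 4)
print(p_value)  # roughly 0.016, under the usual 0.05 cutoff
```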
Yeah, I still think you’re talking past one another. Wasserman’s point is that something being a 95% confidence interval deductively entails that it has the relevant kind of frequentist coverage. That can no more fail to be true than 2+2 can stop being 4. The null, then, ought to be simply that these are really 95% confidence intervals, and the data then tell against that null by undermining a logical consequence of the null. The data might be excellent evidence that these aren’t 95% confidence intervals. Of course, figuring out exactly why they aren’t is another matter. Did the physicists screw up? Were their sampling assumptions wrong? I would guess that there is a failure of independence somewhere in the example, but again, I haven’t read the paper carefully or really looked at the data.
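As a sketch of how a failure of independence alone could do this (a toy simulation with an invented AR(1) data-generating process, not the physics data): applying the i.i.d. textbook interval to positively correlated observations gives intervals that are too narrow, and the realized coverage falls well below the nominal 95%.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: positively correlated AR(1) observations, all with the same mean,
# analyzed with the naive i.i.d. 95% interval for the mean.
true_mean, n, trials, rho = 0.0, 50, 10_000, 0.7

covered = 0
for _ in range(trials):
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal()    # correlated noise around true_mean
    half_width = 1.96 * x.std(ddof=1) / np.sqrt(n)   # ignores the correlation
    covered += (x.mean() - half_width <= true_mean <= x.mean() + half_width)

print(covered / trials)  # well below 0.95: the intervals are too narrow
```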
Anyway, I still don’t see what’s wrong with Wasserman’s reply. If they don’t have 95% coverage, then they aren’t 95% confidence intervals.
So, is your point that we often don’t know when a purportedly 95% confidence interval really is one? Or that we don’t know when the assumptions are satisfied for using confidence intervals? Those seem like reasonable complaints. I wonder what Wasserman would have to say about those objections.
So, is your point that we often don’t know when a purportedly 95% confidence interval really is one?
I’m saying that this stuff about 95% CIs is a completely empty and broken promise; if we see the coverage blown routinely, as we do in particle physics in this specific case, the CI is completely useless. It didn’t deliver what was deductively promised. It’s like having a Ouija board which is guaranteed to be right 95% of the time, but oh wait, it was right just 90% of the time, so I guess it wasn’t really a Ouija board after all.
Even if we had this chimerical ‘95% confidence interval’, we could never know that it was a genuine 95% confidence interval. I am reminded of Borges:
It is universally admitted that the unicorn is a supernatural being of good omen; such is declared in all the odes, annals, biographies of illustrious men and other texts whose authority is unquestionable. Even children and village women know that the unicorn constitutes a favorable presage. But this animal does not figure among the domestic beasts, it is not always easy to find, it does not lend itself to classification. It is not like the horse or the bull, the wolf or the deer. In such conditions, we could be face to face with a unicorn and not know for certain what it was. We know that such and such an animal with a mane is a horse and that such and such an animal with horns is a bull. But we do not know what the unicorn is like.
It is universally admitted that the 95% confidence interval is a result of good coverage; such is declared in all the papers, textbooks, biographies of illustrious statisticians and other texts whose authority is unquestionable...
(Given that “95% CIs” are not 95% CIs, I will content myself with honest credible intervals, which at least are what they pretend to be.)