Now to the details of the paper. Based on the word “empirical” in the title, I thought the authors were going to look at a large number of papers with p-values and then follow up and see if the claims were replicated. But no, they don’t follow up on the studies at all! What they seem to be doing is collecting a set of published p-values and then fitting a mixture model to this distribution, a mixture of a uniform distribution (for null effects) and a beta distribution (for non-null effects). Since only statistically significant p-values are typically reported, they fit their model restricted to p-values less than 0.05. But this all assumes that the p-values have this stated distribution. You don’t have to be Uri Simonsohn to know that there’s a lot of p-hacking going on. Also, as noted above, the problem isn’t really effects that are exactly zero, the problem is that a lot of effects are lost in the noise and are essentially undetectable given the way they are studied. ... So, no, I don’t at all believe Jager and Leek when they write, “we are able to empirically estimate the rate of false positives in the medical literature and trends in false positive rates over time.” They’re doing this by basically assuming the model that is being questioned, the textbook model in which effects are pure and in which there is no p-hacking.
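To make the kind of model under discussion concrete, here is a minimal Python sketch of a uniform-plus-beta mixture fitted only to p-values at or below 0.05. It is an illustration under stated assumptions, not the paper's actual procedure: the alternative is taken to be Beta(a, 1) purely because that gives closed-form EM updates, the data are simulated, and the names `simulate`, `fit_em` and `log_lik` are made up for this example. Note that the simulated p-values follow the assumed mixture exactly, which is precisely the assumption the criticism above is aimed at.

```python
import math
import random

T = 0.05  # significance threshold; only p-values at or below it are kept


def simulate(n, pi0, a, rng):
    """Draw n p-values from the truncated mixture (illustrative data only)."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids an exact zero
        if rng.random() < pi0:
            out.append(T * u)                # null: uniform on (0, T]
        else:
            out.append(T * u ** (1.0 / a))   # alternative: Beta(a, 1) truncated to (0, T]
    return out


def log_lik(pvals, pi0, a):
    """Log-likelihood of the truncated mixture, with q = p / T in (0, 1]."""
    return sum(
        math.log((pi0 + (1.0 - pi0) * a * (p / T) ** (a - 1.0)) / T)
        for p in pvals
    )


def fit_em(pvals, pi0=0.5, a=0.5, iters=400):
    """EM for the mixture pi0 * Uniform + (1 - pi0) * Beta(a, 1) on (0, T].

    pi0 is the share of null effects among the significant p-values.
    Assumption: the second beta parameter is fixed at 1 (a simplification).
    """
    logq = [math.log(p / T) for p in pvals]
    n = len(logq)
    for _ in range(iters):
        sw = 0.0   # sum of posterior weights for the alternative component
        swl = 0.0  # sum of weight * log(q)
        for lq in logq:
            f1 = (1.0 - pi0) * a * math.exp((a - 1.0) * lq)
            w = f1 / (pi0 + f1)  # E-step: P(non-null | p)
            sw += w
            swl += w * lq
        pi0 = 1.0 - sw / n       # M-step for the null share
        a = -sw / swl            # M-step for the beta shape (closed form)
    return pi0, a


if __name__ == "__main__":
    rng = random.Random(0)
    data = simulate(5000, pi0=0.5, a=0.3, rng=rng)
    pi0_hat, a_hat = fit_em(data)
    print("estimated null share:", round(pi0_hat, 3), "shape a:", round(a_hat, 3))
```

The fit recovers the null share only because the simulated data obey the model. If some p-values were pushed just under 0.05 by selective analysis, or if many effects were nonzero but tiny, the same code would still return a confident-looking number.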
One of the authors replies in the comments:
That being said, our paper is a direct response to the original work, which defined “correct” and “incorrect” in the medical literature by the truth of the null hypothesis. We totally agree that that is a very debatable definition of correct. However, we felt it was important to point out that when using that definition you can actually estimate the rate of false discoveries with principled methods. These methods are well justified in the statistical literature and we took pains to point out our assumptions in both the paper and the supplemental material. Whether you agree with those assumptions is, of course, a totally reasonable thing to talk about.
Gelman’s comments: