In eight out of nine studies, Bem reported evidence in favor of precognition. As we have argued above, this evidence may well be illusory; in several experiments it is evident that Bem’s Exploration Method should have resulted in a correction of the statistical results. Also, we have provided an alternative, Bayesian reanalysis of Bem’s experiments; this alternative analysis demonstrated that the statistical evidence was, if anything, slightly in favor of the null hypothesis. One can argue about the relative merits of classical t-tests versus Bayesian t-tests, but this is not our goal; instead, we want to point out that the two tests yield very different conclusions, something that casts doubt on the conclusiveness of the statistical findings.
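To make the point about the two tests disagreeing concrete, here is a minimal sketch on a simpler problem than the one in the paper. It is not the Bayesian t-test the authors used: it swaps in a binomial hit-rate test with a uniform prior on the hit rate under the alternative. The counts (`n_trials`, `n_hits`) are invented for illustration and are not taken from Bem's data.

```python
from math import comb


def one_sided_p_value(n_trials, n_hits):
    """Exact P(X >= n_hits) when X ~ Binomial(n_trials, 0.5), i.e. pure chance."""
    tail = sum(comb(n_trials, i) for i in range(n_hits, n_trials + 1))
    return tail / 2 ** n_trials


def bayes_factor_null(n_trials, n_hits):
    """BF01: evidence for chance (hit rate 0.5) relative to an alternative
    that puts a uniform prior on the hit rate (an assumption of this sketch).

    Under a uniform prior every hit count from 0 to n_trials is equally
    likely, so the marginal likelihood is 1 / (n_trials + 1).
    """
    likelihood_null = comb(n_trials, n_hits) / 2 ** n_trials
    marginal_alt = 1 / (n_trials + 1)
    return likelihood_null / marginal_alt


# Hypothetical counts, chosen only to illustrate the disagreement.
n_trials, n_hits = 1000, 531
p = one_sided_p_value(n_trials, n_hits)
bf01 = bayes_factor_null(n_trials, n_hits)
print(f"one-sided p = {p:.4f}, BF01 = {bf01:.2f}")
```

With these made-up counts the p-value falls below 0.05 while BF01 comes out above 1, so the same data look "significant" classically yet lean toward the null under this particular prior. A different prior would give a different Bayes factor.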
Having read about a third of Bem’s paper, and about a third of the critical review mentioned here, I have to agree that the critics are right. This is an exploratory study rather than a confirmatory one, and as such it would require a much higher standard of statistical significance before anyone at all skeptical would be forced to rethink their stance.
To answer the OP’s question, I would want p<0.002 before I would say “It is probably either fraud or real ESP, rather than a statistical fluke.”
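A rough sketch of why an exploratory study calls for a stricter cutoff: if a researcher is free to try several analyses and report the one that works, the chance of at least one fluke below the nominal threshold grows quickly. The code assumes `k` independent tests, and `k` is a hypothetical parameter; nothing here says how many analyses Bem actually tried or how the 0.002 figure was chosen.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive among k independent tests,
    each run at significance level alpha, when every null is true."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_threshold(alpha, k):
    """Per-test cutoff that keeps the family-wise rate at or below alpha."""
    return alpha / k


def sidak_threshold(alpha, k):
    """Per-test cutoff that gives a family-wise rate of exactly alpha
    when the k tests are independent."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


# k is hypothetical: the number of analyses an explorer might have tried.
for k in (1, 5, 10, 25):
    print(
        f"k={k:2d}  P(at least one fluke at 0.05) = {familywise_error_rate(0.05, k):.3f}  "
        f"Bonferroni cutoff = {bonferroni_threshold(0.05, k):.4f}  "
        f"Sidak cutoff = {sidak_threshold(0.05, k):.4f}"
    )
```

For instance, holding the overall rate at 0.05 across 25 analyses gives a Bonferroni cutoff of 0.05 / 25 = 0.002 per test.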
To be fair, though, Bem was quite upfront about the exploratory nature of his methodology. His purpose, he claimed, was to invent experimental protocols that would be easy to carry out and easy to analyze. He is making the software that he used to control the experiment publicly available, and is apparently hoping that researchers in dozens of psych labs around the country will attempt to replicate his findings. If anyone takes him up on that, those studies will be confirmatory, not exploratory. And if enough of them can duplicate his results, even using Bem’s statistical methods, then his results will be worth thinking about.
Paper criticizing the statistical analysis here:
http://www.ruudwetzels.com/articles/Wagenmakersetal_subm.pdf
From the conclusion (quoted above):
I think that paper conclusively shows that Bem’s methods are incorrect; even if it doesn’t, it was a really interesting read.