The methods Bem used in his experiments have also been viewed as controversial. According to standard statistical methodology, Bem incorrectly reported one-sided p values when he should have used two-sided p values.[17] This could account for the marginally significant results his experiments produced. A rebuttal to the Wagenmakers et al. critique by Bem and two statisticians was subsequently published in the Journal of Personality and Social Psychology.[18]
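To make the objection in [17] concrete, here is a minimal sketch of how the choice of test changes the result, using a made-up binomial hit-rate experiment (the counts below are illustrative, not Bem's data):

```python
from scipy.stats import binomtest

# Hypothetical experiment: 'hits' successes out of 'trials' binary
# guesses, with chance rate 0.5. Counts are illustrative, not Bem's.
hits, trials = 527, 1000

# One-sided test: only above-chance deviations count as evidence.
p_one = binomtest(hits, trials, p=0.5, alternative="greater").pvalue

# Two-sided test: deviations in either direction count, roughly
# doubling the p value for a symmetric null like this one.
p_two = binomtest(hits, trials, p=0.5, alternative="two-sided").pvalue

print(f"one-sided p = {p_one:.3f}")   # ~0.047: "significant" at 0.05
print(f"two-sided p = {p_two:.3f}")   # ~0.094: no longer significant
```

Near the 0.05 threshold the two-sided p value is roughly double the one-sided one, which is exactly the regime where "marginally significant" results flip to non-significant.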
When I have seen back-and-forths like this, it's always been the pro-psi parapsychologists who understood the statistics better. Do you, or does anyone, know if that isn't true in this case?
One common error made by skeptics is to say that the low prior on psi means that, after a Bayesian correction, any individual experiment or paper isn't enough to drive belief in psi, so it is "not scientific evidence." That's an overstatement: if psi were real, one could combine the odds ratios of multiple experiments (insofar as they were honest and independent) and overcome that prior, so the individual pieces would have to be published to accumulate that evidence. The "not scientific evidence" framing is partly scientific etiquette; the real reason one can't aggregate studies like that is that there are systematic errors, bias, and fraud. Given the sheer extent of those observed in the record, it's very hard for a set of experiments to provide decisive evidence without some extraordinary evidence supporting their quality and honesty. Jaynes has an impressively thorough discussion of these issues in his probability textbook. The linked paper critiquing Bem didn't make that error.
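A minimal sketch of that aggregation argument, with entirely made-up numbers for the prior odds and the per-experiment Bayes factor:

```python
# Sketch of the aggregation argument: a very low prior on psi can in
# principle be overcome by multiplying together the Bayes factors of
# many honest, independent experiments. All numbers are made up.
prior_odds = 1e-8        # assumed prior odds in favor of psi
bayes_factor = 3.0       # assumed evidence ratio per experiment

posterior_odds = prior_odds
for n in range(1, 31):
    posterior_odds *= bayes_factor
    if posterior_odds > 1.0:
        print(f"after {n} experiments the posterior odds favor psi "
              f"({posterior_odds:.2f})")
        break
```

The multiplication step is exactly what systematic error, bias, and fraud invalidate: correlated flaws across experiments don't cancel the way honest, independent noise does.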
Looking at the exchange you mention, here are links to the continuation:

http://dbem.ws/ResponsetoWagenmakers.pdf
http://www.ruudwetzels.com/articles/ClarificationsForBemUttsJohnson.pdf
Part of the argument was over whether to expect effect sizes to be minuscule if psi exists (Bem argues that existing research has already disconfirmed big psi effects, so the penalty for that should be incorporated into our beliefs prior to his experiments rather than into the odds ratio stemming from them). The rest was over whether Bem engaged in data-mining. Bem denies it, but he has also written guides for students advocating intensive data-mining, and there are various suspicious elements in the paper that suggest it.
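A sketch of how the effect-size prior drives the Bayes factor, which is the crux of Bem's argument here (the data and both priors below are hypothetical, not the ones used by either side):

```python
import numpy as np
from scipy.stats import binom

# Hypothetical data: 527 hits in 1000 trials; chance hit rate 0.5.
hits, trials = 527, 1000

def marginal_likelihood(lo, hi, grid=10_000):
    """Binomial likelihood averaged over a uniform prior on the hit rate."""
    theta = np.linspace(lo, hi, grid)
    return binom.pmf(hits, trials, theta).mean()

like_h0 = binom.pmf(hits, trials, 0.5)

# Wide prior: psi could produce any above-chance hit rate at all.
bf_wide = marginal_likelihood(0.5, 1.0) / like_h0
# Narrow prior: earlier research has ruled out big effects, so psi,
# if real, gives at most a small edge over chance.
bf_narrow = marginal_likelihood(0.5, 0.55) / like_h0

# The same data can disfavor psi under the wide prior (BF < 1)
# while favoring it under the narrow one (BF > 1).
print(f"Bayes factor, wide prior:   {bf_wide:.2f}")
print(f"Bayes factor, narrow prior: {bf_narrow:.2f}")
```

This is why the choice of prior on effect size, and not just the data, decides whose Bayes factor looks impressive.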
Both sides here seem to understand the statistics under discussion well enough; the back-and-forth is about psi and Bem's methods or honesty, i.e. flaws in the experimental design, data mining, deception, or luck/file-drawer/publication-bias effects. Failures to replicate would indicate one or more of those (the replications themselves will have to be checked for systematic flaws, of course).
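A small simulation of the file-drawer effect mentioned above: with no real effect at all, publishing only the significant one-sided results makes the published record look like consistent evidence for an above-chance hit rate (all parameters are illustrative):

```python
import numpy as np
from scipy.stats import binomtest

# File-drawer simulation: no psi (true hit rate 0.5), many labs run
# the same experiment, and only "significant" results get published.
rng = np.random.default_rng(0)
n_labs, trials = 200, 100

published = []
for _ in range(n_labs):
    hits = rng.binomial(trials, 0.5)
    p = binomtest(hits, trials, p=0.5, alternative="greater").pvalue
    if p < 0.05:
        published.append(hits / trials)

print(f"{len(published)} of {n_labs} null experiments were 'significant'")
print(f"mean published hit rate: {np.mean(published):.3f} (true rate 0.5)")
```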
(Will continue the general discussion soon. Just airing out my brain a bit.)
Previously I claimed: "When I have seen back-and-forths like this, it's always been the pro-psi parapsychologists who understood the statistics better."
I’m thinking of, say, four or five times when I looked into it. I was wondering, does your experience agree with mine, or disagree?
Also, follow the links to the two blog posts mentioned at the bottom of this.