What sort of p-values would you need to see in that paper in order to believe with, say, 50% probability that the effect measured is real?
P-values won’t do it. Psi experiments consistently report impressive-sounding significance because the number of trials is so large. Mass replication is what will make me believe.
That critique doesn’t really work for t-tests though, does it? Sure, as n increases so does your chance that the finding is statistically significant, but a larger n also reduces the chance of the data being a fluke. If you flip a fair coin a million times holding a banana in your left hand and it comes up heads 55% of the time… there’s some explaining to do.
Even if the explanation is that it wasn’t a fair coin.
Failures to set up or follow proper experimental procedures (giving hints, not fully random presentation, etc.), or anything else that introduces a slight bias, will produce a puny effect. With low n, this won’t be statistically significant, but with high n it will appear very statistically significant.
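To put rough numbers on that, here is a minimal sketch. It assumes a made-up 51% hit rate where chance is 50% (the 51% figure and the function name are mine, purely for illustration) and uses the normal approximation to the binomial rather than a t-test:

```python
import math

def two_sided_p(hits: int, n: int, p0: float = 0.5) -> float:
    """Two-sided p-value for a hit rate, via the normal approximation."""
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error under the null
    z = (hits / n - p0) / se               # how many SEs away from chance
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical bias: hits land at 51% instead of 50%, at every sample size.
for n in (100, 10_000, 1_000_000):
    hits = n * 51 // 100
    print(f"n={n:>9,}  hit rate=0.51  p={two_sided_p(hits, n):.3g}")
```

The bias is identical in all three rows. Only n changes, and the p-value goes from nowhere near significant to astronomically small.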
That’s true, statistical significance isn’t the most sophisticated statistic. My rule of thumb is to look at both the p-value and the d value (the effect size).
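A rough sketch of why d helps, assuming "d" here means Cohen's d and treating each trial as a 0/1 hit compared against a 50% chance rate (the 51% hit rate and the function name are made up for illustration):

```python
import math

def cohens_d_one_sample(hits: int, n: int, p0: float = 0.5) -> float:
    """Cohen's d for 0/1 outcomes against a fixed chance rate p0."""
    p_hat = hits / n
    sd = math.sqrt(p_hat * (1 - p_hat))    # SD of the 0/1 outcomes
    return (p_hat - p0) / sd               # mean difference in SD units

# Same hypothetical 51% hit rate at two very different sample sizes.
for n in (100, 1_000_000):
    hits = n * 51 // 100
    print(f"n={n:>9,}  d={cohens_d_one_sample(hits, n):.4f}")
```

Unlike p, d doesn't shrink or grow with n. It stays at about 0.02 here, a tenth of the 0.2 that is conventionally called a "small" effect, so a tiny d next to a tiny p is a hint that the result may be a slight bias measured very precisely.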