Sorry, ambiguous wording. p = 0.05 is too weak a threshold and should be replaced with, say, p = 0.005. It would be a better scientific investment to do fewer studies with twice as many subjects and have nearly all the reported results be replicable.
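A back-of-envelope check of the "twice as many subjects" intuition, for a two-sided z-test at fixed power (the 80% power figure and the z-test framing are my assumptions, not from the comment): required n scales with (z_{α/2} + z_β)², so the stdlib can estimate how much larger a study must be when α drops from 0.05 to 0.005.

```python
# Sketch (stdlib only, assumed two-sided z-test at 80% power):
# how much does required sample size grow when alpha drops
# from 0.05 to 0.005?  n is proportional to (z_{alpha/2} + z_beta)^2.
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard-normal quantile function

def n_ratio(alpha_new, alpha_old, power=0.80):
    """Ratio of required sample sizes for the same effect size and power."""
    z_beta = z(power)
    return ((z(1 - alpha_new / 2) + z_beta) /
            (z(1 - alpha_old / 2) + z_beta)) ** 2

print(round(n_ratio(0.005, 0.05), 2))
```

Under these assumptions the ratio comes out around 1.7, so "twice as many subjects" would indeed cover the stricter threshold with power to spare.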
I’d prefer two studies at p < 0.05 on the same claim by different scientists to one study at p < 0.005. Proving the replicability of scientific studies by actually replicating them is better than going for an even lower p value.
Experiments that are literal replications of previously published experiments are very seldom published—I do not believe I have ever seen one. Others who have done systematic searches for examples of them confirm that they are rare (Mahoney, 1976; Sterling, 1959). … PhD committees generally expect more from dissertations than the replication of someone else’s findings. Evidence suggests that manuscripts that report only replication experiments are likely to get negative reactions from journal reviewers and editors alike (Neuliep & Crandall, 1990, 1993).
Agreed. It’s much easier for a false effect to garner two ‘statistically significant’ studies with p < .05 than to gain one statistically significant study with p < .005 (though you really want p < .0001).
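The claim above hinges on publication bias: with exactly two attempts and no file drawer, two independent null studies both clear p < .05 only with probability .05² = .0025, which is *less* likely than one clearing p < .005. The direction flips once there is a file drawer of unpublished attempts. A minimal sketch, assuming a hypothetical drawer of k = 20 null studies (the number is illustrative, not from the thread):

```python
# Sketch of the publication-bias worry (k = 20 null studies is an
# assumed, illustrative file drawer): under the null, each study
# clears p < alpha with probability alpha.  Compare P(at least two
# studies reach p < .05) with P(at least one reaches p < .005).
from math import comb

def p_at_least(m, k, alpha):
    """P(at least m of k independent null studies reach p < alpha)."""
    return sum(comb(k, i) * alpha**i * (1 - alpha)**(k - i)
               for i in range(m, k + 1))

k = 20
two_weak   = p_at_least(2, k, 0.05)    # roughly 0.26
one_strong = p_at_least(1, k, 0.005)   # roughly 0.10
print(two_weak > one_strong)
```

Under this assumption a false effect picks up two p < .05 results far more easily than one p < .005 result, which is exactly why "two studies" is only reassuring when the denominator of attempted studies is visible.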
I wouldn’t. Two studies opens the door to publication bias concerns and muddles the ‘replication’: rarely do people do a straight replication.
If you put the general significance standard at p < 0.005, you will decrease the number of straight replications even further. We need more straight replication, not less.
A single study can be wrong due to systematic bias. One researcher could engage in fraud and thereby get a p < 0.005 result. He could also simply be bad at blinding his subjects properly.
There are many possible ways to get a p < 0.005 result by messing up the underlying science in a way that you can’t see by reading the paper.
Having a second researcher reproduce the effect is vital for knowing that the first result is not due to some error in the experimental setup of the first study.
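A worked illustration of why no threshold protects against systematic bias (the 0.2σ bias and sample sizes are assumed numbers for illustration): a small constant bias — say, from imperfect blinding — pushes the p-value below any fixed cutoff once n is large enough, with no real effect present. Only an independent setup, which does not share the bias, can catch this.

```python
# Illustration (assumed numbers): a systematic bias of 0.2 sigma in the
# measurement drives a one-sided z-test p-value below any fixed
# threshold as n grows, even though the true effect is zero.
from math import sqrt
from statistics import NormalDist

def p_value(bias_sd, n):
    """One-sided p-value when the only 'effect' is a bias of bias_sd sigma."""
    return 1 - NormalDist().cdf(bias_sd * sqrt(n))

print(p_value(0.2, 200) < 0.005)  # a modest bias at n = 200 already clears .005
```

The point: tightening α from .05 to .005 merely changes which n suffices for a biased setup to "confirm" a null effect; replication in a second lab changes whether the bias is shared at all.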
(The Nickerson passage quoted above is from http://lesswrong.com/lw/g13/against_nhst/)