I recently read The Cult of Statistical Significance. I realize that it’s de rigeur to quote significance, but Ziliak and McCloskey insist that I ask what’s the hypothesized size of the effect?
If we run three conditions, and end up with 4, 5, and 6 people getting some improvement, and calculate statistical significance, we obfuscate the fact that the difference is in the noise. If the same tests end up with 2, 4 and 8 people improving according to some metric, then we have stronger reason to suspect something is going on. Size matters. It’s usually more interesting than statistical significance.
I recently read The Cult of Statistical Significance. I realize that it’s de rigeur to quote significance, but Ziliak and McCloskey insist that I ask what’s the hypothesized size of the effect?
If we run three conditions, and end up with 4, 5, and 6 people getting some improvement, and calculate statistical significance, we obfuscate the fact that the difference is in the noise. If the same tests end up with 2, 4 and 8 people improving according to some metric, then we have stronger reason to suspect something is going on. Size matters. It’s usually more interesting than statistical significance.