Yes, that’s a good point. I suggest that if you are testing many hypotheses, you use the false discovery rate (here’s the useful, original pdf, cited 10,000+ times).
As an example, let’s say that you test 6 hypotheses, corresponding to different features of your Zeo data. You could use a t-test for each, as above. Then aggregate and sort all the p-values in ascending order. Let’s say that they are 0.001, 0.013, 0.021, 0.030, 0.067, and 0.134.
Assume, arbitrarily, that you want the overall false discovery rate to be 0.05, which in this context is called the q-value. You would then sequentially test, from the last p-value down to the first, whether the current p-value is less than ((the current index × the false discovery rate) / the overall number of hypotheses). You stop at the first inequality that holds; that p-value and all smaller ones are called significant.
So in this example, you would stop at the fourth p-value, since 0.030 < ((4 × 0.05) / 6) ≈ 0.033, and the hypotheses corresponding to the first four p-values would be called significant.
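The step-down loop described above is short enough to sketch directly. Here is one possible translation into Python (my own illustration, not code from the thread); the function name and the `q` parameter are just labels I chose:

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure (a sketch).

    Returns the sorted p-values declared significant at FDR level q.
    """
    m = len(p_values)
    ranked = sorted(p_values)
    # Walk from the largest p-value down; stop at the first rank k where
    # p_(k) <= k * q / m, then call p_(1) .. p_(k) significant.
    for k in range(m, 0, -1):
        if ranked[k - 1] <= k * q / m:
            return ranked[:k]
    return []  # nothing survives

# The example from the thread: stops at the 4th p-value
# (0.030 <= 4 * 0.05 / 6), so the first four are significant.
print(benjamini_hochberg([0.001, 0.013, 0.021, 0.030, 0.067, 0.134]))
# -> [0.001, 0.013, 0.021, 0.03]
```

If you are already working in R, as in the sessions mentioned below, `p.adjust(p, method = "BH")` computes the corresponding adjusted p-values directly.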
Interesting procedure. I tried it out on my melatonin and one-legged standing data, putting the results in the same footnotes as the R sessions, and no surprise, nothing survives. (A little depressing, but it’s not like there were very many p-values in the 0.01 or lower range.)
EDIT: however, one result from my vitamin D experiment did survive the multiple-comparison correction!