I think I disagree with both of you here. The failure to reject a null hypothesis is a failure. It doesn’t allow or even encourage you to conclude anything.
Can you conclude that you failed to reject the null hypothesis? And if you expected to reject the null hypothesis, isn’t that failure meaningful? (Note that my language carefully included the confidence value.)
As a general comment, this is why Bayesian statistics is much more amenable to knowledge-generation than frequentist statistics. The statement “the hyperactivity increase in the experimental group was 0.36+/-2.00, and that range solidly includes 0” (with the variance of that estimate pulled out of thin air) is much more meaningful than “we can’t be sure it’s not zero.”
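To make the contrast concrete, here is a minimal sketch with made-up numbers (group sizes, means, and spreads are all hypothetical, not from any real study): the same data give you a bare "fail to reject" from a t-test, while the interval estimate tells you how large the effect could plausibly be.

```python
# Minimal sketch (hypothetical data): the same numbers, reported two ways.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(0.0, 3.0, size=20)     # made-up hyperactivity scores
treatment = rng.normal(0.36, 3.0, size=20)

# Frequentist hypothesis test: all you get back is "reject" or "fail to reject".
t, p = stats.ttest_ind(treatment, control)
print(f"p = {p:.2f}")                        # p > 0.05 -> "we can't be sure it's not zero"

# Interval estimate: an effect size together with its uncertainty.
diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / len(treatment) + control.var(ddof=1) / len(control))
lo, hi = diff - 1.96 * se, diff + 1.96 * se  # normal approximation for the 95% interval
print(f"estimated increase = {diff:.2f}, 95% interval [{lo:.2f}, {hi:.2f}]")
```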
I agree with the second sentence, and the first might be true, but the second isn’t evidence for the first; interval estimation vs. hypothesis testing is an independent issue to Bayesianism vs. frequentism. There are Bayesian hypothesis tests and frequentist interval estimates.
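For instance, you can run a Bayesian hypothesis test on exactly the kind of two-group data above. A rough sketch using the BIC approximation to the Bayes factor (the data and the model choices here are my own assumptions, not anything from the thread):

```python
# Rough sketch: a Bayesian hypothesis test via the BIC approximation to the
# Bayes factor (Wagenmakers, 2007). All data below are made up.
import numpy as np

def bf01_bic(control, treatment):
    """Approximate Bayes factor in favor of 'no group difference' (the null)."""
    y = np.concatenate([control, treatment])
    n = len(y)
    rss_null = np.sum((y - y.mean()) ** 2)                    # one shared mean
    rss_alt = (np.sum((control - control.mean()) ** 2) +
               np.sum((treatment - treatment.mean()) ** 2))   # separate group means
    bic_null = n * np.log(rss_null / n) + 1 * np.log(n)
    bic_alt = n * np.log(rss_alt / n) + 2 * np.log(n)
    return np.exp((bic_alt - bic_null) / 2)                   # BF_01

rng = np.random.default_rng(0)
control = rng.normal(0.0, 3.0, size=20)
treatment = rng.normal(0.36, 3.0, size=20)
print(f"BF in favor of the null: {bf01_bic(control, treatment):.1f}")
```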
Agreed that both have those tools, and rereading my comment I think “approach” may have been a more precise word than “statistics.” If you think in terms of “my results are certain, reality is uncertain” then the first tool you reach for is “let’s make an interval estimate / put a distribution on reality,” whereas if you think in terms of “reality is certain, my results are uncertain” then the first tool you reach for is hypothesis testing. Such defaults have very important effects on what actually gets used in studies.
And if you expected to reject the null hypothesis, isn’t that failure meaningful?
To me, but not to the theoretical foundations of the method employed.
Hypothesis testing generally works sensibly because people smuggle in intuitions that aren’t part of the foundations of the method. But since they’re only smuggling things in under a deficient theoretical framework, they’re given to mistakes, particularly when they’re applying their intuitions to the theoretical framework and not the base data.
I agree with the later comment on Bayesian statistics, and I’d go further. Scatterplot the labeled data, or show the distribution if you have tons of data. That’s generally much more productive than any particular confidence interval you might construct.
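Something like this sketch, say (the DataFrame, the column names, and the 2000-point cutoff are all arbitrary choices of mine):

```python
# Sketch: show the raw labeled data; switch to per-group histograms when there
# are too many points to scatter usefully.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def plot_labeled_data(df, label_col="group", value_col="score", max_points=2000):
    """Scatter the raw points per group, or histogram them if the data are large."""
    groups = df.groupby(label_col)[value_col]
    if len(df) <= max_points:
        for i, (name, vals) in enumerate(groups):
            plt.scatter(np.full(len(vals), i), vals, alpha=0.4, label=str(name))
        plt.xticks(range(df[label_col].nunique()), sorted(df[label_col].unique()))
    else:
        for name, vals in groups:
            plt.hist(vals, bins=50, alpha=0.4, density=True, label=str(name))
    plt.legend()
    plt.show()

# Made-up example data
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": ["control"] * 20 + ["treatment"] * 20,
    "score": np.concatenate([rng.normal(0, 3, 20), rng.normal(0.36, 3, 20)]),
})
plot_labeled_data(df)
```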
It would be an interesting generative study to compare the various statistical tests on the same hypothesis versus the human eyeball. I think the eyeball will hold its own.