You’re correct in a broader sense that passing the F-test under one set of assumptions is strong evidence that you’ll pass it under a similar set of assumptions. But papers such as this one use logic and math in order to say things precisely. What they claimed is supported by, and similar to, what they proved, but it isn’t the same thing, so it’s still an error. In the same way, 3.9 is close enough to 4 for most purposes, but it is still an error to say that 2 + 1.9 = 4.
The thing is, some such reasoning has to be done in any case to interpret the paper. Even if no logical mistake was made, the F-test can’t possibly disprove a hypothesis such as “the means of these two distributions are different”. There is always room for an epsilon difference in the means to be compatible with the data. A similar objection was stated elsewhere on this thread already:
The failure to reject a null hypothesis is a failure. It doesn’t allow or even encourage you to conclude anything.
And of course it’s legitimate to give up at this step and say “the null hypothesis has not been rejected, so we have nothing to say”. But if we don’t do this, then our only recourse is to say something like: “with 95% confidence, the difference in means is less than X”. In other words, we may be fairly certain that 2 + 1.9 is less than 5, and somewhat less certain that 2 + 1.9 is less than 4.
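To make the “difference in means is less than X” statement concrete, here is a rough sketch of how such an X could be computed. It is not anything from the paper: the function name `mean_diff_bound` and the sample data are made up for illustration, and it assumes a large-sample normal approximation with independent samples (for small samples a t-quantile would be more appropriate).

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_diff_bound(a, b, confidence=0.95):
    """Return (lo, hi, x) for the difference in means of samples a and b.

    (lo, hi) is a two-sided confidence interval for mean(a) - mean(b),
    built with a normal approximation (an assumption of this sketch).
    x is the larger endpoint in absolute value, so the interval lies
    inside [-x, x].
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference, using unpooled sample variances.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lo, hi = diff - z * se, diff + z * se
    return lo, hi, max(abs(lo), abs(hi))


# Hypothetical data, purely for illustration.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
lo, hi, x = mean_diff_bound(group_a, group_b)
print(f"95% CI for the difference in means: [{lo:.2f}, {hi:.2f}]")
print(f"so |difference| is bounded by about {x:.2f} at that confidence level")
```

Here the interval contains zero, so equal means would not be rejected, yet the output still says something: the difference is unlikely to be larger than x. This is closely related to what is usually called equivalence testing (for example the two one-sided tests procedure), although the sketch is not that procedure itself.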
Incidentally, is there some standard statistical test that produces this kind of output?