That critique doesn’t really work for t-tests though, does it? Sure, as n increases so does the chance that a finding comes out statistically significant, but a larger n also reduces the chance that the data are a fluke. If you flip a supposedly fair coin a million times while holding a banana in your left hand and it comes up heads 55% of the time… there’s some explaining to do.
Even if the explanation is that it wasn’t a fair coin.
Failing to set up or follow proper experimental procedures (giving hints, not fully randomizing presentation, etc.), or otherwise introducing a slight bias, will produce a real but puny effect. With low n it won’t be statistically significant, but with high n it will appear highly statistically significant.
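To put rough numbers on that, here is a minimal sketch assuming a coin whose observed heads rate sits at exactly 51% (a made-up figure standing in for a slight procedural bias) and a normal approximation to the binomial. The function name `two_sided_p` is just illustrative.

```python
import math

def two_sided_p(heads_fraction, n, null=0.5):
    """Two-sided p-value for an observed proportion against `null`,
    using the normal approximation to the binomial."""
    se = math.sqrt(null * (1 - null) / n)   # standard error under the null
    z = (heads_fraction - null) / se        # distance from the null in standard errors
    return math.erfc(abs(z) / math.sqrt(2)) # two-sided tail area of the standard normal

# Assumed for illustration: the same puny 1-point bias at three sample sizes.
biased = 0.51
for n in (100, 10_000, 1_000_000):
    print(f"n={n:>9,}  p={two_sided_p(biased, n):.3g}")
```

The bias is identical in all three rows; only n changes. At n=100 the p-value is nowhere near 0.05, at n=10,000 it just slips under, and at n=1,000,000 it is astronomically small.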
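That’s true, statistical significance isn’t the most sophisticated statistic. My rule of thumb is to look at both the p and d values: p tells you whether the effect is likely a fluke, d (effect size) tells you whether it’s big enough to care about.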
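A rough sketch of that rule of thumb, assuming "d" means Cohen's d (mean difference divided by the standard deviation) and using a large-sample normal approximation in place of a proper t distribution. The helper `p_and_d` and the input numbers are hypothetical, chosen only to illustrate the contrast.

```python
import math

def p_and_d(sample_mean, null_mean, sd, n):
    """Return (approximate two-sided p-value, Cohen's d) for a one-sample test.
    The p-value uses a normal approximation, so it is rough for small n."""
    d = (sample_mean - null_mean) / sd       # effect size in standard-deviation units
    z = d * math.sqrt(n)                     # test statistic grows with sqrt(n)
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided tail area
    return p, d

# Coin flips coded 0/1 have sd = 0.5 under the fair-coin null.
p_big_n, d_big_n = p_and_d(0.51, 0.5, 0.5, 1_000_000)   # puny bias, huge n
p_small_n, d_small_n = p_and_d(0.80, 0.5, 0.5, 20)      # big effect, small n

print(f"huge n : p={p_big_n:.3g}, d={d_big_n:.2f}")
print(f"small n: p={p_small_n:.3g}, d={d_small_n:.2f}")
```

Both cases come out "significant", but the first has d around 0.02, far below the 0.2 that Cohen's convention calls a small effect, while the second has d around 0.6. Reading p alone would not tell them apart.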