I agree that hard sciences are far less vulnerable to statistical pitfalls. However, I’d point at three factors other than data generation to explain it:
The hard sciences have theories that define specific, quantitative models, which makes it far easier to test the theories. Fitting a misspecified model is much less of a risk, and a model may make such a specific prediction that fewer data are needed to falsify it.
Signal-to-noise ratios are often much higher in the hard sciences. Where that’s the case, you generally don’t need such advanced statistics to analyse results, and you’re more likely to notice when you do the statistics incorrectly and get a wrong answer. And even if a model doesn’t truly fit the data, it may still explain the vast majority of the variation in the data; you can get an R² of 0.999 in physics, while if you get an R² of 0.999 in the social sciences it usually means you did something stupid in Excel or SPSS and accidentally regressed something against itself.
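As a toy illustration of that last failure mode (hypothetical data, just to show why a suspiciously high R² is a giveaway rather than a discovery):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=100)  # some noisy measurement

# Intended: regress y on a genuine predictor x.
# Accidental: regress y on a copy of itself (e.g. a spreadsheet copy-paste slip).
x = y  # the bug

# Ordinary least-squares fit of y on x, with an intercept
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r_squared = 1 - resid.var() / y.var()

print(r_squared)  # ≈ 1.0 to floating-point precision
```

In noisy social-science data, a legitimate regression of one variable on another rarely gets anywhere near that, which is why an R² of 0.999 there signals a bug rather than a result.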
In the hard sciences, one has a good chance of accounting for all of the important causes of an effect of interest. In the social sciences this is usually impossible; often one doesn’t even know the important causes of an effect, making it difficult to rule out confounding (unless one can sever unknown causal links via e.g. randomization).