The Voodoo paper starts by noting that social neuroscience papers regularly report correlations that are higher than the theoretical maximum allowed by the reliability of their measures.
A defense of social neuroscience that ignores the Voodoo paper's central charge, namely that the criticized papers report impossible (you could call them paranormal) results, is no good defense at all.
Whether this causes entirely spurious correlations to be reported depends on how many degrees of freedom the models have. If you have a dataset with 200 patients and 2,000 degrees of freedom in your mathematical model, the model can fit the data nearly perfectly even when there is no real effect. The neuroscience folks often use statistical techniques for which there's no mathematically sound method to assess the degrees of freedom. Frequently, they run simulated data through the model to eyeball the size of the problem, but there are no mathematical guarantees that this will catch every case where the degrees of freedom are too high.
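To see how badly this can go, here's a minimal R sketch (not from any of the papers in question, just an illustration): with 2,000 pure-noise predictors and 200 observations, an ordinary linear model fits the data perfectly even though there is no signal at all.

```r
set.seed(1)
n <- 200   # patients
p <- 2000  # model degrees of freedom

X <- matrix(rnorm(n * p), nrow = n)  # predictors: pure noise
y <- rnorm(n)                        # outcome: unrelated to X

fit <- lm(y ~ X)          # rank-deficient fit; most coefficients come back NA
summary(fit)$r.squared    # ~1: a "perfect" correlation with nothing but noise
```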
Even if you grant that the results are only modestly overstated, that's no excuse. Scientists are generally not expected to modestly overstate their results; they are supposed to remove the systematic effects that make them overstate their results.
Even if you think there's some value in predicting training data, they could still run a second test: split the data into a training pile and an evaluation pile, fit the model on the training pile, and report how well it predicts the evaluation pile. It's not much work, since they don't need to create a new model. It's about four lines of R (maybe even fewer if you write it concisely), as the sketch below shows.
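For illustration, here's roughly what that split looks like in R. The data frame `d`, the outcome column `score`, and the use of plain `lm` are all placeholders for whatever dataset and model the original analysis used:

```r
idx   <- sample(nrow(d), floor(nrow(d) / 2))     # random half for training
train <- d[idx, ]
test  <- d[-idx, ]
model <- lm(score ~ ., data = train)             # refit on the training half only
cor(predict(model, newdata = test), test$score)  # honest out-of-sample correlation
```

The reported number is then the correlation on data the model never saw, which is immune to the overfitting problem above.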