It seems to me quite a remarkable claim that they actually “hypothesized that 5-HTTLPR genotype would interact with intentionality in respondents who generated moral judgments”. Is that realistic, or should we believe that they probably data-dredged this one up? Either way I don’t pay much attention to anything found at p<.05.
It is very realistic. From the introduction:

Recent research demonstrates that participants’ willingness to endorse utilitarian actions that require personally harming an innocent victim can be affected by variables that influence brain functioning, such as lesions of the ventromedial prefrontal cortex and pharmacological challenges [6], [7]. For example, respondents who receive a selective serotonin reuptake inhibitor (citalopram) are less likely to endorse utilitarian outcomes that result in harm to an innocent victim [7]. This may be because serotonin enhances the aversive emotional response to causing others harm, perhaps through its influence on brain structures like the amygdala, insula, and ventromedial prefrontal cortex, which are implicated in moral judgments and behavior [6], [8].
Endogenous serotonin neurotransmission is influenced by a functional 5′ promoter polymorphism of the serotonin transporter (5-HTT) in the human serotonin transporter gene SLC6A4, called 5-HTTLPR [9]. Relative to carriers of the long (L) form of the polymorphism, carriers of the short (S) form show reduced transcription, expression and function of 5-HTT, which influences the reuptake of serotonin from the synaptic cleft [10]. S-carriers are also more emotionally reactive to aversive stimuli than are L-carriers [11]. This difference may reflect S-carriers’ increased activation in subcortical structures like the amygdala that are associated with negative affect and/or reduced prefrontal modulation of these structures by the prefrontal cortex [11].
This lab also has several previous studies dealing with both trolley-like problems and 5-HTTLPR.
If they have previous studies specifically addressing the relationship between 5-HTTLPR and utilitarian calculations, for which I’ll take your word, that does indeed increase my degree of belief that they formed that specific hypothesis and that one only before performing the experiment.
Sorry, that sentence was unclear: they have previous studies dealing with trolley-like problems, and previous studies dealing with 5-HTTLPR’s relation to things like fear recognition and the strength of perceived rewards and punishments. As far as I know, this is their first study dealing with the relationship between 5-HTTLPR and trolley-like problems.
I believe you. But suppose they were just looking for any genotype that might interact: given the zillions of genotypes, there is a good chance that some genotype will show an interaction purely by random chance.
If that’s the case, i.e. the interaction occurred by random chance, then you wouldn’t expect it to show up again in a second experiment. But is it really necessary to run two experiments? One could simply look for interactions in the first half of the observations and then check whether they are repeated in the second half. Even so there would still be some probability of false positives, but it would be smaller. And finally, I would expect any experimentalist working with such data to be aware of all of this, and to look for a signal whose false-positive probability, across all the genotypes tested, is below .05, so that it would all be factored into the statistical analysis.
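This is easy to see in a toy simulation. A minimal sketch, assuming numpy/scipy and entirely made-up numbers (1000 candidate genotypes, 200 subjects, and no real effect anywhere), nothing from the actual paper:

```python
# Toy demo of the multiple-comparisons worry and the split-half fix.
# Everything is simulated under the null: genotype has no effect at all.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_genotypes = 200, 1000  # hypothetical study size

judgments = rng.normal(size=n_subjects)                       # moral-judgment scores
genotypes = rng.integers(0, 2, size=(n_subjects, n_genotypes))  # 0 = L, 1 = S

def pvals(scores, geno):
    """Two-sample t-test of scores for carriers vs. non-carriers, per genotype."""
    return np.array([
        stats.ttest_ind(scores[geno[:, g] == 0], scores[geno[:, g] == 1]).pvalue
        for g in range(geno.shape[1])
    ])

p_full = pvals(judgments, genotypes)
print("genotypes 'significant' at p<.05 by pure chance:", (p_full < .05).sum())
# Expect ~50 out of 1000 -- dredging will always find something.

# Split-half check: screen on the first half, confirm on the second.
half = n_subjects // 2
p1 = pvals(judgments[:half], genotypes[:half])
hits = np.where(p1 < .05)[0]
p2 = pvals(judgments[half:], genotypes[half:])
print("hits surviving replication in the second half:", (p2[hits] < .05).sum())
# Expect ~5% of the ~50 screened hits, i.e. ~2-3 -- far fewer, but not zero.
```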
… so in the paper, is the significance of the interaction calculated before or after taking into account the chance that some random genotype would show p<.05 anyway?
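For reference, the usual after-the-fact corrections for having scanned m candidates look something like the sketch below; m = 1000 is a made-up number, not anything from the paper:

```python
# Sketch: nominal vs. multiplicity-corrected significance thresholds.
m = 1000                         # hypothetical number of genotypes scanned
alpha = 0.05
bonferroni = alpha / m           # per-test threshold controlling family-wise error
sidak = 1 - (1 - alpha) ** (1 / m)  # exact version assuming independent tests
print(f"per-test threshold: Bonferroni {bonferroni:.2e}, Sidak {sidak:.2e}")
# A genotype 'significant' at a nominal p = .03 would be nowhere near
# significant once the scan over 1000 candidates is factored in.
```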
As I understand them, proper scientific standards demand that researchers pick a hypothesis in advance and then test it using pre-chosen statistical methods; ideally they then report their results by publishing a paper whether or not the null hypothesis was rejected at the chosen significance level. Yudkowsky has suggested that experimenters be asked to publish their papers before conducting their experiments, to ensure that this good practice is upheld (and also that Bayesian likelihood ratios be published instead of frequentist statistics).
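The likelihood-ratio idea is simple to illustrate. A toy sketch with made-up numbers (62 “utilitarian” responses out of 100, and two point hypotheses about the underlying rate), not anything from the paper:

```python
# Toy likelihood ratio (Bayes factor between two point hypotheses).
from scipy import stats

k, n = 62, 100            # hypothetical: 62 'utilitarian' responses out of 100
p_null, p_alt = 0.5, 0.6  # two specific hypotheses about the response rate

lr = stats.binom.pmf(k, n, p_alt) / stats.binom.pmf(k, n, p_null)
print(f"likelihood ratio (alt vs. null): {lr:.1f}")
# ~17: the data are about 17x more probable under p=0.6 than under p=0.5.
# Unlike a p-value, this number doesn't change if you also tested 999 other
# hypotheses -- though your prior over those hypotheses should.
```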
The (relatively) benign problem that is often seen is the “file drawer” effect, where only papers that reject the null hypothesis (i.e. have interesting results) get published. The more serious allegation is that researchers have data-dredged: conducted an experiment and then looked for plausible hypotheses that they can claim to have “tested” and found a positive result for, or run various statistical analyses until they found one that suited them (it’s very optimistic to claim that this doesn’t happen!). As you say, in this kind of investigation there are bound to be many such spurious results to exploit, particularly in a field where 5% significance is the standard: with 100 independent tests at that level, the chance of at least one nominally significant false positive is 1 − 0.95^100 ≈ 99.4%.
I was questioning the likelihood that the researchers really thought they had good cause to conduct an experiment to test the hypothesis that some apparently obscure genotype was causally related to people making utilitarian moral judgements. It doesn’t pass the laugh test for me, but that might just be my ignorance.
That this lab has previous studies dealing both with trolley-like problems and with 5-HTTLPR is the salient fact.