So far, this is probably fairly obvious. It’s also fairly clear that, unless everyone believes you to be a perfect Bayesian reasoner, it is certainly possible that by holding certain beliefs you are signalling moral stances, even though the two should be independent.
When I worried that the correlation between testosterone and politics means that political opinions are hopelessly biased by emotions
So what should I conclude about your attitude towards men from your use of “testosterone” in that sentence?
Well, ideally you would conclude that I was thinking about the digit ratios measured in the LW survey, which correlate with testosterone but not estrogen.
Estrogen does affect politics too, and when an experiment proved this and was reported in popular science magazines (Scientific American, I think) the feminists lost their minds and demanded that the reporter be fired, despite the fact that both the reporter and the scientists were female.
What do you think of Gelman’s criticism of the paper as, on scientific grounds, complete tosh? Or as he puts it, after a paragraph of criticisms that amount to that verdict, “the evidence from their paper isn’t as strong as they make it out to be”?
Well, the statistical criticisms they mention seem less damning than the statistical problems of the average psych paper.
Beyond all that, I found the claimed effects implausibly large. For example, they report that, among women in relationships, 40% in the ovulation period supported Romney, compared to 23% in the non-fertile part of their cycle.
This does seem rather large, unless they specifically targeted undecided swing voters. But it’s far from the only psych paper with an unreasonably large effect size.
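To put the 40% versus 23% figure in perspective, here is a rough sketch of how much the uncertainty around that 17-point gap depends on group size. The two percentages are the ones quoted above; the group sizes are hypothetical, since the thread does not state them, and the interval is a simple normal-approximation (Wald) interval rather than whatever the paper's authors used.

```python
import math
from statistics import NormalDist


def diff_ci(p1, p2, n1, n2, level=0.95):
    """Normal-approximation (Wald) interval for the difference p1 - p2."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# 0.40 and 0.23 come from the quoted passage; the n values are made up.
for n in (25, 50, 100, 400):
    lo, hi = diff_ci(0.40, 0.23, n, n)
    print(f"n per group = {n:3d}: interval for the gap = ({lo:+.3f}, {hi:+.3f})")
```

Under this approximation, a few dozen women per group leaves an interval wide enough to include zero, while several hundred per group does not. That is one way to read "weak evidence": the point estimate can be large and still be poorly pinned down.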
Basically, this paper probably actually only constitutes weak evidence, like most of psychology. But it sounds good enough to be published.
Incidentally, I have a thesis in mathematical psychology due in a few days, in which I (among other things) fail to replicate a paper published in Nature, no matter how hard I massage the data.
Well, the statistical criticisms they mention seem less damning than the statistical problems of the average psych paper.
Talk about faint praise!
But it’s far from the only psych paper with an unreasonably large effect size.
It’s far from the only psych paper Gelman has slammed either.
Basically, this paper probably actually only constitutes weak evidence, like most of psychology.
Such volumes of faint praise!
But it sounds good enough to be published.
The work of Ioannidis and others is well-known, and it’s clear that the problems he identifies in medical research apply as much or more to psychology. Statisticians such as Gelman pound on junk papers. And yet people still consider stuff like the present paper (which I haven’t read, I’m just going by what Gelman says about it) to be good enough to be published. Why?
Gelman says, and I quote, “...let me emphasize that I’m not saying that their claims (regarding the effects of ovulation) are false. I’m just saying that the evidence from their paper isn’t as strong as they make it out to be.” I think he would say this about 90%+ of papers in psych.
The work of Ioannidis and others is well-known, and it’s clear that the problems he identifies in medical research apply as much or more to psychology.
Medical research has massive problems of its own, because of the profit motive to fake data.
Statisticians such as Gelman pound on junk papers. And yet people still consider stuff like the present paper (which I haven’t read, I’m just going by what Gelman says about it) to be good enough to be published. Why?
Well, my cynical side would like to say that it’s not in anyone’s interests to push for higher standards—rocking the boat will not advance anyone’s career.
But maybe we’re holding people to unreasonably high standards. Expecting one person to be able to do psychology and neuroscience and stats and computer programming seems like an unreasonable demand, and yet this is what is expected. Is it any wonder that some people who are very good at psychology might screw up the stats?
I had wondered about whether the development of some sort of automated stats program would help. By this, I mean that instead of inputting the data and running a t-test manually, the program determines whether the data is approximately normally distributed, whether taking logs will transform it to a normal distribution, and so forth, before running the appropriate analysis and spitting out a write-up which can be dropped straight into the paper.
It would save a lot of effort and avoid a lot of mistakes. If there is a consensus that certain forms of reporting are better than others, e.g.
Instead, what do we get? Several pages full of averages, percentages, F tests, chi-squared tests, and p-values, all presented in paragraph form. Better to have all possible comparisons in one convenient table.
Then the program could present the results in an absolutely standard format.
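A toy version of such a program might look like the sketch below. Everything in it is an assumption made for illustration: the function names, the crude skewness cutoff of 1.0 used as a stand-in for a normality check, the fallback to ranks, and the use of a permutation test in place of a t-test (so that it runs on the Python standard library alone).

```python
import math
import random


def skewness(xs):
    """Third standardized moment; roughly 0 for symmetric data."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    if var == 0:
        return 0.0
    return sum((x - m) ** 3 for x in xs) / n / var ** 1.5


def choose_scale(a, b, threshold=1.0):
    """Pick 'raw', 'log', or 'rank' using a crude skewness rule (an assumption)."""
    groups = (a, b)
    if all(abs(skewness(g)) <= threshold for g in groups):
        return "raw"
    if all(min(g) > 0 for g in groups) and all(
        abs(skewness([math.log(x) for x in g])) <= threshold for g in groups
    ):
        return "log"
    return "rank"


def permutation_p(a, b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    k = len(a)
    pooled = list(a) + list(b)

    def gap(xs):
        return abs(sum(xs[:k]) / k - sum(xs[k:]) / (len(xs) - k))

    observed = gap(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if gap(pooled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def analyze(a, b):
    """Choose a scale, run the test, and return a standard one-line write-up."""
    scale = choose_scale(a, b)
    if scale == "log":
        a, b = [math.log(x) for x in a], [math.log(x) for x in b]
    elif scale == "rank":
        order = sorted(list(a) + list(b))
        rank = {v: i + 1 for i, v in enumerate(order)}  # ties handled crudely
        a, b = [rank[x] for x in a], [rank[x] for x in b]
    diff = sum(a) / len(a) - sum(b) / len(b)
    p = permutation_p(a, b)
    return f"Scale: {scale}. Mean difference: {diff:.3f}. Permutation p-value: {p:.4f}."


print(analyze([1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]))
```

Even in this small form, the hard-coded cutoff and the fixed order of fallbacks are judgment calls that a real tool would have to defend.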
Expecting one person to be able to do psychology and neuroscience and stats and computer programming seems like an unreasonable demand
Most papers have multiple authors. If you need to do heavy lifting in stats, bring a statistician on board.
whether the development of some sort of automated stats program would help
I don’t think so. First, I can’t imagine it being flexible enough (and if it’s too flexible its reason for existence is lost) and second it will just be gamed. People like Gelman think that the reliance on t-tests is a terrible idea, anyway, and I tend to agree with him.
My preference is for a radical suggestion: make papers openly provide their data and their calculations (e.g. as a download). After all, this is supposed to be science, right?
This “radical” suggestion is now a funding condition of at least some UK research councils (along with requirements to publish publicly funded work in open access forms). A very positive move... if enforced.
Most papers have multiple authors. If you need to do heavy lifting in stats, bring a statistician on board.
I don’t think this just applies to heavy lifting—basic stats are pretty confusing given that most seem to rely on the assumption of a normal distribution, which is a mathematical abstraction that rarely occurs in real life. And in reality, people don’t bring specialists on board, at least not that I have seen.
My preference is for a radical suggestion: make papers openly provide their data and their calculations (e.g. as a download). After all, this is supposed to be science, right?
I understand why this was not done back when journals were printed on paper, but it really should be done now.
basic stats are pretty confusing given that most seem to rely on the assumption of a normal distribution
If a psych researcher finds “basic stats” confusing, he is not qualified to write a paper which looks at statistical interpretations of whatever results he got. He should either acquire some competency or stop pretending he understands what he is writing.
Many estimates do rely on the assumption of a normal distribution in the sense that these estimates have characteristics (e.g. “unbiased” or “most efficient”) which are mathematically proven in the normal distribution case. If this assumption breaks down, these characteristics are no longer guaranteed. This does not mean that the estimates are now “bad” or useless; in many cases they are still the best you could do given the data.
To give a crude example, 100 is guaranteed to be the biggest number in the [1 .. 100] set of integers. If your set of integers is “from one to about a hundred, more or less”, 100 is no longer guaranteed to be the biggest, but it’s still not a bad estimate of the biggest number in that set.
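The same point can be illustrated with a small simulation: a nominal 95% interval for a mean, built from normal theory, is applied once to normal data and once to skewed (exponential) data. The sample size, number of trials, seed, and choice of an exponential distribution are all arbitrary assumptions for this sketch, and it uses the normal critical value rather than a t quantile to stay within the standard library, so even the normal case tends to land a little under 95%.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage(draw, true_mean, n=20, trials=4000, seed=1):
    """Fraction of nominal-95% normal-theory intervals that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(trials):
        xs = [draw(rng) for _ in range(n)]
        half_width = z * stdev(xs) / n ** 0.5
        if abs(mean(xs) - true_mean) <= half_width:
            hits += 1
    return hits / trials


# Normal data with mean 0, and exponential data with mean 1.
normal_cov = coverage(lambda r: r.gauss(0.0, 1.0), true_mean=0.0)
skewed_cov = coverage(lambda r: r.expovariate(1.0), true_mean=1.0)
print(f"normal data: {normal_cov:.3f}   exponential data: {skewed_cov:.3f}")
```

The interval stops delivering exactly what its label promises once the assumption fails, but it typically remains in the right neighbourhood rather than becoming useless.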
If a psych researcher finds “basic stats” confusing, he is not qualified to write a paper which looks at statistical interpretations of whatever results he got. He should either acquire some competency or stop pretending he understands what he is writing.
The problem is that psychology and statistics are different skills, and someone who is talented at one may not be talented at the other.
To give a crude example, 100 is guaranteed to be the biggest number in the [1 .. 100] set of integers. If your set of integers is “from one to about a hundred, more or less”, 100 is no longer guaranteed to be the biggest, but it’s still not a bad estimate of the biggest number in that set.
I take your point, but you can no longer say that 100 is the biggest number with 95% confidence, and this is the problem.
someone who is talented at one may not be talented at the other.
You don’t need to be talented, you only need to be competent. If you can’t pass even that low bar, maybe you shouldn’t publish papers which use statistics.
you can no longer say that 100 is the biggest number with 95% confidence, and this is the problem.
I don’t see any problem here.
First, 95% is an arbitrary number, it’s pure convention that does not correspond to any joint in the underlying reality.
Second, the t-test does NOT mean what most people think it means. See e.g. this or this.
Third, and most important, your certainty level should be entirely determined by the data. If your data does not support 95% confidence, then it does not. Trying to pretend otherwise is fraud.
I had wondered about whether the development of some sort of automated stats program would help. By this, I mean that instead of inputting the data and running a t-test manually, the program determines whether the data is approximately normally distributed, whether taking logs will transform it to a normal distribution, and so forth, before running the appropriate analysis and spitting out a write-up which can be dropped straight into the paper.
Sounds like the mythical Photoshop “Make Art” button.
Estrogen does affect politics too, and when an experiment proved this and was reported in popular science magazines (Scientific American, I think) the feminists lost their minds and demanded that the reporter be fired, despite the fact that both the reporter and the scientists were female.
Now consider what kind of publication biases incidents like that introduce.
You may have heard accusations that conservatives are “anti-science”. Most of said “anti-science” behavior is conservatives applying a filter to scientific results attempting to correct for the above bias.
Of course this doesn’t give one a licence to simply ignore science that disagrees with one’s politics. Perhaps two PC papers are about as reliable as one non-PC paper? Very difficult to properly calibrate, I would think, and of course the reliability varies from field to field.
Estrogen does affect politics too, and when an experiment proved this and was reported in popular science magazines (Scientific American, I think) the feminists lost their minds and demanded that the reporter be fired, despite the fact that both the reporter and the scientists were female.
The problem is that the experiment likely didn’t prove it. A single experiment doesn’t prove anything. Then the reporter overstated the results, which is quite typical for science reporters, and people complained.
The problem is that the experiment likely didn’t prove it.
Yes, it is true that there are massive problems in failure to replicate in psychology, not to mention bad statistics etc. However, a single experiment is still evidence in favour.
Then the reporter overstated the results
Actually, the reporter understated the results, for instance by including this quote from an academic who disagrees:
“There is absolutely no reason to expect that women’s hormones affect how they vote any more than there is a reason to suggest that variations in testosterone levels are responsible for variations in the debate performances of Obama and Romney,” said Susan Carroll, professor of political science and women’s and gender studies at Rutgers University, in an e-mail.
Carroll sees the research as following in the tradition of the “long and troubling history of using women’s hormones as an excuse to exclude them from politics and other societal opportunities.”
Thing is, Prof. Carroll is not a neuroscientist. So what gives her the right to tell neuroscientists that they are wrong about neuroscience?
Yes, it is true that there are massive problems in failure to replicate in psychology, not to mention bad statistics etc. However, a single experiment is still evidence in favour.
Whether the reporter should be fired is not only about the quality of the experiment.
Thing is, Prof. Carroll is not a neuroscientist. So what gives her the right to tell neuroscientists that they are wrong about neuroscience?
Whether the article clearly communicates the scientific knowledge that exists. Most mainstream media articles about science don’t.
Yes, obviously she has the legal right to argue about things she has no understanding of, and equally obviously I was not talking about legal rights.
If the journalist quotes her, that likely means he called her on the phone and asked her for her opinion.
If you think he should have asked somebody different then the journalist is at fault.
Thing is, Prof. Carroll is not a neuroscientist. So what gives her the right to tell neuroscientists that they are wrong about neuroscience?
Is that what she’s saying? My charitable reading suggests that Prof. Carroll is saying that either hormones don’t affect politics, or else they have an effect for both sexes. Her problem appears to be with the experiment singling out women and their hormones.
As a political scientist, I’m sure she’s familiar with the shameful historical record of science being used to justify some rather odious public policies (racism, eugenics, forced sterilization, etc.). I don’t think she’s as concerned with the actual science as with what people might do with the result, especially if it gets sensationalized.
I think what she’s saying is “You wouldn’t say that men’s hormones affect politics, so why would you say that women’s hormones do?”
But what she doesn’t realise, because she failed to actually talk to actual neuroscientists, is that most neuroscientists would say that hormones affect both men and women.
The reason why the experiment singled out women probably isn’t sexism; it’s probably because it’s better career-wise to do one paper on women and one on men rather than combining them into one paper, as this gets you twice the number of publications.
Again, I’m trying to see this from a different perspective:
To us, it’s an issue of science. We respect science because we understand it. We can read that study and get the gist of what it’s saying and what it’s not saying. To practitioners of the Dark Arts, however, truth is not an end in itself but merely one more aspect of a debate, to be exploited or circumvented as the situation requires.
In the realm of public debate, science can either be infallible truth or else a complete fabrication (depending on whether it supports your position). Think about it: one study, long since repudiated, fueled the anti-vaccination movement, which has been chipping away at decades of progress and may lead to new outbreaks of diseases we long ago stopped caring about. The proponents may point to that study and say “Aha! Science says vaccines cause autism” while dismissing the mountain of opposing evidence as a conspiracy by Big Pharma.
So what does this have to do with Dr. Carroll’s concerns?
The reason why the experiment singled out women probably isn’t sexism; it’s probably because it’s better career-wise to do one paper on women and one on men rather than combining them into one paper, as this gets you twice the number of publications.
This. She fears the study about the effects of men’s hormones gets ignored, while the study on women’s hormones gets spun, exaggerated, and sensationalized into another iteration of “women are irrational and hysterical.” It’s a lot harder to do this with one study about people in general than two different studies.
EDIT: The point here is that once a scientific paper gets published, neither the author nor the scientific community gets to decide how the research is used or presented.
To practitioners of the Dark Arts, however, truth is not an end in itself but merely one more aspect of a debate, to be exploited or circumvented as the situation requires.
I broadly agree with what you say; however, the dark arts are called dark for a reason.
Ironically, while the counter-argument generally used against this is “It’s sexist pseudoscience!”, there is a perfectly valid explanation which is neither demeaning to women nor in disagreement with experimental results: simply that hormones affect both men’s and women’s opinions.
Why be so quick to resort to the dark side when there is a perfectly good light-side explanation?
Ironically, while the counter-argument generally used against this is “It’s sexist pseudoscience!”, there is a perfectly valid explanation which is neither demeaning to women nor in disagreement with experimental results: simply that hormones affect both men’s and women’s opinions.
I agree with this completely. I was merely trying to see what kind of mindset would produce Dr. Carroll’s reaction and some politics/Dark Arts was the best I could come up with.
So what should I conclude about your attitude towards men from your use of “testosterone” in that sentence?
Well, ideally you would conclude that I was thinking about the digit ratios measured in the LW survey, which correlate with testosterone but not estrogen.
Estrogen does affect politics too, and when an experiment proved this and was reported in popular science magazines (Scientific American, I think) the feminists lost their minds and demanded that the reporter be fired, despite the fact that both the reporter and the scientists were female.
EDIT: and the article was, in fact, censored.
Are you referring to this article “The Fluctuating Female Vote: Politics, Religion, and the Ovulatory Cycle”? As discussed here?
Yes, I am.
What do you think of Gelman’s criticism of the paper as, on scientific grounds, complete tosh? Or as he puts it, after a paragraph of criticisms that amount to that verdict, “the evidence from their paper isn’t as strong as they make it out to be”?
Well, the statistical criticisms they mention seem less damning than the statistical problems of the average psych paper.
This does seem rather large, unless they specifically targeted undecided swing voters. But it’s far from the only psych paper with an unreasonably large effect size.
Basically, this paper probably actually only constitutes weak evidence, like most of psychology. But it sounds good enough to be published.
Incidentally, I have a thesis in mathematical psychology due in a few days, in which I (among other things) fail to replicate a paper published in Nature, no matter how hard I massage the data.
Talk about faint praise!
It’s far from the only psych paper Gelman has slammed either.
Such volumes of faint praise!
The work of Ioannidis and others is well-known, and it’s clear that the problems he identifies in medical research apply as much or more to psychology. Statisticians such as Gelman pound on junk papers. And yet people still consider stuff like the present paper (which I haven’t read, I’m just going by what Gelman says about it) to be good enough to be published. Why?
Gelman says, and I quote, “...let me emphasize that I’m not saying that their claims (regarding the effects of ovulation) are false. I’m just saying that the evidence from their paper isn’t as strong as they make it out to be.” I think he would say this about 90%+ of papers in psych.
Yes. I think he would too. So much the worse for psychology.
And yet people are willing to take its pronouncements seriously.
Medical research has massive problems of its own, because of the profit motive to fake data.
Well, my cynical side would like to say that it’s not in anyone’s interests to push for higher standards—rocking the boat will not advance anyone’s career.
But maybe we’re holding people to unreasonably high standards. Expecting one person to be able to do psychology and neuroscience and stats and computer programming seems like an unreasonable demand, and yet this is what is expected. Is it any wonder that some people who are very good at psychology might screw up the stats?
I had wondered about whether the development of some sort of automated stats program would help. By this, I mean that instead of inputting the data and running a t-test manually, the program determines whether the data is approximately normally distributed, whether taking logs will transform it to a normal distribution, and so forth, before running the appropriate analysis and spitting out a write-up which can be dropped straight into the paper.
It would save a lot of effort and avoid a lot of mistakes. If there is a consensus that certain forms of reporting are better than others, e.g.
Then the program could present the results in an absolutely standard format.
Most papers have multiple authors. If you need to do heavy lifting in stats, bring a statistician on board.
I don’t think so. First, I can’t imagine it being flexible enough (and if it’s too flexible its reason for existence is lost) and second it will just be gamed. People like Gelman think that the reliance on t-tests is a terrible idea, anyway, and I tend to agree with him.
My preference is for a radical suggestion: make papers openly provide their data and their calculations (e.g. as a download). After all, this is supposed to be science, right?
This “radical” suggestion is now a funding condition of at least some UK research councils (along with requirements to publish publicly funded work in open access forms). A very positive move... if enforced.
I don’t think this just applies to heavy lifting—basic stats are pretty confusing given that most seem to rely on the assumption of a normal distribution, which is a mathematical abstraction that rarely occurs in real life. And in reality, people don’t bring specialists on board, at least not that I have seen.
I understand why this was not done back when journals were printed on paper, but it really should be done now.
If a psych researcher finds “basic stats” confusing, he is not qualified to write a paper which looks at statistical interpretations of whatever results he got. He should either acquire some competency or stop pretending he understands what he is writing.
Many estimates do rely on the assumption of a normal distribution in the sense that these estimates have characteristics (e.g. “unbiased” or “most efficient”) which are mathematically proven in the normal distribution case. If this assumption breaks down, these characteristics are no longer guaranteed. This does not mean that the estimates are now “bad” or useless; in many cases they are still the best you could do given the data.
To give a crude example, 100 is guaranteed to be the biggest number in the [1 .. 100] set of integers. If your set of integers is “from one to about a hundred, more or less”, 100 is no longer guaranteed to be the biggest, but it’s still not a bad estimate of the biggest number in that set.
The problem is that psychology and statistics are different skills, and someone who is talented at one may not be talented at the other.
I take your point, but you can no longer say that 100 is the biggest number with 95% confidence, and this is the problem.
You don’t need to be talented, you only need to be competent. If you can’t pass even that low bar, maybe you shouldn’t publish papers which use statistics.
I don’t see any problem here.
First, 95% is an arbitrary number, it’s pure convention that does not correspond to any joint in the underlying reality.
Second, the t-test does NOT mean what most people think it means. See e.g. this or this.
Third, and most important, your certainty level should be entirely determined by the data. If your data does not support 95% confidence, then it does not. Trying to pretend otherwise is fraud.
Sounds like the mythical Photoshop “Make Art” button.
It was pointed out a long time ago that a programmer’s keyboard really needs to have a DWIM (Do What I Mean) key...
Now consider what kind of publication biases incidents like that introduce.
Well, one would hope that journals would continue to publish, but the public understanding of science is inevitably going to suffer.
How about what’s actually likely to happen, as opposed to what one would hope would happen.
What is likely to happen is that publication bias increases against non-PC results.
Correct.
You may have heard accusations that conservatives are “anti-science”. Most of said “anti-science” behavior is conservatives applying a filter to scientific results attempting to correct for the above bias.
Of course this doesn’t give one a licence to simply ignore science that disagrees with one’s politics. Perhaps two PC papers are about as reliable as one non-PC paper? Very difficult to properly calibrate, I would think, and of course the reliability varies from field to field.
The problem is that the experiment likely didn’t prove it. A single experiment doesn’t prove anything. Then the reporter overstated the results, which is quite typical for science reporters, and people complained.
Yes, it is true that there are massive problems in failure to replicate in psychology, not to mention bad statistics etc. However, a single experiment is still evidence in favour.
Actually, the reporter understated the results, for instance by including this quote from an academic who disagrees:
Thing is, Prof. Carroll is not a neuroscientist. So what gives her the right to tell neuroscientists that they are wrong about neuroscience?
Whether the reporter should be fired is not only about the quality of the experiment.
The journalist in this case.
What criteria would you advocate then?
Yes, obviously she has the legal right to argue about things she has no understanding of, and equally obviously I was not talking about legal rights.
Whether the article clearly communicates the scientific knowledge that exists. Most mainstream media articles about science don’t.
If the journalist quotes her, that likely means he called her on the phone and asked her for her opinion. If you think he should have asked somebody different then the journalist is at fault.
Is that what she’s saying? My charitable reading suggests that Prof. Carroll is saying that either hormones don’t affect politics, or else they have an effect for both sexes. Her problem appears to be with the experiment singling out women and their hormones.
As a political scientist, I’m sure she’s familiar with the shameful historical record of science being used to justify some rather odious public policies (racism, eugenics, forced sterilization, etc.). I don’t think she’s as concerned with the actual science as with what people might do with the result, especially if it gets sensationalized.
I think what she’s saying is “You wouldn’t say that men’s hormones affect politics, so why would you say that women’s hormones do?”
But what she doesn’t realise, because she failed to actually talk to actual neuroscientists, is that most neuroscientists would say that hormones affect both men and women.
The reason why the experiment singled out women probably isn’t sexism; it’s probably because it’s better career-wise to do one paper on women and one on men rather than combining them into one paper, as this gets you twice the number of publications.
Again, I’m trying to see this from a different perspective:
To us, it’s an issue of science. We respect science because we understand it. We can read that study and get the gist of what it’s saying and what it’s not saying. To practitioners of the Dark Arts, however, truth is not an end in itself but merely one more aspect of a debate, to be exploited or circumvented as the situation requires.
In the realm of public debate, science can either be infallible truth or else a complete fabrication (depending on whether it supports your position). Think about it: one study, long since repudiated, fueled the anti-vaccination movement, which has been chipping away at decades of progress and may lead to new outbreaks of diseases we long ago stopped caring about. The proponents may point to that study and say “Aha! Science says vaccines cause autism” while dismissing the mountain of opposing evidence as a conspiracy by Big Pharma.
So what does this have to do with Dr. Carroll’s concerns?
This. She fears the study about the effects of men’s hormones gets ignored, while the study on women’s hormones gets spun, exaggerated, and sensationalized into another iteration of “women are irrational and hysterical.” It’s a lot harder to do this with one study about people in general than two different studies.
EDIT: The point here is that once a scientific paper gets published, neither the author nor the scientific community gets to decide how the research is used or presented.
This describes Dr. Carroll very well.
I broadly agree with what you say; however, the dark arts are called dark for a reason.
Ironically, while the counter-argument generally used against this is “It’s sexist pseudoscience!”, there is a perfectly valid explanation which is neither demeaning to women nor in disagreement with experimental results: simply that hormones affect both men’s and women’s opinions.
Why be so quick to resort to the dark side when there is a perfectly good light-side explanation?
I agree with this completely. I was merely trying to see what kind of mindset would produce Dr. Carroll’s reaction and some politics/Dark Arts was the best I could come up with.
Reporters do this all the time. And yet they only get punished for it if the result is politically incorrect.
Yes, reporters get away with a lot. That doesn’t make it better.