This is a linkpost for Gideon Lewis-Kraus’s New Yorker article on the (alleged) Ariely and Gino data fraud scandals. I’ve been following this situation off-and-on for a while (and even more so after the original datacolada blog posts). The basic story is that multiple famous professors in social psychology (specializing in dishonesty) have been caught with blatant data fraud. The field to a large extent tried to “protect their own,” but in the end the evidence became too strong. Francesca Gino has since retreated to attempting to sue datacolada (the investigators).
Despite the tragic nature of the story, I consider this material hilarious high entertainment, in addition to being quite educational.
The writing is also quite good, as I’ve come to expect from Gideon Lewis-Kraus (who locals might have heard of from his in-depth profiles on Slate Star Codex, Will MacAskill, and the FTX crash).
Some quotes:
If you tortured the data long enough, as one grim joke went, it would confess to anything. They called such techniques “p-hacking.” As they later put it, “Everyone knew it was wrong, but they thought it was wrong the way it’s wrong to jaywalk.” In fact, they wrote, “it was wrong the way it’s wrong to rob a bank.”
Ziani [a young grad student] found Gino’s results implausible, and assumed that they had been heavily p-hacked. She told me, “This crowd is used to living in a world where you have enough degrees of freedom to do whatever you want and all that matters is that it works beautifully.” But an adviser strongly suggested that Ziani “build on” the paper, which had appeared in a top journal. When she expressed her doubts, the adviser snapped at her, “Don’t ever say that!” Members of Ziani’s dissertation committee couldn’t understand why this nobody of a student was being so truculent. In the end, two of them refused to sign off on her degree if she did not remove criticisms of Gino’s paper from her dissertation. One warned Ziani not to second-guess a professor of Gino’s stature in this way. In an e-mail, the adviser wrote, “Academic research is like a conversation at a cocktail party. You are storming in, shouting ‘You suck!’ ”
A former senior researcher at the lab told me, “He assured us that the effect was there, that this was a true thing, and I was convinced he completely believed it.”
[...]
The former senior researcher said, “How do you swim through that murky area of where is he lying? Where is he stretching the truth? What is he forgetting or misremembering? Because he does all three of those things very consistently. So when it really matters—like with the auto insurance—which of these three things is it?”
(Meme made by myself)
What a quote:
I heard a pretty haunting take about how long it took to discover steroids in bike races. Apparently, there was a while where a “few bad apples” narrative remained popular even when an ostensibly “one of the good ones” guy was outperforming guys discovered to be using steroids.
I’m not sure how dire or cynical we should be about academic knowledge or incentives. I think it’s more or less defensible to assume that no one with a successful career is doing anything real until proven otherwise, but it’s still a very extreme view that I’d personally bet against. Of course, things also vary a lot field by field.
My current (maybe boring) view is that any academic field where the primary mode of inquiry is applied statistics (much of the social sciences and medicine) is suss. The fields where the primary tool is mathematics (pure mathematics, theoretical CS, game theory, theoretical physics) still seem safe, and the fields where the primary tool is computers (distributed systems, computational modeling in various fields) are reasonably safe. ML is somewhere in between computers and statistics.
Fields where the primary tool is just looking around and counting (demography, taxonomy, astronomy(?)) are probably safe too? I’m confused about how to orient towards the humanities.
I don’t think this is a sufficiently complete way of looking at things. It could make sense when the problem was thought to be “replication crisis via p-hacking” but it turns out things are worse than this.
The research methodology in biology doesn’t necessarily have room for statistical funny business but there are all these cases of influential Science/Nature papers that had fraud via photoshop.
Gino and Ariely’s papers might have been statistically impeccable; the problem is they were just making up data points.
There is fraud in experimental physics and applied sciences too from time to time.
I don’t know much about what opportunities there are for bad research practices in the humanities. The only thing I can think of is citing a source that doesn’t say what is claimed. This seems like a particular risk when history or historical claims are involved, or when a humanist wants to refer to the scientific literature. The spectacular claim that Victorian doctors treated “hysteria” using vibrators turns out to have resulted from something like this.
Outside cases like that, I think the humanities are mostly “safe” like math in that they just need some kind of internal consistency, whether that is presenting a sound argument, or a set of concepts and descriptions that people find to be harmonious or fruitful.
In this incident something was true because the “experts” decided it must be true. That’s humanities in (almost?) every incident.
Given that the replication rate of findings isn’t zero, it seems that some researchers aren’t completely fraudulent and are doing at least partly “real work”.
An interesting question is how many failed replications are due to fraud. Are 20%, 50%, or 80% of the studies that don’t replicate fraudulent?
Unfortunately, it’s not that easy. Even a stopped clock is right twice a day: there is a base rate of being ‘right by accident’. Meehl talks a lot about this: if you are in particle physics and you’re predicting the 15th digit of some constant, then the base rate is ~0% and so it’s hard to be right by accident; if you’re in psychology, you’re usually predicting whether some number is less than or greater than zero (because it’s never zero) because most predictions are so ill-defined that they could be just about any size and no one will blink an eye, and so your base rate of being ‘right’ by accident is… substantially higher than 0%. So depending on how you define ‘replication’, it’d be easy to have anywhere up to 50% replication rates with completely fraudulent fields of research.
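To make the "up to 50%" figure concrete, here is a toy simulation of the sign-guessing argument. It is only a sketch under loud assumptions that are mine, not from the article or from Meehl: every original finding is fabricated and claims a direction unrelated to the truth, true effects are small but never exactly zero, replications are powerful enough to always recover the true sign, and a replication "succeeds" whenever the sign matches the claim. The function name and parameters are hypothetical.

```python
import random


def simulated_replication_rate(n_studies=100_000, crud_sd=0.1, seed=0):
    """Fraction of fabricated findings that 'replicate' by sign alone."""
    rng = random.Random(seed)
    same_sign = 0
    for _ in range(n_studies):
        # Assumption: the true effect is small crud, essentially never zero.
        true_effect = rng.gauss(0.0, crud_sd)
        # Assumption: the fabricated claim picks a direction at random,
        # independently of the true effect.
        claimed_sign = rng.choice([-1, 1])
        # Assumption: the replication is powered well enough to detect
        # the true sign every time.
        replicated_sign = 1 if true_effect > 0 else -1
        if claimed_sign == replicated_sign:
            same_sign += 1
    return same_sign / n_studies


if __name__ == "__main__":
    print(f"sign-match replication rate: {simulated_replication_rate():.3f}")
```

Under these assumptions the rate hovers around one half. A stricter definition of replication (say, matching the claimed effect size rather than just its direction) would push the accidental rate down, which is the sense in which the answer depends on how replication is defined.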
Given your past writing I’m a bit surprised by that position. I thought you wrote that in a lot of cases, the causative effect of most interventions is very little.
The effect is of course not zero but I would expect them most of the time not to have effect sizes that are strong enough to show up.
If publishable true effects were that easy to come by, fraud wouldn’t be really needed.
I’m assuming that the replications have been sufficiently powered as to always exclude the null—if only because they are reaching the ‘crud factor’ level.
They aren’t easy to come by, which is why Many Labs and the replication efforts exist pretty much solely because Arnold chose to bankroll them. The history of meta-science and psychology in the 2010s would look rather different if one Texan billionaire had different interests.
Here are some Manifold questions about this situation (most from me):
It’s just one of the suspects (Gino) who is suing, right?
Yes, sorry. Will edit.