Looking at the article and the abstract, it seems equally plausible that we have just gotten a lot better recently at detecting fraud and plagiarism and error. Is there any reason that the study excludes that possibility?
This can be tested: are old papers being detected as fraudulent?
Wish I’d thought to ask this question myself. The paper doesn’t have the data to answer this, but with a little work one can search PubMed for retractions. Interestingly, I get 2,418 hits while the paper mentions only 2,047.
I grabbed the 441 retraction records for 2011 (in MEDLINE format for easier processing) and slapped together a script to extract the references. There were 458 references (some retractions cover multiple publications) and my script pulled 415 publication years from them. Some papers had no year because they were referenced only by DOI and the DOI didn’t include a year; some had two publication years because they had both a formal publication date and a DOI containing a year (presumably these are the papers that appear online before getting a formal volume number & page allocation).
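For anyone who wants to repeat this, here is a minimal sketch of what such a script might look like. It is not the script used for the numbers in this comment. It assumes the retracted papers are listed in `ROF` ("Retraction of") fields of the MEDLINE export, that continuation lines are indented by six spaces, and that any standalone four-digit number starting with 19 or 20 is a year. The sample records and the function names are made up for illustration, and taking the earliest year when a reference contains two is an arbitrary choice.

```python
import re
from collections import Counter

# Synthetic records in MEDLINE layout, for illustration only (not real PubMed data).
SAMPLE = """\
PMID- 00000001
TI  - Retraction notice (synthetic example).
ROF - Doe J, Roe R. J Hypothetical Sci. 2009 Mar;12(3):45-50
ROF - Doe J. J Hypothetical Sci. 2010 Jan 5;13(1):1-9

PMID- 00000002
TI  - Another retraction notice (synthetic example).
ROF - Roe R, et al. Hypothetical Lett. doi: 10.0000/hl.2008.0042. 2010
      Jun;7(2):99-104
"""

# Assumption: a standalone 19xx or 20xx number is a publication year.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def retracted_references(medline_text):
    """Yield each ROF field as one string, joining continuation lines."""
    tag, parts = None, []
    # The extra "" acts as a final blank line so the last field is flushed.
    for line in medline_text.splitlines() + [""]:
        if line[4:6] == "- ":
            # A new field starts: emit the previous one if it was an ROF field.
            if tag == "ROF":
                yield " ".join(parts)
            tag, parts = line[:4].strip(), [line[6:].strip()]
        elif line.startswith("      ") and tag is not None:
            # Continuation of the current field.
            parts.append(line.strip())
        else:
            # Blank (or unrecognised) line: the record has ended.
            if tag == "ROF":
                yield " ".join(parts)
            tag, parts = None, []


def tally_years(medline_text):
    """Count retracted references by the earliest year found in each one."""
    counts = Counter()
    for reference in retracted_references(medline_text):
        years = [int(y) for y in YEAR.findall(reference)]
        if years:  # references without any recognisable year are skipped
            counts[min(years)] += 1
    return counts


if __name__ == "__main__":
    for year, count in sorted(tally_years(SAMPLE).items()):
        print(f"{year}: {count}")
```

A crude year regex like this can also match page ranges that happen to look like years, so the output is worth spot-checking by hand.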
A tally of the original publication years for those 415 retractions:
1998: 2
1999: 8
2000: 7
2001: 13
2002: 20
2003: 11
2004: 13
2005: 33
2006: 33
2007: 31
2008: 32
2009: 66
2010: 109
2011: 37
Looks like really old papers aren’t being retracted, which surprises me! I would’ve expected at least a handful from the 1980s, but last year’s retractions stop dead at 1997. Somehow I doubt the shitty pre-1998 papers have all already been retracted.
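To put a number on how skewed this is toward recent papers, here is a small sketch that uses only the tally above. The function name `share_published_since` is made up for illustration.

```python
# Tally copied from the comment: original publication year -> retracted papers.
TALLY = {
    1998: 2, 1999: 8, 2000: 7, 2001: 13, 2002: 20, 2003: 11, 2004: 13,
    2005: 33, 2006: 33, 2007: 31, 2008: 32, 2009: 66, 2010: 109, 2011: 37,
}


def share_published_since(tally, first_year):
    """Fraction of tallied papers published in first_year or later."""
    total = sum(tally.values())
    recent = sum(n for year, n in tally.items() if year >= first_year)
    return recent / total


if __name__ == "__main__":
    for first_year in (2009, 2005, 2001):
        share = share_published_since(TALLY, first_year)
        print(f"published {first_year} or later: {share:.0%}")
```

By this count, 212 of the 415 papers retracted in 2011 (about half) had been published in 2009 or later.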
[Edited to fix list and 2003 figure.]
Other possibilities:
Scientists have gotten worse at hiding their fraud, or at judging when to be fraudulent.
The total number of scientists and publications increases over time, so the total amount of fraud also increases (the original paper is paywalled, but the abstract doesn’t say that they correct for that).
The proportion of papers published each year that enters PubMed rises with time. PubMed was created in 1996. I don’t know when the database behind it was created, but probably later than 1950. Then the older articles in the database are the ones that were best remembered. Insignificant articles, and probably ones that were retracted or shown to be wrong, simply aren’t in the database.
Perhaps there is a way for articles to be removed from PubMed if retracted (?)
Many retractions happen between an article’s submission and its publication (because the authors or reviewers notice issues), and so before the article enters PubMed. Some fraud thus goes unreported in PubMed.
The Internet’s dissemination and archival of information makes it easier to discover fraud and less likely that the discovery will be hushed up or forgotten.
Even if that’s true, creating the perception that more people cheat is likely to encourage more people to cheat.
Is there a reason to believe we’ve got 3-10x better at detecting fraud in the past decade?
Given enough eyeballs, all bugs are shallow.
Not literally true, but I wouldn’t be surprised if the expansion of access to electronic articles, and the growth in the number of people with access to them, has resulted in a 3-10x greater read rate for the important articles.
Well, is there a reason to believe scientists have become 3x-10x more fraudulent in the past decade?
Possible reasons for a scientist to be fraudulent are glory and fierce competition (which is usually for jobs and grants). Both factors were as prominent in the past as they are now. On the other hand, as buybuydandavis points out, there are good reasons to believe that we’ve become better at spotting suspect scientific articles.
I like to include ‘money’ in lists regarding motives for fraud too. There is plenty of that floating about in (certain kinds of) science.