How is it that capable thinkers and writers destroy their careers by publishing plagiarized paragraphs, sometimes with telling edits that show they didn’t just “forget to put quotes around it”?
Here is my mistake-theory hypothesis:
Authors know the outlines of their argument, but want to connect it with the literature. At this stage, they’re still checking their ideas against the data and theory, not trying to produce a polished document. So in their lit review, they quickly copy/paste relevant quotes into a file. They don’t bother to put quotation marks or links around them, because at this stage, they’re still just blitzing along producing private notes for themselves and exploring what’s been said before.
After they’ve assembled a massive quantity of these notes, which are interspersed with their own writing, they go back to produce their thesis. However, they’ve assimilated the “average style” of the literature they’re reading, and it’s hard to tell the quotes they pulled from the notes they wrote.
Occasionally, they mistake a quote they pulled from the literature for a note that’s in their own words. They use this “accidentally plagiarized” text in their own thesis. They aren’t even cognizant of this risk—they fully believe themselves to be reliably capable of discerning their own writing from somebody else’s.
As they edit the thesis, they edit the word choice of the accidentally plagiarized content to give it more polish.
They publish the thesis with the tweaked, accidentally plagiarized material that they fully believe is their own writing.
In an arena where plagiarism is harmful, I’d call this “negligence theory” rather than “mistake theory”. This isn’t just a misunderstanding or incorrect belief, it’s a sloppiness in research that (again, in domains where it matters) should cost the perpetrator a fair bit of standing and trust.
It matters a lot what they do AFTERWARD, too. Admitting it, apologizing, and publishing an updated version is evidence that it WAS a simple unintentional mistake. Hiding it, repeating the problem, etc. are evidence of either malice or negligence.
Edit: there’s yet another possibility, which is “intentional use of ideas without attribution”. In some kinds of writing, the author can endorse a very slight variant of someone else’s phrasing, and just use it as their own. It’s certainly NICER to acknowledge the contribution from the original source, but not REQUIRED except in formal settings.
In an arena where plagiarism is harmful, I’d call this “negligence theory” rather than “mistake theory”. This isn’t just a misunderstanding or incorrect belief, it’s a sloppiness in research...
It’s sloppy, but my question is whether it’s unusually sloppy. That’s the difference between a mistake and negligence.
Compare this to car accidents. We expect that there’s an elevated proportion of “consistently unusually sloppy driving” among people at fault for causing car accidents relative to the general driving population. For example, if we look at the population of people who’ve been at fault for a car accident, we will find a higher-than-average level of drunk driving, texting while driving, tired driving, speeding, dangerous maneuvers, and so on.
However, we might also want to know the absolute proportion of at-fault drivers who are consistently unusually sloppy drivers, relative to those who are average or better-than-average drivers who had a “moment of sloppy driving” that happened to result in an accident.
As a toy example, imagine the population is:
1⁄4 consistently good drivers. As a population, they’re responsible for 5% of accidents.
1⁄2 average drivers. As a population, they’re responsible for 20% of accidents.
1⁄4 consistently bad drivers. As a population, they’re responsible for 75% of accidents.
In this toy example, good and average drivers are together at fault for 25% of all accidents (5% + 20%).
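The arithmetic behind the toy example can be checked with a short sketch. It assumes each accident has exactly one at-fault driver, so that a group's share of accidents can be read as the probability that an at-fault driver belongs to that group. The function names are made up for illustration.

```python
from fractions import Fraction

# Toy numbers from the example above: (share of drivers, share of at-fault accidents).
groups = {
    "good": (Fraction(1, 4), Fraction(5, 100)),
    "average": (Fraction(1, 2), Fraction(20, 100)),
    "bad": (Fraction(1, 4), Fraction(75, 100)),
}

def share_not_bad(groups):
    """Share of at-fault accidents caused by drivers who are not consistently bad."""
    return sum(acc for name, (_, acc) in groups.items() if name != "bad")

def relative_risk(groups):
    """Accident share divided by population share, per group (1 = population average)."""
    return {name: acc / pop for name, (pop, acc) in groups.items()}

not_bad = share_not_bad(groups)
risk = relative_risk(groups)
print(f"P(not a bad driver | at fault) = {float(not_bad):.2f}")
for name, r in risk.items():
    print(f"{name}: {float(r):.1f}x the average per-driver accident rate")
```

Under these numbers, a consistently bad driver is at fault 3 times as often as the average driver, yet one at-fault driver in four is still not a consistently bad driver.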
When we see somebody cause an accident, this should, as you say, make us see them as substantially more likely to be a bad driver. It also sets good incentives to punish this mistake in proportion to the damage done, the evidence about underlying factors (e.g., drunk driving), the remorse they display, and their previous driving record.
However, we should also bear in mind that there’s a low but nonzero chance that they’re not a bad driver and just got unlucky.
Plagiarism interventions
Shifting back to plagiarism, the reason it can be useful to bear in mind that low-but-nonzero chance of a plagiarism “good faith error” is that it suggests interventions to lower the rate of that happening.
For example, I do all my research reading on my computer. Often, I copy/paste quotes from papers and PDFs into a file. What if there were a copy/paste feature that would also preserve the URL (from a browser) or the filename (from a downloaded PDF), and paste it in along with the text?
Alternatively, what if document editors had a feature where, when you copy/pasted text into them, they popped up an option to conveniently note the source or auto-format it as a quotation?
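As a rough sketch of what either feature might paste into a notes file, here is one possible format that wraps the copied text in quotation marks and attaches its source. Everything here is hypothetical: `SourcedQuote`, `format_for_notes`, the `QUOTE:`/`SOURCE:` labels, and the sample text, filename, and date are placeholders, and actually capturing the clipboard contents and the browser URL or PDF filename is left out.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class SourcedQuote:
    text: str        # the copied passage, verbatim
    source: str      # URL or filename the passage came from
    copied_on: date  # when it was copied

def format_for_notes(quote: SourcedQuote) -> str:
    """Render a copied passage so it cannot be mistaken for the note-taker's own words."""
    body = " ".join(quote.text.split())  # collapse PDF line breaks and stray whitespace
    return f'QUOTE: "{body}"\nSOURCE: {quote.source} (copied {quote.copied_on.isoformat()})'

# Placeholder values, purely for illustration.
entry = format_for_notes(
    SourcedQuote(
        text="An example sentence copied\nfrom a paper.",
        source="example-paper.pdf",
        copied_on=date(2024, 1, 15),
    )
)
print(entry)
```

The point of the explicit `QUOTE:` label is that the marking survives even when the notes are later skimmed or rearranged, which is exactly the stage where the mix-up in the hypothesis above would occur.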
This is more speculative, but if we could digitize scientific papers and books and make them publicly available, students could use plagiarism detectors as a precaution to make sure they haven’t committed “mistaken plagiarism” prior to publication. Failing to do so would be clearly negligent.
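To make the precaution concrete, here is a toy version of such a check: it flags any run of five consecutive words that a draft shares with a source text. This is only a sketch under simple assumptions (exact word matches, a single source, invented sample sentences), and real plagiarism detectors are considerably more sophisticated. The names `shingles` and `shared_passages` are made up for illustration.

```python
import re

def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of n-word sequences in text, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_passages(draft: str, source: str, n: int = 5) -> list[str]:
    """List the n-word sequences that appear in both draft and source, sorted."""
    common = shingles(draft, n) & shingles(source, n)
    return sorted(" ".join(s) for s in common)

# Invented sentences: the draft is a lightly reworded version of the source.
source = "The results suggest that early intervention reduces the overall rate of error."
draft = "We argue that early intervention reduces the overall frequency of mistakes."

for passage in shared_passages(draft, source):
    print("possible unattributed overlap:", passage)
```

Even though the draft's wording was tweaked at the start and end, the untouched middle still matches, which is why lightly edited passages tend to remain detectable.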
True—“harm reduction” is a tactic that helps with negligence or mistake, and less so with true adversarial situations. It’s worth remembering that improvements are improvements, even if only for some subset of infractions.
I don’t particularly worry about plagiarism very often—I’m not writing formal papers, but most of my internal documents benefit from a references appendix (or inline) for where data came from. I’d enjoy a plugin that does “referenceable copy/paste”, which includes the URL (or document title, or, for some things, a biblio-formatted source).