Figuring out that a paper contains fake research requires a lot of domain knowledge. For instance, I have read enough software engineering papers to spot fake research, but would have a lot of trouble spotting fake research in related fields, e.g., database systems. As for what counts as fake research, everybody has their own specific opinions.
My approach, based on experience reading very many software engineering papers, is to treat all papers as having low value (fake or otherwise) until proven otherwise.
Emailing the author asking for a copy of their data is always interesting; around a third don’t reply, and a third have lost/not kept the data.
Spotting fake research is a (very important) niche topic. A more generally useful proposal would be to teach people how to read papers. Reading one paper might almost be worse than reading none at all, because of the false feeling of knowing it gives the reader. I always tell people to read the thesis from which the paper was derived (if there is one); a thesis provides a lot more context and is a much easier read than a paper (which is a very condensed summary of the thesis). Researchers much prefer to have their paper cited, because thesis citations don’t ‘count’.
Is a Fake journal club worth the effort? It’s possible to spend more time debunking a paper than was spent doing the original research, and for nothing to happen.
Knowing if North Korea is going to do a hydrogen bomb test this year also requires a lot of domain knowledge, and one can invest arbitrary effort into obtaining new data like smuggling oneself into North Korea or interrogating defectors, and may in fact require knowledge it is impossible to obtain outside a particular skull in North Korea. Yet, calibration training still exists and will improve forecasts on both North Korea and on how many M&Ms are in that big jar over there.
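(As an aside on what calibration training actually measures, here is a minimal Python sketch; the forecasts and outcomes are invented for illustration, and the Brier score plus per-bucket hit rates are just one common way this kind of feedback is computed.)

```python
# Minimal illustrative sketch of calibration scoring (made-up forecasts/outcomes).
from collections import defaultdict

forecasts = [  # (stated probability, whether the event happened)
    (0.9, True), (0.7, True), (0.7, False), (0.3, False),
    (0.6, True), (0.2, False), (0.8, True), (0.4, True),
]

# Brier score: mean squared error between stated probabilities and outcomes
# (0 is perfect; always guessing 50% scores 0.25).
brier = sum((p - float(o)) ** 2 for p, o in forecasts) / len(forecasts)
print(f"Brier score: {brier:.3f}")

# Calibration check: within each probability bucket, how often did events occur?
buckets = defaultdict(list)
for p, o in forecasts:
    buckets[p].append(o)
for p in sorted(buckets):
    outcomes = buckets[p]
    print(f"forecast {p:.0%}: {sum(outcomes)}/{len(outcomes)} happened")
```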
This would definitely teach something, but I’m not sold that it actually teaches useful skills for detecting weaknesses in papers. Failures in research are drawn from their own special distribution, which is very different from the sampling-from-GPT distribution.
Part of this can be blamed on GPT-3 not being smart enough—it doesn’t understand (e.g.) magnetic permeability, and so in trying to write about it it will inevitably make mistakes that no human academic would. But then if our language model were smart enough to talk convincingly about magnetic permeability, the journal club members would still be stuck looking for tells that are going to be inhuman unless you’ve somehow assembled a dataset of bad papers to train a classifier on.
I think that doing this with real papers (that have failed to stand the test of time, but grad students probably won’t know that) is actually a lot better, because their mistakes are drawn from the distribution you actually need to learn. It also provides you with a richer supervised signal—you can learn not only that a paper was wrong, but also what process led to it having the contents it did, given that it didn’t reflect reality.
A database of such teaching examples, submitted by professors, would be interesting but would probably get very contentious.
You don’t learn it from journals; few journals embrace the Pottery Barn rule, and the process of getting a criticism published, much less a retraction, would put Kafka to shame.
I can’t figure out how the Pottery Barn rule is relevant to this sentence.
ie. you break the scientific literature (by publishing something so bad it should be retracted), you are responsible for fixing it (by investigating it, retracting it, and publicizing that).
Whereas the way journals actually work is ‘heads I win, tails everyone else loses’: they get all the prestige and fame (and extremely lucrative financial payments) from publishing influential research, while taking no responsibility whatsoever for publishing bad research (regardless of the harms, eg. Wakefield), not lifting a finger to investigate any claims or criticisms, and generally adopting an attitude that anything which has been “published after peer review” has therefore been handed down from Mount Sinai, so critics have to work like dogs to justify even getting a few paragraphs into the journal, buried where no one can read them and never mentioned anywhere else.
Ah, gotcha. Thank you.
This post is about journal papers, not answering real world questions (although many authors would claim this is what they are doing).
With regard to nuclear weapons, Dominic Cummings’ recent post is well worth a read; the book he recommends, “The Fallacies of Cold War Deterrence and a New Direction”, is even more worth reading.
Is MAD doctrine fake research, or just research that might well be very wrong?
It may also be worth splitting out “correct reasoning based on invalid assumptions” and “invalid reasoning based on valid assumptions”.