How to write an academic paper, according to me
Disclaimer: this is entirely a personal viewpoint, formed by a few years of publication in a few academic fields. EDIT: Many of the comments are very worth reading as well.
Having recently finished a very rushed submission (turns out you can write a novel paper in a day and a half, if you’re willing to sacrifice quality and sanity), I’ve been thinking about how academic papers are structured and, more importantly, how they should be structured.
It seems to me that the key is to consider the audience. Or, more precisely, to consider the audiences, because different people will read your paper to different depths, and you should cater to all of them. An example of this is the “inverted pyramid” structure used for many news articles: start with the salient facts, then the most important details, then fill in the other details. The idea is to ensure that a reader who stops reading at any point (which happens often) will nevertheless have got the most complete impression that it was possible to convey in the bit that they did read.
So, with that model in mind, let’s consider the different levels of audience for a general academic paper (of course, some papers just can’t fit into this mould, but many can):
Title readers
The least important audience. An interesting title may draw casual browsers in, but those likely aren’t very valuable readers. Most people encountering an academic article will either be looking for it, or will have had it referred to them from some source. They will likely read more of it. So the main role of the title is to not put off these readers, and to clarify what the paper is about, and what field it belongs in. Witty titles are perfectly acceptable, as long as they fulfil those criteria. So in-jokes for the whole academic field are perfectly acceptable; in-jokes for a narrow subfield are not, unless you’re not aiming beyond that subfield.
Abstract readers
The most important audience of all. Most people reading a paper will only read the abstract, and will then proceed to dismiss the paper or accept it and move on. The abstract thus plays three roles:
It presents the paper’s results. The abstract must be crystal-clear on what the paper says; abstract readers must be able to describe the results correctly.
It establishes the credibility of the result. It can do this by briefly outlining the methods used, and by its general tone. It must thus be serious, and use the correct vocabulary for the field. No room for impressive rhetoric here: dry and descriptive is the model of the abstract.
It can draw the reader into looking into the paper proper. Because of the first two points, it cannot achieve this by teasers or rhetoric. Instead it must present strong results that cause the reader to want to read more.
Skimmers
This audience will skim through the paper to see what it says. Most crucial for them are the introduction and, depending on the field, possibly the conclusion or discussion section. These must tell the skimmers everything there is to know about the paper: what the problem is, what the results are, what methods were used, why these results are valid, why they are important. As long as all these points are covered, rhetoric and wit can be used to make the reading more enjoyable and salient. But be careful to use these in moderation, lest you give the impression that the paper’s results depend on rhetorical tricks. Rhetoric is the flavouring; giving out the information above is the main goal.
Full readers
These are those readers who will go through the whole paper, though they may skim some parts along the way. The important thing here is to get the structure absolutely clear—it must be easy for them to see what the crucial steps or arguments are, what implies what, what relies on what. To do this, lay out the structure of the argument and of the paper clearly in the introduction or in the second section. Emphasise the important results through the paper (consider the layout for this, it can often be used to draw attention to the main points), and connect them together (“combining this with the results of section 2.3x.iii...”). Some rhetoric can be used around these important results, especially if it emphasises their importance.
Deep readers
These are your greatest fans or your most hated critics. They will go through the whole paper, taking your argument apart to understand it completely and figure out how it ticks. No fancy rhetoric for them, just careful attention to detail, clarity, and rigour. In mathematical terms, these are the people who will be reading the proofs of your minor lemmas. Don’t waste space with anything that doesn’t help you establish your argument or your results. These are the lawyers among your readers, looking for the tiniest of flaws. Don’t give them any of these, and don’t try to hide them with weak arguments.
Writing the paper
The different audiences above give a structure to the paper, but they can also give a structure to the writing process. Looking back, I realise that I start by writing for the full readers, getting the important points and structure correct. Then I fill in the details for the deep readers. I then write the introduction (and conclusion, if appropriate) for the skimmers, and conclude with the abstract for the most important audience. The title can be chosen at any point in this process.
Hope this helps! I think I’ve been following this advice implicitly for a long time, and it’s got me a few publications. Feel free to ignore it, of course, or to post your own preferred approach.
Excellent advice, both in the post and in the comments. I only wanted to add that at least some readers (that I guess belong somewhere in between the skimmer and full reader categories) read the figure captions (and look at the figures, obviously) besides reading introduction and/or conclusions, as a way to see directly, but rapidly, the main results of the paper and how they are demonstrated. This obviously depends on the field, and I can only know for sure that it happens in my own field(s), stochastic processes/modelling of biological processes/other related fields.
I personally also do it for biology papers, because I do not trust the conclusions, but I’m not sure biologists do this.
I was also a bit surprised by Stuart’s lack of emphasis on figures. Having worked in 2 biology labs, I think most of the people I know who read or write a lot of papers agree that the figures are the most important thing to “read” first and the first thing to “write”. When you have lots of data in a table (or ten), that is where the truth is, but it will tend to be very hard to interpret without scatterplots, error bars, tree diagrams, color coding, maps, and suchlike things.
One of the interesting things about the “figure first” advice is that an author (here I agree with Stuart) should write the first draft of a text starting with the details and building to the summary, but this is the opposite of the order in which an efficient reader should approach the same text. But next to the text are the figures, and here the order in which they are approached is probably the same. Look at them first, construct them first.
Maybe, the abstract is more important for online paywall considerations, like if it is all that many readers can get, and the abstract has to communicate that they should work to find the paper somewhere else? But if I’m reading a paper copy of Science or Nature then I go to the abstracts after the figures, personally. And even online, when I wanted to know whether the natural reservoir of Ebola had been found, the figure was the key thing and I found it via image search.
Now that I think of it… in my last startup, one of the founders would sometimes post to a blog for marketing purposes, and he made sure every single post had an image, because he had discovered by looking at the analytics that image searches that match “alt text” can pull in organic eyeballs like crazy.
Interesting. It really seems to be a field thing: neither the maths nor the philosophy I did were much into figures.
I think our field of philosophy, and that of xrisk, could very much benefit from more/better figures, but this might be the biologist in me speaking. Look at how often Nick Bostrom’s (really quite simplistic) xrisk “scope versus intensity” graph is used/reproduced.
Several of my favourite mathematics papers have excellent diagrams.
And some of my best friends use diagrams… but...
The standard formula you are typically taught in science is IMRaD: Introduction, Methods, Results, and Discussion. This of course mainly works for papers that are experimental, but I have always found it a useful zeroth iteration for structure when writing reviews and more philosophical papers: (1) explain what it is about, why it is important, and what others have done. (2) explain how the problem is or can be studied/solved. (3) explain what this tells us. (4) explain what this new knowledge means in the large, the limitations of what we have done and learned, as well as where we ought to go next.
Experienced academics also scan the reference section to see who is cited. This is a surface level analysis of whether the author has done their homework, and where in the literature the paper is situated. It is a crude trick, but fairly effective in saving time. It also leads to a whole host of biases, of course.
Different disciplines work in different ways. In medicine everybody loves to overcite (“The brain [1] is an organ commonly found in the head [2,3], believed to be important for cognition [4-18,23].”) Computer science is lighter on citations and more forgiving of self-cites (the typical paper cites Babbage/Turing, a competing algorithm, and two tech reports and a conference poster by the author about the earlier version of the algorithm). Philosophy tends to either be very low on citations (when dealing with ideas), or have nitpicky page and paragraph citations (when dealing with what someone really argued).
The OP wrote:
Ugh! The vomitous mass of facts and details. I can’t stand articles like that. A little quote starts ringing through my mind “When you talk like this, I can’t help but wonder, do you have a point?”
This is closer to what I would advise.
Start with motivating the reader by identifying a known problem and your contribution to the solution for it. Let him know what’s in the pot of gold at the end of the rainbow, so that he might want to get there.
Up front, tell him the payoff of reading the paper. Then he might be motivated to continue reading.
Then describe the path you’ll be taking him, so that he can track the progress to that pot of gold.
The path should include a formulation of problem, a description of current approaches, a description of your own approach, a comparison of the basic approaches of each, a comparison of the performance of each, and a summary of what was found in the pot of gold and how we found it.
The history of the problem and its solutions is something you might add in a longer paper.
I can’t stand articles that leave me wondering where they’re going and why. It goes beyond motivating with a payoff to simply being able to follow what is being presented. If I don’t know where we’re going and why, it’s very hard for me to follow and evaluate the paper. If you’re not going to give me a map, at least identify a purpose.
Done properly, it’s like watching an interlaced image load in. First pass, tell the story in one sentence. Second pass, use a four sentence paragraph. Third pass, four paragraphs. Recurse as needed.
Order still matters on each pass.
Lawyers write memos to other lawyers and the clients to tell them the answer to questions. The most common format is:
Question presented—one sentence, maybe two
Short answer—Ideally yes or no, but usually a couple sentences.
Facts—A description of all of the background behind the question
Analysis or discussion—The reasoning to get from the question, contextualized by the facts, to the answer
Conclusion—A plea for more billable hours. Ahem. I mean a statement of your level of certainty with respect to your answer, and avenues of research that would lend more certainty.
I’ve heard of this format as being called IRAC—issue, rule, analysis, conclusion (where the facts get thrown into the analysis).
Within the analysis, there are many ways of organizing the material, many of which I think are partially redundant of the format of the memo. Here’s an example: http://www.law.cuny.edu/legal-writing/students/memorandum/memorandum-1.html.
Max L.
Reminds me of How to Get a Paper Accepted at OOPSLA by Kent Beck
Excerpt:
I notice that some commenters are presenting the sections “Introduction, Methods, Results, Discussion” as the general structure of an academic paper. This is indeed the structure I have encountered most often, and it is a very good structure for academic writing, but whenever I find a paper with this exact structure I wonder “Where is the conclusion?”. As Stuart mentions, most academics read the title and abstract and then quit, but in papers with no Conclusion section I can’t help but sympathise with the readers for this behaviour! After reading the abstract the reader might want a bit more clarity and information about the exact claims made by the authors of the paper, and if it turns out that you have to work your way through the whole Discussion section or the raw data of the Results section just to get more clarity than that one line in the abstract, then I consider that to be a good moment to stop reading the paper. As far as I know it’s not common practice to have a separate section summarising the interpreted results, but I personally enjoy reading papers with a Conclusions section far more than those without. Why not make life easy for the mildly-interested reader?
Most academics write for other academics in their own fields, so the conventions of the field matter. For instance, in my mathematics, I almost never saw a conclusion or a discussion. Math papers tend to peter out, with the minor lemmas coming at the end, or maybe some “suggestions for further research”. The important bits in the main text were always introduction and main result (generally to be found in section 3, following the format: intro-definitions-main result-supporting lemma).
Yes, as I mentioned, most fields (as far as I know) do not have a separate section for the conclusions, and in a mathematical paper (with proper layout and proper section numbering/naming) such a section would indeed not be all that useful. But in the experimental and theoretical physics papers, as well as the biology papers and some papers in medicine, the Results section is full of (raw) experimental data and/or calculations, and the Discussion section contains several pages about possible improvements to the presented model/setup and sometimes the strengths of the used method over previous attempts. The important conclusions are hidden somewhere amongst this multi-page defense of the authors’ approach to the problem, which isn’t optimal. My teacher used to say: “If your audience didn’t remember your main point, then your presentation has failed.” In most experimental fields a short summary of the most remarkable conclusions would be helpful to remind readers of the implications of your research, and often I have found this section to be missing (not just absent but also desired).
But I stress that this is just my personal experience, and even if changing the layout improves readability it might be better (career-wise) to stick to the conventions of your field.
I think this viewpoint is correct for reviewers, but not necessarily citers.
As a citer, my attention is distributed in this order: Abstract, Figures, Results & Methods, THEN Introduction & Conclusions & Discussion. In my view, everything other than “this is what I found when I did this” is extra information.
I don’t care nearly as much why you went doing that, nor do I care what you think the results mean, unless I’m actually stumped for explanations. Typically that sort of information is either implicit or in the abstract anyway.
I think this is because reviewers want to know what point you are making, whereas people looking to cite stuff are typically trying to support a point rather than understand a point.
Again, this may be field-dependent: in mathematics, reading the paper without reading the intro first is a world of hardness.
The title may be a little more important than you think. This is a minor tidbit from a friend who works for a company analysing citations and public interest in science (so no good citation to back this up).
A question mark “?” in the title correlates with approximately 10% fewer mentions on Twitter and slightly fewer citations; a colon “:” with approximately 10% more.
Interesting. Of course, the confounders are potentially huge—titles with ? are probably weaker results.
Very true, they’re also likely less attention grabbing.
Interesting. I would have expected that to be the other way round, since in my experience colons are more common in very lengthy or jargon-laden titles, and pithy ones often have question marks.
I’m not sure whether that’s true. Quite often when doing literature searches I end up with more papers than I have time to read. Going through the citation list of a paper often gives you quickly more papers than you can look at.
I have seen at least one math paper where the title was suggestive of a more general result than actually delivered in the paper. I wish the title of the paper was given as much thought as the abstract. In the case I’m thinking of, a well placed ‘some’ or ‘certain’ in the title would have fixed it.
When reading social science/economics papers I always make sure to understand the details of the method used and the exact definitions the authors are using. Also important to check are the magnitude of any effect found and the sample size (though this should be in the abstract). I have found that too many times the abstracts are extremely misleading. The author’s choice of metrics matters. And many common words have no obvious precise definition (examples: “inequality”, “economic growth”). In many cases I still skip a lot of the paper, but after seeing so many social science authors use extremely misleading definitions/methods I am afraid of spreading misinformation to myself or others.
I personally wish authors made it super easy to find exactly what they did and made the exact definitions they are using instantly visible. So I would recommend people do this in their own writing. This ordering:
(Introduction, Methods, Results, Discussion)
is great if the methods and results sections are clearly labeled and well written. But sadly many papers do not follow this model very closely :(
This is excellent. I’ve had some vague ideas along these lines, but nothing this comprehensive and precise. Very helpful.
In a sense, the paper consists of three parts—title, abstract, and text—whereas there are five types of readers, according to your classificatory schema (though how to delineate these types of course is a bit arbitrary). One question is whether one should have even more layers, to clarify exactly what a skimmer and full reader should read. (This does exist to some extent—e.g. footnotes and appendices presumably are not for skimmers—but one could develop this further.) For instance, each section of the text could start off with a “mini-abstract” which the skimmers could focus on.
I get the sense that today’s article formats are intended to satisfy deep readers (aside from the title and abstract readers) and that more could be done to help, e.g., skimmers. This is just a hunch, though, and I’d be interested in hearing whether people agree with this.
In some journals there is a text box with up to four take home message sentences summarizing what the paper gives us. It is even easier to skim than the abstract, and typically stated in easy (for the discipline) language. I quite like it, although one should recognize that many papers have official conclusions that are a bit at variance with the actual content (or just a biased glass half-full/half-empty interpretation).
I would agree. I think a strong Title and Abstract are important for research purposes. I was able to do more effective research in grad school with those things and I worked to make my papers the same way. In this age of search engine indexing your paper is more likely to be found if those things are strong. I think picking good keywords for those as well is a good idea so the work gets read.
When I skim an empirical paper (typically in psychology), I look at the abstract, then the figures (graphs & tables) to see the study design & results, then the methods section to see what the researchers actually did, and maybe also the results section to clear up lingering questions.
All of the main results of the paper should appear in a graph or table, which should be able to clearly convey the experimental design, the pattern of results (including effect sizes & statistical significance), and the sort of statistical analysis that was done. The figures are basically a souped-up version of the abstract, and should be able to stand on their own to convey the study (or at least when supported by the abstract and their captions).
The methods section should make it possible to replace all of the abstract labels with concrete descriptions, e.g. “people who had this sentence included in their instructions agreed more with these statements.” (Sometimes the good stuff is in an appendix.) I want to be able to picture what the study involved from the point of view of the research subjects. This helps a lot with assessing the plausibility of the results, seeing possible alternative explanations, and with getting a sense of how much to generalize from these studies.
The results section is the place to look to get more details on analyses that were too complicated to be clearly conveyed in the figures and to check on whether their statistical analyses are kosher. How exactly did they get those “composite scores” that they have in the table? Do the results still hold if they control for this variable? Did they run this additional analysis which could help rule out that alternative explanation? Etc. (Sometimes the good stuff is in a footnote.)