How to write an academic paper, according to me

post by Stuart_Armstrong · 2014-10-15T12:29:02.681Z · LW · GW · Legacy · 27 comments

Contents

  Title readers
  Abstract readers
  Skimmers
  Full readers
  Deep readers
  Writing the paper
27 comments

Disclaimer: this is entirely a personal viewpoint, formed by a few years of publication in a few academic fields. EDIT: Many of the comments are very worth reading as well.

Having recently finished a very rushed submission (turns out you can write a novel paper in a day and a half, if you're willing to sacrifice quality and sanity), I've been thinking about how academic papers are structured - and more importantly, how they should be structured.

It seems to me that the key is to consider the audience. Or, more precisely, to consider the audiences - because different people will read your paper to different depths, and you should cater to all of them. An example of this is the "inverted pyramid" structure for many news articles - start with the salient facts, then the most important details, then fill in the other details. The idea is to ensure that a reader who stops reading at any point (which happens often) will nevertheless have got the most complete impression that it was possible to convey in the bit that they did read.

So, with that model in mind, let's consider the different levels of audience for a general academic paper (of course, some papers just can't fit into this mould, but many can):

Title readers

The least important audience. An interesting title may draw casual browsers in, but those likely aren't very valuable readers. Most people encountering an academic article will either be looking for it, or will have had it referred to them from some source. They will likely read more of it. So the main role of the title is to not put off these readers, and to clarify what the paper is about, and what field it belongs in. Witty titles are perfectly acceptable, as long as they fulfil those criteria. So in-jokes for the whole academic field are perfectly acceptable, in-jokes for a narrow subfield are not - unless you're not aiming beyond that subfield.

Abstract readers

The most important audience of all. Most people reading a paper will only read the abstract, and will then proceed to dismiss the paper or accept it and move on. The abstract thus plays three roles:

  1. It presents the paper's results. The abstract must be crystal-clear on what the paper says; abstract readers must be able to describe the results correctly.
  2. It establishes the credibility of the result. It can do this by briefly outlining the methods used, and by its general tone. It must thus be serious, and use the correct vocabulary for the field. No room for impressive rhetoric here - dry and descriptive is the model of the abstract.
  3. It can draw the reader into looking into the paper proper. Because of the first two points, it cannot achieve this by teasers or rhetoric. Instead it must present strong results that cause the reader to want to read more.

Skimmers

This audience will skim through the paper to see what it says. Most crucial for them are the introduction and, depending on the field, possibly the conclusion or discussion section. These must tell the skimmers everything there is to know about the paper - what the problem is, what the results are, what methods were used, why these results are valid, why they are important. As long as all these points are covered, rhetoric and wit can be used, in moderation, to make the reading more enjoyable and salient. But be careful to use these in moderation, lest you give the impression that the paper's results depend on rhetorical tricks. Rhetoric is the flavouring; giving out the information above is the main goal.

Full readers

These are those readers who will go through the whole paper, though they may skim some parts along the way. The important thing here is to get the structure absolutely clear - it must be easy for them to see what the crucial steps or arguments are, what implies what, what relies on what. To do this, lay out the structure of the argument and of the paper clearly in the introduction or in the second section. Emphasise the important results throughout the paper (consider the layout for this, it can often be used to draw attention to the main points), and connect them together ("combining this with the results of section 2.3x.iii..."). Some rhetoric can be used around these important results, especially if it emphasises their importance.

Deep readers

These are your greatest fans or your most hated critics. They will go through the whole paper, taking your argument apart to understand it completely and figure out how it ticks. No fancy rhetoric for them, just careful attention to detail, clarity, and rigour. In mathematical terms, these are the people who will be reading the proofs of your minor lemmas. Don't waste space with anything that doesn't help you establish your argument or your results. These are the lawyers among your readers, looking for the tiniest of flaws. Don't give them any of these, and don't try to hide them with weak arguments.

Writing the paper

The different audiences above give a structure to the paper, but they can also give a structure to the writing process. Looking back, I realise that I start by writing for the full readers, getting the important points and structure correct. Then I fill in the details for the deep readers. I then write the introduction (and conclusion, if appropriate) for the skimmers, and conclude with the abstract for the most important audience. The title can be chosen at any point in this process.

Hope this helps! I think I've been following this advice implicitly for a long time, and it's got me a few publications. Feel free to ignore it, of course, or to post your own preferred approach.

27 comments

Comments sorted by top scores.

comment by tegid · 2014-10-16T06:35:43.957Z · LW(p) · GW(p)

Excellent advice, both in the post and in the comments. I only wanted to add that at least some readers (that I guess belong somewhere in between the skimmer and full reader categories) read the figure captions (and look at the figures, obviously) besides reading introduction and/or conclusions, as a way to see directly, but rapidly, the main results of the paper and how they are demonstrated. This obviously depends on the field, and I can only know for sure that it happens in my own field(s), stochastic processes/modelling of biological processes/other related fields.

I personally also do it for biology papers, because I do not trust the conclusions, but I'm not sure biologists do this.

Replies from: JenniferRM, Stuart_Armstrong
comment by JenniferRM · 2014-10-16T07:14:55.929Z · LW(p) · GW(p)

I was also a bit surprised by Stuart's lack of emphasis on figures. Having worked in 2 biology labs, I think most of the people I know who read or write a lot of papers agree that the figures are the most important thing to "read" first and the first thing to "write". When you have lots of data in a table (or ten), that is where the truth is, but it will tend to be very hard to interpret without scatterplots, error bars, tree diagrams, color coding, maps, and suchlike things.

One of the interesting things about the "figure first" advice is that an author (here I agree with Stuart) should write the first draft of a text starting with the details and building to the summary, but this is the opposite of the order in which an efficient reader should approach the same text. But next to the text are the figures, and here the order in which they are approached is probably the same. Look at them first, construct them first.

Maybe the abstract is more important for online paywall considerations, like if it is all that many readers can get, and the abstract has to communicate that they should work to find the paper somewhere else? But if I'm reading a paper copy of Science or Nature then I go to the abstracts after the figures, personally. And even online, when I wanted to know whether the natural reservoir of Ebola had been found, the figure was the key thing and I found it via image search.

Now that I think of it... in my last startup, one of the founders would sometimes post to a blog for marketing purposes, and he made sure every single post had an image, because he had discovered by looking at the analytics that image searches that match "alt text" can pull in organic eyeballs like crazy.

comment by Stuart_Armstrong · 2014-10-16T08:44:41.015Z · LW(p) · GW(p)

Interesting. It really seems to be a field thing - neither the maths nor the philosophy I did were much into figures.

Replies from: Sean_o_h, owencb
comment by Sean_o_h · 2014-10-16T10:11:08.968Z · LW(p) · GW(p)

I think our field of philosophy, and that of xrisk, could very much benefit from more/better figures, but this might be the biologist in me speaking. Look at how often Nick Bostrom's (really quite simplistic) xrisk "scope versus intensity" graph is used/reproduced.

comment by owencb · 2014-10-17T11:20:53.501Z · LW(p) · GW(p)

Several of my favourite mathematics papers have excellent diagrams.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2014-10-17T13:18:58.142Z · LW(p) · GW(p)

And some of my best friends use diagrams... but...

comment by Arenamontanus · 2014-10-15T13:02:21.254Z · LW(p) · GW(p)

The standard formula you are typically taught in science is IMRaD: Introduction, Methods, Results, and Discussion. This of course mainly works for papers that are experimental, but I have always found it a useful zeroth iteration for structure when writing reviews and more philosophical papers: (1) explain what it is about, why it is important, and what others have done. (2) explain how the problem is or can be studied/solved. (3) explain what this tells us. (4) explain what this new knowledge means in the large, the limitations of what we have done and learned, as well as where we ought to go next.

Experienced academics also scan the reference section to see who is cited. This is a surface level analysis of whether the author has done their homework, and where in the literature the paper is situated. It is a crude trick, but fairly effective in saving time. It also leads to a whole host of biases, of course.

Different disciplines work in different ways. In medicine everybody loves to overcite ("The brain [1] is an organ commonly found in the head [2,3], believed to be important for cognition [4-18,23].") Computer science is lighter on citations and more forgiving of self-cites (the typical paper cites Babbage/Turing, a competing algorithm, and two tech reports and a conference poster by the author about the earlier version of the algorithm). Philosophy tends to either be very low on citations (when dealing with ideas), or have nitpicky page and paragraph citations (when dealing with what someone really argued).

Replies from: buybuydandavis
comment by buybuydandavis · 2014-10-16T10:07:55.733Z · LW(p) · GW(p)

The OP wrote:

An example of this is the "inverted pyramid" structure for many news articles - start with the salient facts, then the most important details, then fill in the other details.

Ugh! The vomitous mass of facts and details. I can't stand articles like that. A little quote starts ringing through my mind "When you talk like this, I can't help but wonder, do you have a point?"

(1) explain what it is about, why it is important, and what others have done.

This is closer to what I would advise.

Start with motivating the reader by identifying a known problem and your contribution to the solution for it. Let him know what's in the pot of gold at the end of the rainbow, so that he might want to get there.

Up front, tell him the payoff of reading the paper. Then he might be motivated to continue reading.

Then describe the path you'll be taking him, so that he can track the progress to that pot of gold.

The path should include a formulation of the problem, a description of current approaches, a description of your own approach, a comparison of the basic approaches of each, a comparison of the performance of each, and a summary of what was found in the pot of gold and how we found it.

The history of the problem and its solutions is something you might add in a longer paper.

I can't stand articles that leave me wondering where they're going and why. It goes beyond motivating with a payoff to simply being able to follow what is being presented. If I don't know where we're going and why, it's very hard for me to follow and evaluate the paper. If you're not going to give me a map, at least identify a purpose.

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2014-10-16T17:32:51.683Z · LW(p) · GW(p)

Ugh! The vomitous mass of facts and details. I can't stand articles like that. A little quote starts ringing through my mind "When you talk like this, I can't help but wonder, do you have a point?"

Done properly, it's like watching an interlaced image load in. First pass, tell the story in one sentence. Second pass, use a four-sentence paragraph. Third pass, four paragraphs. Recurse as needed.

Order still matters on each pass.

comment by MaximumLiberty · 2014-10-15T19:06:31.940Z · LW(p) · GW(p)

Lawyers write memos to other lawyers and the clients to tell them the answer to questions. The most common format is:

  • Question presented -- one sentence, maybe two

  • Short answer -- Ideally yes or no, but usually a couple sentences.

  • Facts -- A description of all of the background behind the question

  • Analysis or discussion -- The reasoning to get from the question, contextualized by the facts, to the answer

  • Conclusion -- A plea for more billable hours. Ahem. I mean a statement of your level of certainty with respect to your answer, and avenues of research that would lend more certainty.

I've heard of this format as being called IRAC -- issue, rule, analysis, conclusion (where the facts get thrown into the analysis).

Within the analysis, there are many ways of organizing the material, many of which I think are partially redundant with the format of the memo. Here's an example: http://www.law.cuny.edu/legal-writing/students/memorandum/memorandum-1.html.

Max L.

comment by Gunnar_Zarncke · 2014-10-15T15:32:23.455Z · LW(p) · GW(p)

Reminds me of How to Get a Paper Accepted at OOPSLA by Kent Beck

Excerpt:

One startling sentence. [You] need to find the one thing you want to say that will catch their interest. [...] Find the most interesting thing you have done and write it down, [...] You want the reader's eyes to open wide when they realize what it is you've just said. I think some people are reluctant to boil their message down to one startling sentence because it opens them up to concrete criticism. [...] You can be proven wrong. Wait! You spent five years proving it was easy. Make your case.

Divide your paper into four sections. The first describes the problem to be solved. When the [committee] member is done reading it, they should understand why it is a problem, and believe that it is important to solve. The second section describes your solution. You are convincing the [committee] member that your solution really could solve the problem. This section is sometimes supplemented with a section between the defence and related work which describes implementation details. The third section is your defence of why your solution really solves the problem. The [committee] member reading it should be convinced that the problem is actually solved, and that you have thought of all reasonable counter arguments. The final section describes what other people have done in the area. Upon reading this section, the [committee] member should be convinced that what you have done is novel.

I try to have four sentences in my abstract. The first states the problem. The second states why the problem is a problem. The third is my startling sentence. The fourth states the implication of my startling sentence.

comment by TheMajor · 2014-10-15T18:01:45.540Z · LW(p) · GW(p)

I notice that some commenters are presenting the sections "Introduction, Methods, Results, Discussion" as the general structure of an academic paper. This is indeed the structure I have encountered most often, and it is a very good structure for academic writing, but whenever I find a paper with this exact structure I wonder "Where is the conclusion?". As Stuart mentions, most academics read the title and abstract and then quit, but in papers with no Conclusion section I can't help but sympathise with the readers for this behaviour! After reading the abstract the reader might want a bit more clarity and information about the exact claims made by the authors of the paper, and if it turns out that you have to work your way through the whole Discussion section or the raw data of the Results section just to get more clarity than that one line in the abstract, then I consider that to be a good moment to stop reading the paper. As far as I know it's not common practice to have a separate section summarising the interpreted results, but I personally enjoy reading papers with a Conclusions section far more than those without. Why not make life easy for the mildly-interested reader?

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2014-10-15T18:38:35.283Z · LW(p) · GW(p)

Most academics write for other academics in their own fields, so the conventions of the field matter. For instance, in my mathematics, I almost never saw a conclusion or a discussion. Math papers tend to peter out, with the minor lemmas coming at the end, or maybe some "suggestions for further research". The important bits in the main text were always introduction and main result (generally to be found in section 3, following the format: intro-definitions-main result-supporting lemma).

Replies from: TheMajor
comment by TheMajor · 2014-10-15T21:17:42.056Z · LW(p) · GW(p)

Yes, as I mentioned, most fields (as far as I know) do not have a separate section for the conclusions, and in a mathematical paper (with proper layout and proper section numbering/naming) such a section would indeed not be all that useful. But in the experimental and theoretical physics papers, as well as the biology papers and some papers in medicine, the Results section is full of (raw) experimental data and/or calculations, and the Discussion section contains several pages about possible improvements to the presented model/setup and sometimes the strengths of the method used over previous attempts. The important conclusions are hidden somewhere amongst this multi-page defense of the authors' approach to the problem, which isn't optimal. My teacher used to say: "If your audience didn't remember your main point, then your presentation has failed." In most experimental fields a short summary of the most remarkable conclusions would be helpful to remind readers of the implications of your research, and often I have found this section to be missing (not just absent but also desired).

But I stress that this is just my personal experience, and even if changing the layout improves readability it might be better (career-wise) to stick to the conventions of your field.

comment by someonewrongonthenet · 2014-10-18T00:05:59.765Z · LW(p) · GW(p)

I think this viewpoint is correct for reviewers, but not necessarily citers.

As a citer, my attention is distributed in this order: Abstract, Figures, Results & Methods, THEN Introduction & Conclusions & Discussion. In my view, everything other than "this is what I found when I did this" is extra information.

I don't care nearly as much why you went doing that, nor do I care what you think the results mean, unless I'm actually stumped for explanations. Typically that sort of information is either implicit or in the abstract anyway.

I think this is because reviewers want to know what point you are making, whereas people looking to cite stuff are typically trying to support a point rather than understand a point.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2014-10-20T09:09:35.003Z · LW(p) · GW(p)

Again, this may be field-dependent - in mathematics, reading the paper without reading the intro first is a world of hardness.

comment by HungryHobo · 2014-10-17T13:55:07.314Z · LW(p) · GW(p)

The title may be a little more important than you think: a minor tidbit from a friend who works for a company analysing citations and public interest in science (so no good citation to back this up).

A question mark "?" in the title correlates with approximately 10% fewer mentions on Twitter and slightly fewer citations; a colon ":" with approximately 10% more.

Replies from: Stuart_Armstrong, Vulture
comment by Stuart_Armstrong · 2014-10-17T15:35:39.523Z · LW(p) · GW(p)

Interesting. Of course, the confounders are potentially huge - titles with ? are probably weaker results.

Replies from: HungryHobo
comment by HungryHobo · 2014-10-24T13:26:37.048Z · LW(p) · GW(p)

Very true, they're also likely less attention grabbing.

comment by Vulture · 2014-10-17T15:43:42.836Z · LW(p) · GW(p)

Interesting. I would have expected that to be the other way round, since in my experience colons are more common in very lengthy or jargon-laden titles, and pithy ones often have question marks.

comment by ChristianKl · 2014-10-16T10:51:27.101Z · LW(p) · GW(p)

The least important audience. An interesting title may draw casual browsers in, but those likely aren't very valuable readers. Most people encountering an academic article will either be looking for it, or will have had it referred to them from some source.

I'm not sure whether that's true. Quite often when doing literature searches I end up with more papers than I have time to read. Going through the citation list of a paper often gives you quickly more papers than you can look at.

Replies from: Dan_Moore
comment by Dan_Moore · 2014-10-16T13:36:28.342Z · LW(p) · GW(p)

I have seen at least one math paper where the title was suggestive of a more general result than actually delivered in the paper. I wish the title of the paper was given as much thought as the abstract. In the case I'm thinking of, a well placed 'some' or 'certain' in the title would have fixed it.

comment by Princess_Stargirl · 2014-10-15T16:39:23.691Z · LW(p) · GW(p)

When reading social science/economics papers I always make sure to understand the details of the method used and the exact definitions the authors are using. Also important to check are the magnitude of any effect found and the sample size (though this should be in the abstract). I have found that too many times the abstracts are extremely misleading. The author's choice of metrics matters. And many common words have no obvious precise definition (examples: "inequality", "economic growth"). In many cases I still skip a lot of the paper, but after seeing so many social science authors use extremely misleading definitions/methods I am afraid of spreading misinformation to myself or others.

I personally wish authors made it super easy to find exactly what they did and made the exact definitions they are using instantly visible. So I would recommend people do this in their own writing. This ordering:

Introduction, Methods, Results, Discussion

Is great if the methods and results sections are clearly labeled and well written. But sadly many papers do not follow this model very closely :(

comment by Stefan_Schubert · 2014-10-15T13:27:08.496Z · LW(p) · GW(p)

This is excellent. I've had some vague ideas along these lines, but nothing this comprehensive and precise. Very helpful.

In a sense, the paper consists of three parts - title, abstract, and text - whereas there are five types of readers, according to your classificatory schema (though how to delineate these types of course is a bit arbitrary). One question is whether one should have even more layers, to clarify exactly what a skimmer and full reader should read. (This does exist to some extent - e.g. footnotes and appendices presumably are not for skimmers - but one could develop this further.) For instance, each section of the text could start off with a "mini-abstract" which the skimmers could focus on.

I get the sense that today's article formats are intended to satisfy deep readers (aside from the title and abstract readers) and that more could be done to help, e.g., skimmers. This is just a hunch, though, and I'd be interested in hearing whether people agree with this.

Replies from: Arenamontanus
comment by Arenamontanus · 2014-10-15T14:57:48.066Z · LW(p) · GW(p)

In some journals there is a text box with up to four take-home message sentences summarizing what the paper gives us. It is even easier to skim than the abstract, and typically stated in easy (for the discipline) language. I quite like it, although one should recognize that many papers have official conclusions that are a bit at variance with the actual content (or just a biased glass half-full/half-empty interpretation).

comment by cameroncowan · 2014-10-17T23:59:50.517Z · LW(p) · GW(p)

I would agree. I think a strong Title and Abstract are important for research purposes. I was able to do more effective research in grad school with those things and I worked to make my papers the same way. In this age of search engine indexing your paper is more likely to be found if those things are strong. I think picking good keywords for those as well is a good idea so the work gets read.

comment by Unnamed · 2014-10-16T07:46:40.462Z · LW(p) · GW(p)

When I skim an empirical paper (typically in psychology), I look at the abstract, then the figures (graphs & tables) to see the study design & results, then the methods section to see what the researchers actually did, and maybe also the results section to clear up lingering questions.

All of the main results of the paper should appear in a graph or table, which should be able to clearly convey the experimental design, the pattern of results (including effect sizes & statistical significance), and the sort of statistical analysis that was done. The figures are basically a souped-up version of the abstract, which should be able to basically stand on their own to convey the study (or at least when supported by the abstract and their captions).

The methods section should make it possible to replace all of the abstract labels with concrete descriptions, e.g. "people who had this sentence included in their instructions agreed more with these statements." (Sometimes the good stuff is in an appendix.) I want to be able to picture what the study involved from the point of view of the research subjects. This helps a lot with assessing the plausibility of the results, seeing possible alternative explanations, and with getting a sense of how much to generalize from these studies.

The results section is the place to look to get more details on analyses that were too complicated to be clearly conveyed in the figures and to check on whether their statistical analyses are kosher. How exactly did they get those "composite scores" that they have in the table? Do the results still hold if they control for this variable? Did they run this additional analysis which could help rule out that alternative explanation? Etc. (Sometimes the good stuff is in a footnote.)