Abstracts should be either Actually Short™, or broken into paragraphs
It looks to me like academia figured out (correctly) that it’s useful for papers to have an abstract that makes it easy to tell-at-a-glance what a paper is about. They also figured out that an abstract should be about a paragraph. Then people goodharted on what “a paragraph” means, trying to cram too much information into one block of text. Papers typically have ginormous abstracts that should actually be broken into multiple paragraphs.
I think LessWrong posts should probably have more abstracts, but I want them to be nice easy-to-read abstracts, not worst-of-all-worlds-goodharted-paragraph abstracts. Either admit that you’ve written multiple paragraphs and break it up accordingly, or actually streamline it into one real paragraph.
Sorry to pick on the authors of this particular post, but my motivating example today was bumping into the abstract for Natural Abstractions: Key claims, Theorems, and Critiques. It’s a good post; its opening summary just happened to be written in an academic-ish style that exemplified the problem. It opens with:
TL;DR: John Wentworth’s Natural Abstraction agenda aims to understand and recover “natural” abstractions in realistic environments. This post summarizes and reviews the key claims of said agenda, its relationship to prior work, as well as its results to date. Our hope is to make it easier for newcomers to get up to speed on natural abstractions, as well as to spur a discussion about future research priorities. We start by summarizing basic intuitions behind the agenda, before relating it to prior work from a variety of fields. We then list key claims behind John Wentworth’s Natural Abstractions agenda, including the Natural Abstraction Hypothesis and his specific formulation of natural abstractions, which we dub redundant information abstractions. We also construct novel rigorous statements of and mathematical proofs for some of the key results in the redundant information abstraction line of work, and explain how those results fit into the agenda. Finally, we conclude by critiquing the agenda and progress to date. We note serious gaps in the theoretical framework, challenge its relevance to alignment, and critique John’s current research methodology.
That’s 179 words. They blur together, and I have a very hard time parsing them. If this were anything other than an abstract, I expect you’d naturally write it in about 3 paragraphs:
TL;DR: John Wentworth’s Natural Abstraction agenda aims to understand and recover “natural” abstractions in realistic environments. This post summarizes and reviews the key claims of said agenda, its relationship to prior work, as well as its results to date. Our hope is to make it easier for newcomers to get up to speed on natural abstractions, as well as to spur a discussion about future research priorities.
We start by summarizing basic intuitions behind the agenda, before relating it to prior work from a variety of fields. We then list key claims behind John Wentworth’s Natural Abstractions agenda, including the Natural Abstraction Hypothesis and his specific formulation of natural abstractions, which we dub redundant information abstractions. We also construct novel rigorous statements of and mathematical proofs for some of the key results in the redundant information abstraction line of work, and explain how those results fit into the agenda.
Finally, we conclude by critiquing the agenda and progress to date. We note serious gaps in the theoretical framework, challenge its relevance to alignment, and critique John’s current research methodology.
If I try to streamline this without losing info, it’s still hard to get it into anything less than 3 paragraphs (113 words):
We review John Wentworth’s Natural Abstraction agenda. We aim to help newcomers to get up to speed on natural abstractions, and to spur discussion about future research priorities.
We summarize the basic intuitions behind the agenda, and its key claims, including both “the natural abstraction hypothesis” and his specific formulation of natural abstractions, which we dub “redundant information abstractions.” We connect it to prior work, and also construct novel rigorous statements for some key results in the redundant information abstraction line of work.
Finally, we conclude by critiquing the agenda and progress to date. We note serious gaps in the theoretical framework, challenge its relevance to alignment, and critique John’s current research methodology.
If I’m letting myself throw out significant information, I can get it down to 69 words. I’m not thrilled with this as a paragraph, but my eyes don’t completely glaze over it.
We review John Wentworth’s Natural Abstraction agenda. We conceptualize it as having two major claims – the “universal abstraction hypothesis” and the “redundant information hypothesis.” We construct rigorous statements for some key results in the redundant information abstraction line of work, and explain how those results fit into the agenda. We conclude with a critique, noting serious gaps in the theoretical framework, challenging its relevance to alignment, and criticizing John’s current research methodology.
I think what I actually want in most cases is a very short abstract (1 long sentence or 3 short sentences), followed by a few paragraphs.
I do notice that once you start letting the abstract be multiple paragraphs, it ends up not that different from the introduction to the post.
For comparison:
Introduction
The Natural Abstraction Hypothesis (NAH) says that our universe abstracts well, in the sense that small high-level summaries of low-level systems exist, and that furthermore, these summaries are “natural”, in the sense that many different cognitive systems learn to use them. There are also additional claims about how these natural abstractions should be formalized. We thus split up the Natural Abstraction Hypothesis into two main components that are sometimes conflated:
The Universality Hypothesis: Natural abstractions exist, i.e. many cognitive systems learn similar abstractions.
The Redundant Information Hypothesis: Natural abstractions are well described mathematically as functions of redundant or conserved information.
Closely connected to the Natural Abstraction Hypothesis are several mathematical results as well as plans to apply natural abstractions to AI alignment. We’ll call all of these views together the natural abstractions agenda.
The natural abstractions agenda has been developed by John Wentworth over the last few years. The large number of posts on the subject, which often build on each other by each adding small pieces to the puzzle, can make it difficult to get a high-level overview of the key claims and results. Additionally, most of the mathematical definitions, theorems, and proofs are stated only informally, which makes it easy to mix up conjectures, proven claims, and conceptual intuitions if readers aren’t careful.
In this post, we
survey some existing related work, including in the academic literature,
summarize the key conceptual claims behind the natural abstractions agenda and break them down into specific subclaims,
formalize some of the key mathematical claims and provide formal proofs for them,
outline the high-level plan for how the natural abstractions agenda aims to help with AI alignment,
and critique the agenda by noting gaps in the theory, issues with the relation to alignment, and methodological criticisms.
All except the last of these sections are our attempt to describe John’s views, not our own. That said, we attempt to explain things in the way that makes the most sense to us, which may differ from how John would phrase them somewhat. And while John met with us to clarify his thinking, it’s still possible we’re simply misunderstanding some of his views. The final section discusses our own views: we note some of our agreements but focus on the places where we disagree or see a need for additional work.
In the remainder of this introduction, we provide some high-level intuitions and motivation, and then survey existing distillations and critiques of the natural abstractions agenda. Readers who are already quite familiar with natural abstractions may wish to skip directly to the next section.
Honestly I’m not sure the abstract really adds that much over this. This is 430 words. The original abstract is 179 words, about 42% as long. The parts of the abstract that nail down “literally what are all the things we included in this post” don’t really seem to add much that I wouldn’t get by skimming the bullet points in the intro. And it’s much easier to read in the intro. (I also bet you could streamline the intro somewhat, which would further reduce the benefit of having an abstract in the first place.)
Rather than copying academic abstract style, I’d rather people basically write good introductions, where the first paragraph helps you make a decision about whether to read the rest of intro, and the rest of the intro helps you decide whether to read the rest of the piece.
In this case, I’d maybe just replace the abstract with:
We review John Wentworth’s Natural Abstraction agenda, summarizing its key claims and critiquing its relevance to alignment. We aim to help newcomers get up to speed on natural abstractions, offer criticism, and spur discussion about future research priorities.
and then jump into the introduction, which covers the rest of the information.
I agree that formatting abstracts as single paragraph blocks is surprisingly bad for comprehension; I think it is because abstracts are deceptively difficult for the reader, as they tend to invoke a lot of extremely novel & unusual keywords/concepts and make new claims within the space of a few sentences (not infrequently dumping in many numbers & statistical results into parentheticals, which might have a dozen stats in less space than this), and that they are deceptively easy for the authors to read because they suffer from the curse of expertise. Once the reader has paid the cognitive tax of recalling and organizing all the concepts, then suddenly the abstract stops being so confusing.
Introspecting the experience, it feels as if the lack of explicit keywords like ‘Results:’ or their equivalent paragraph-breaks, is ‘the straw that breaks the camel’s back’. It’s not that it is inherently difficult to understand a single run-on paragraph, it’s that it is an extra burden at the worst possible time. (The same run-on paragraph would be read effortlessly a few paragraphs later, after much of the terminology has been introduced.)
I have sometimes tried to read a single-paragraph abstract, found my eyes glazing over as I lose track of the topic amidst the flurry of jargon (is this sentence part of the intro or is it methodology or...), and have to force myself back to the start, read it sentence by sentence, and wait for my understanding to catch up, at which point the abstract suddenly makes sense and I feel a bit frustrated with myself. (As a generalist, I read all sorts of abstracts and have to pay the ‘abstract tax’ each time, so I’ve been sensitized to the ways in which, say, CS & math abstracts tend to be much harder to read than explicitly-standardized keyworded medical abstracts reporting a clinical trial, and machine learning abstracts intermediate because they usually follow the standard organization but without keyword markers.)
This is also why it is so painful to read a series of 1-paragraph abstracts: you are being slammed in the face repeatedly by ultra-dense prose which rubs salt into the wounds by removing the typographical affordances you have been trained to expect.
What I do on Gwern.net is:
use a large set of regexp rewrites to try to reformat keyword-delimited abstracts into a consistent set of keywords. Every journal seems to have its own twist on the standard format of Introduction/Methods/Results/Conclusion, and they all suck and are made worse for the inconsistencies.
wrote a simple paragraphizer.py GPT-3 API script which runs automatically on new annotations: if there are no newlines in the abstract, it calls the API with the abstract, asks it for a newline-split version, and if the new version with newlines removed == the old version, returns it (see the sketch below). It often fails, and I’m not sure why, because the task seems semantically quite simple. Probably the prompt is bad or I don’t use enough shots.
Deliberately add newlines to all abstracts I annotate by hand, sometimes rearranging the abstract to fit the standard format better.
Have a lint check for abstracts which detects if they lack newlines (not quite as easy as it sounds since it’s all in HTML, so you have to take into account that it’s not as simple as merely detecting whether there’s more than one p element—lists, blockquotes, tables etc.), and prints out a warning so I will go and manually insert newlines.
I always find the processed versions to be much more readable than the originals, and I hope it helps readers navigating a sea of references.
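For concreteness, here is a minimal Python sketch of that pipeline (not gwern’s actual code): the regexes, prompt, and model name are illustrative, and the acceptance test is a slightly whitespace-relaxed version of the “newlines removed == old version” check described above.

```python
import re
import openai

# 1. Regexp rewrites: collapse each journal's variant section keywords
#    into one consistent set. (Patterns are illustrative, not exhaustive.)
KEYWORD_REWRITES = [
    (re.compile(r"(?i)\b(background|introduction|context)\s*:"), "Background:"),
    (re.compile(r"(?i)\b(methods?|materials and methods)\s*:"), "Method:"),
    (re.compile(r"(?i)\b(results?|findings)\s*:"), "Results:"),
    (re.compile(r"(?i)\b(conclusions?|interpretation)\s*:"), "Conclusion:"),
]

def normalize_keywords(abstract: str) -> str:
    for pattern, replacement in KEYWORD_REWRITES:
        abstract = pattern.sub(replacement, abstract)
    return abstract

# 2. Paragraphizer: ask the model for a newline-split version, and accept
#    it only if nothing but whitespace changed.
def paragraphize(abstract: str) -> str | None:
    if "\n" in abstract:
        return None  # already has paragraph breaks; nothing to do
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=("Insert newlines to split this abstract into paragraphs, "
                "changing no words:\n\n" + abstract),
        max_tokens=2048,
        temperature=0,
    )
    candidate = resp["choices"][0]["text"].strip()
    # str.split() splits on all whitespace, so this compares word sequences:
    # only whitespace (i.e. the inserted newlines) may differ.
    if candidate.split() == abstract.split():
        return candidate
    return None  # the model reworded something; reject

# 3. Lint: warn about HTML abstracts with no paragraph-level breaks.
#    More than one <p>, or any list/blockquote/table, counts as a break,
#    which is why counting <p> elements alone isn't enough.
BLOCK_TAGS = re.compile(r"<(p|ul|ol|blockquote|table)\b", re.IGNORECASE)

def lacks_newlines(html: str) -> bool:
    return len(BLOCK_TAGS.findall(html)) <= 1
```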
Have you considered switching to GPT-3.5 or −4? You can get much better results out of much less prompt engineering. GPT-4 is expensive but it’s worth it.
It’s currently at −003 and not the new ChatGPT 3.5 endpoint because when I dropped in the chat model name, the code errored out—apparently it’s under a chat/ path and so the installed OA Py library errors out. I haven’t bothered to debug it any further (do I need to specify the engine name as chat/turbo-gpt-3, or do I need to upgrade the library to some new version, or what). I haven’t even tried GPT-4 - I have the API access, just been too fashed and busy with other site stuff.
(Technical-wise, we’ve been doing a lot of Gwern.net refactoring and cleanup and belated documentation—I’ve written like 10k words the past month or two just explaining the link icon history, redirect & link archiving system, and the many popup system iterations and what we’ve learned.)
The better models do require using the chat endpoint instead of the completion endpoint. They are also, as you might infer, much more strongly RL trained for instruction following and the chat format specifically.
I definitely think it’s worth the effort to try upgrading to gpt-3.5-turbo, and I would say even gpt-4, but the cost is significantly higher for the latter. (I think 3.5 is actually cheaper than davinci.)
If you’re using the library you need to switch from Completion to ChatCompletion, and the API is slightly different—I’m happy to provide sample code if it would help, since I’ve been playing with it myself, but to be honest it all came from GPT-4 itself (using ChatGPT Plus.) If you just describe what you want (at least for fairly small snippets), and ask GPT-4 to code it for you, directly in ChatGPT, you may be pleasantly surprised.
(As far as how to structure the query, I would suggest something akin to starting with a “user” chat message of the form “please complete the following:” followed by whatever completion prompt you were using before. Better instructions will probably get better results, but that will probably get something workable immediately.)
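As a minimal sketch of that switch, assuming the openai Python library as it existed in early 2023 (the sample abstract and prompt wording are just placeholders):

```python
import openai

openai.api_key = "sk-..."  # or set the OPENAI_API_KEY environment variable

abstract = "John Wentworth's Natural Abstraction agenda aims to ..."
task = "Insert newlines to split this abstract into paragraphs:\n\n" + abstract

# Before: the completion endpoint (text-davinci-003).
# resp = openai.Completion.create(model="text-davinci-003", prompt=task,
#                                 max_tokens=1024)
# text = resp["choices"][0]["text"]

# After: the chat endpoint. Wrap the old completion prompt in a single
# "user" message, per the suggestion above.
resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # or "gpt-4"
    messages=[{"role": "user",
               "content": "Please complete the following:\n\n" + task}],
)
text = resp["choices"][0]["message"]["content"]
print(text)
```

The chat models are served under the chat/ path, which is why dropping a chat model name into the old Completion call errors out.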
Yeah, I will at some point, but frontend work with Said always comes first. If you want to patch it yourself, I’d definitely try it.
https://github.com/gwern/gwern.net/pull/6
It would be exaggerating to say I patched it; I would say that GPT-4 patched it at my request, and I helped a bit. (I’ve been doing a lot of that in the past ~week.)
Do you have a link to a specific part of the gwern site highlighting this, and/or a screenshot?
What’s there to highlight, really? The point is that it looks like a normal abstract… but not one-paragraph. (I’ve mused about moving in a much more aggressive Elicit-style direction and trying to get a GPT to add the standardized keywords where valid but omitted. GPT-4 surely can do that adequately.)
I suppose if you want a comparison, skimming my newest, the first entry right now is Sánchez-Izquierdo et al 2023 and that is an example of reformatting an abstract to add linebreaks which improve its readability:
This is not a complex abstract and far from the worst offender, but it’s still harder to read than it needs to be.
It is written in the standard format, but the writing is ESL-awkward (the ‘one of those’ clause is either bad grammar or bad style), the order of points is a bit messy & confusing (defining the hazard ratio—usually not written in caps—before the point of the meta-analysis or what it’s updating? horse/cart), and the line-wrapping does one no favors. Explicitly breaking it up into intro/method/results/conclusion makes it noticeably more readable.
(In addition, this shows some of the other tweaks I usually make: like being explicit about what ‘Calvin’ is, avoiding the highly misleading ‘significance’ language, avoiding unnecessary use of obsolete Roman numerals (newsflash, people: we have better, more compact, easier-to-read numbers—like ‘1’ & ‘2’!), and linking fulltext rather than contemptuously making the reader fend for themselves even though one could so easily have linked it).
I’m one of the authors on the natural abstractions review you discuss and FWIW I basically agree with everything you say here. Thanks for the feedback!
We’ve shortened our abstract now:
At 62 words, it’s still a bit longer than your final short version but almost 3x shorter than our original version.
Also want to highlight that I strongly agree having TL;DRs at all is good. (Or intros where the first 1-2 paragraphs are a good TL;DR, like in your post here.)
Oh yay. Thanks! Yeah that’s much better.
I suspect you think this because papers are generally written with a specialist audience in mind. I skim many abstracts in my field a day to keep up to date with the literature, and I think they’re quite readable even though many are a couple hundred words long. This is because, generally speaking, authors are just matter-of-factly saying what they did and what they found; if you don’t get tripped up on jargon there’s really nothing difficult to comprehend. If anything, your 69-word version reads more like a typical abstract I see day-to-day than the more verbose version you had earlier; way too much filler to be a good abstract. For example, sentences like these ones rarely show up in abstracts:
Or, put more bluntly, papers really just aren’t textbooks or press articles. They are written to be understandable to specialists in the field, and maybe adjacent fields (a PRL paper would be written to address all physicists, for example), but there’s simply no effort made towards making them easy to understand for others. Look at what I consider to be a fairly typical abstract: https://arxiv.org/abs/2101.05078
It’s really just “We designed A. It works like this. We describe A and associated subsystems in detail in the paper. We characterise A by doing B, C, D, and E. The performance agrees with simulation.” There are bad abstracts everywhere, of course, but I disagree that they’re the norm. Many abstracts are quite reasonable, and effectively just say “Here’s what we did, and here’s what we found.”
I buy that people who read abstracts all day get better at reading them, but I’m… pretty sure they’re just kinda objectively badly formatted and this’d at least save time learning to scan it.
Like looking at the one you just linked
Would you really rather read that than:
I think once you think about breaking it into paragraphs, there are further optimizations that are pretty obvious (like, the middle paragraph reads like a bunch of bullet-points and would probably be easier to parse in that format).
I predict this’d be at least somewhat good for the specialists who are the primary audience for the thing, as well as “I think it’s dumb for papers to only be legible to other specialists. Don’t dumb things down for the masses obviously, but, like, do some basic readability passes so that people trying to get up-to-speed on a field have an easier time”.
I genuinely don’t see a difference either way, except the second one takes up more space. This is because, like I said, the abstract is just a simple list of things that are covered, things they did, and things they found. You can put it in basically any format, and as long as it’s a field you’re familiar with so your eyes don’t glaze over from the jargon and acronyms, it really doesn’t make a difference.
Or, put differently, there’s essentially zero cognitive load to reading something like this because it just reads like a grocery list to me.
Regarding the latter:
I generally agree. The problem isn’t so much that scientists aren’t trying. Science communication is quite hard, and to be quite honest scientists are often not great writers simply because it takes a lot of time and training to become a good writer, and a lifetime is only 80 years. You have to recognise that scientists generally try quite hard to make papers readable, they/we are just often shitty writers and often are even non-native speakers (I am a native speaker, though of course internationally most scientists aren’t). There are strong incentives to make papers readable since if they aren’t readable they won’t get, well, read, and you want those citations.
The reality I think is if you have a stronger focus on good writing, you end up with a reduced focus on science, because the incentives are already aligned quite strongly for good writing.
I predict most people will have an easier time reading the second one than the first one, holding their jargon-familiarity constant. (The jargon basically isn’t a crux for me at all.)
(I bet if we arranged some kind of reading comprehension test you would turn out to do better at reading-comprehension for paragraph-broken abstracts vs single-block abstracts. I’d bet this at like 70% confidence for you-specifically, and… like 97% confidence for most college-educated people)
A few reasons I expect this to be true (other than just generalizing from my example and hearing a bunch of people complain about Big Blocks of Text):
Keeping track of where you are in the text.
If you’re reading a long block of text, and then get distracted for any reason, you have to relocate where you left off to keep reading. A long block of text doesn’t give you any hand-holds for doing that.
Pausing and digesting
I (and I think most people) can only digest so much information at once. Paragraph breaks are a way for the author to signal “here is a place you might want to pause briefly and consolidate your thoughts slightly before moving on.”
The paragraph-break is both a signal that “now is maybe a time to do that”, and it also helps you avoid losing your place after doing so (see previous section).
Skimming
Often when I start reading a paragraph, I’m like “okay, I roughly get this. I don’t really need to fully absorb this info, I want to move on to the next bit.” This could be either because I’m hunting for a specific set of information, or because I’m just trying to build up a high-level understanding of what the text is saying before reading it thoroughly. Paragraphs give me some hand-holds for skimming, because they typically group information in a sensible way.
In the example you link, I think there are basically three sections of text: one saying overall what the topic is, one saying “what things do we describe in our paper”, and one roughly describing what the overall results of the paper were. Having them as separate paragraphs helps me, say, skip the results summary if I’ve already gotten a sense of what the overall paper was about.
Sure, it could easily be that I’m used to it, and so it’s no problem for me. It’s hard to judge this kind of thing since at some level it’s very subjective and quite contingent on what kind of text you’re used to reading.
When I wrote my thesis, my abstract was broken into 4 paragraphs. The examiners suggested making it all one paragraph because “an abstract should be just one paragraph”. But the university template required the abstract to have a page to itself, and I thought the paragraph breaks helped, so I kept them. Arguably the abstract could have been shorter, but for a thesis-like document it’s harder, because a thesis (in practice) is kind of a mash of different things you did over several years crammed together, so it doesn’t have “a main point”.
I would add an option to use the GPT-4 API to show a post summary, offloading it from a human. For the above abstract the bot suggests the following:
The text is about John Wentworth’s Natural Abstraction agenda. This is an effort to understand and recover natural abstractions in realistic environments. The post provides a summary and review of the agenda, including its relationship to prior work and results. The goal is to help people understand natural abstractions and discuss future research priorities.
The post summarizes the intuition behind the agenda and relates it to previous work in various fields. It then lists key claims, including the Natural Abstraction Hypothesis and redundant information abstractions. The post also includes mathematical proofs for some of the key results in the redundant information abstraction line of work.
However, the post also critiques the agenda and its progress to date. It notes gaps in the theoretical framework and challenges its relevance to alignment. Additionally, it critiques John’s current research methodology.
And in the year 2031 of the Common Era, all abstracts on LessWrong are suddenly replaced with the line:
Later this would be judged to mark the beginning of Year 1 of the Silicon Dominate.
I agree, it’s time for LessWrong to start integrating ChatGPT (go devs!). There’s a wait list to access the GPT-4 API, although maybe LessWrong can get themselves to the front of the line faster. GPT-3.5 turbo might suffice.
IMO ~170 words is a decent length for a well-written abstract (well maybe ~150 is better), and the problem is that abstracts are often badly written. Steve Easterbrook has a great guide on writing scientific abstracts; here’s his example template which I think flows nicely:
I still claim this should be three paragraphs. In this one, breaking at sentence 4 and sentence 6 seems to carve it at reasonable joints.
Yeah that seems reasonable! (Personally I’d prefer a single break between sentence 3 and 4)
Yes, with one linebreak, I’d put it at (4). With 2 linebreaks, I’d put it at 4+5. With 3 breaks, 4/5/6. (Giving the full standard format: introduction/background, method, results, conclusion.) If I were annotating that, I would go with 3 breaks.
I wouldn’t want to do a 4th break, and break up 1-3 at all, unless (3) was unusually long and complex and dug into the specialist techniques more than usual so there really was a sort of ‘meaningless super universal background of the sort of since-the-dawn-of-time-man-has-yearned-to-x’ vs ‘ok real talk time, you do X/Y/Z but they all suck for A/B/C reasons; got it? now here’s what you actually need to do:’ genuine background split making it hard to distinguish where the waffle ends and the meat begins.
Some academic journals also have abstracts broken up into separate sections.
Yay. Good news.
See also Using the “executive summary” style: writing that respects your reader’s time and Reasoning transparency.
(Writing this because it might help me with my actual job one day)
I don’t belong to the target audience of such posts (but that’s why I qualify as a newcomer, whee!). If I tried to make the abstract more academia-styled, I’d get something like this:
John Wentworth’s Natural Abstraction agenda aims to understand and recover “natural” abstractions in realistic environments. We introduce the conceptual framework around it and review its key claims, relationship to prior work in a number of fields, and results to date. Of particular interest are the Natural Abstraction Hypothesis and Wentworth’s specific formulation of natural abstractions (here called “redundant information abstractions”). We re-define and draw mathematical proofs for some of the amassed key results. We then discuss the agenda to date, including the gaps in the theoretical framework, and challenge its methodology and relevance to alignment research.
What is an agenda? Is it a technical or a common-speech word? (and what are realistic environments, for that matter)
“Our hope is to make it easier for newcomers to get up to speed on natural abstractions, as well as to spur a discussion about future research priorities.”—unnecessary. It’s what people more-or-less usually do anyway.
I’d like to change the “specific formulation of natural abstractions” to something more precise, but I don’t know the subject.
“and explain how those results fit into the agenda” = discuss, but people don’t say it because it’s just expected of them because math has to be put in context.
(just a personal wish) the word “alignment” should preferably be spelled “Alignment” if it’s a term or followed by “research”.