Subscripting Typographic Convention For Citations/Dates/Sources/Evidentials: A Proposal
Reviving an old General Semantics proposal: borrowing from scientific notation and using subscripts like ‘Gwern~2020~’ for denoting sources (like citation, timing, or medium) might be a useful trick for clearer writing, compared to omitting such information or using standard cumbersome circumlocutions.
One question I forgot: how should multi-author citations, currently denoted by ‘et al’ or ‘et al.’, be handled? That notation is pretty ridiculous: not only does it take up 6 letters and use natural language where a symbol should be, it’s ambiguous & hard to machine-parse, and it’s not even English*! Writing ‘Foo et al~2010~’ or ‘Foo~et al 2010~’ doesn’t look very nice, and it makes the subscripting far less compact.
My current suggestion is to do the obvious thing: when you elide or omit something in English or technical writing, how do you express that? Why, with an ellipsis ‘…’, of course. So one would just write ‘Foo…~2010~’ or possibly ‘Foo~…2010~’.
Horizontal ellipses aren’t the only kind: there are several others in Unicode, including midline ‘⋯’ and vertical ‘⋮’ and even down right diagonal ellipsis ‘⋱’, so one could imagine doing ‘Foo⋯~2010~’ or ‘Foo⋮~2010~’ or ‘Foo⋱~2010~’.
The vertical ellipsis is nice but unfortunately it’s hard to see the first/top dot because it almost overlaps with the final letter. The midline ellipsis is very middling, and doesn’t really have any virtue. But I particularly like the last one, down-right-diagonal ellipsis, because it works visually so well—it leads the eye down and to the right and is clear about it being an entire phrase, so to speak.
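For anyone who wants to try the variants, here is a minimal sketch that prints each candidate ellipsis with its Unicode code point and official name, and renders a citation in Pandoc-style `~…~` subscript markup. The `cite` helper and the choice to subscript only the year are my own assumptions for illustration, not part of the proposal.

```python
import unicodedata

# The four ellipsis characters discussed above.
ELLIPSES = {
    "horizontal": "\u2026",
    "midline": "\u22ef",
    "vertical": "\u22ee",
    "down-right diagonal": "\u22f1",
}


def cite(first_author: str, year: int, ellipsis: str = "\u22f1") -> str:
    """Hypothetical helper: first author, an ellipsis standing in for
    'et al', then the year wrapped in ~...~ subscript markup."""
    return f"{first_author}{ellipsis}~{year}~"


for label, ch in ELLIPSES.items():
    # Show the code point, the Unicode name, and a sample citation.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cite('Foo', 2010, ch)}")
```

Whether a given font draws these glyphs legibly next to a subscript is something only rendering will tell you.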
* Actually, it’s not even Latin because it’s an abbreviation for the actual Latin phrase, et alii (to save you one character and also avoid any question of conjugating the Latin—this shit is fractal, is what I’m saying), but as pseudo-Latin, that means that many will italicize it, as foreign words/phrases usually are—but now that is even more work, even more visual clutter, and introduces ambiguity with other uses of italics like titles. Truly a nasty bit of work.
This is a cool idea. However, are you actually using the subscript in two confusingly different ways? In I_2010, it seems you’re talking about you, indexed to the year 2010, whereas in {Abdul Bey}_2000, it seems you’re citing a book. It would be pretty bad for people to see a bunch of the first kind of case, and then expect citations, but only get them half of the time.
I don’t think they’re confusingly different. See the “A single unified notation...” part. Distinguishing the two typographically is codex chauvinism.
On a side note: it really would be nice if we could have normal Markdown subscripts/superscripts supported on LW. It’s not like we don’t discuss STEM topics all the time, and using LaTeX is overkill and easy to get wrong if you don’t regularly write TeX.
Seems reasonable to me. We use markdown-it for markdown conversion, so does this plugin look like what you would want?
https://github.com/markdown-it/markdown-it-sub
If so, I think I can probably get around to adding that to our markdown plugins sometime this week or early next week.
Yes, seems sensible: hard to go wrong if you copy the Pandoc syntax. You’ll need to add a mention of this to the LW docs, of course, because the existing docs don’t mention sub/superscript either way, and users might assume that LW still copies the Reddit behavior of no-support.
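To make the syntax concrete, here is a rough sketch of what Pandoc-style sub/superscript conversion amounts to: `~text~` becomes a subscript and `^text^` a superscript, with no whitespace allowed inside the delimiters. This is a toy regex version written for illustration; it is not how markdown-it-sub or Pandoc is implemented, and `render_sub_sup` is a made-up name.

```python
import re

# ~text~ with no whitespace or tildes inside; the lookarounds skip
# the doubled tildes used for ~~strikeout~~.
SUB_RE = re.compile(r"(?<!~)~([^~\s]+)~(?!~)")
# ^text^ with no whitespace or carets inside.
SUP_RE = re.compile(r"\^([^\^\s]+)\^")


def render_sub_sup(text: str) -> str:
    """Replace ~x~ with <sub>x</sub> and ^x^ with <sup>x</sup>."""
    text = SUB_RE.sub(r"<sub>\1</sub>", text)
    return SUP_RE.sub(r"<sup>\1</sup>", text)


print(render_sub_sup("I~2020~ thought"))
print(render_sub_sup("x^2^ and ~~struck~~"))
```

A real implementation also has to handle backslash-escaped spaces, code spans, and HTML escaping, which this sketch ignores.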
I’m quite excited about things like this. This specific proposal seems reasonable to me; I definitely prefer it over A!B syntax, which I’ve found confusing.
I previously pondered possible annotations to express uncertainty.
I’m quite curious when it will be possible to use ML systems to make automatic annotations. I could imagine some possible browser extensions that could really augment reading ability and clarity.
Overall, typographic innovations, like all typography, are better the less they stand out while still doing their work. At least in somewhat academic text with references and notation, subscripting appears to blend right in. I suspect the strength of the proposal is that one can flexibly apply it for readers and tone: sometimes it makes sense to say “I~2020~ thought”, sometimes “I thought in 2020”.
I am seriously planning to use it for inflation adjustment in my book, and may (publisher and test-readers willing) apply it more broadly in the text.
Yes, this relies heavily on the fact that subscripts are small/compact and can borrow meaning from their STEM uses. Doing it as superscripts, for example, probably wouldn’t work as well, because we don’t use superscripts for this sort of thing & already use superscripts heavily for other things like footnotes, while some entirely new symbol or layout (say, a third column, or some sort of 2-column layout like in some formal languages) is asking to fail & would make it harder to fall back to natural language.
How are you doing inflation adjustment? I mocked up a bunch of possibilities and I wasn’t satisfied with any of them. If you suppress one of the years, you risk confusing the reader given that it’s a new convention, but if you provide all the variables, it ensures comprehension but is busy & intrusive.
Note for anyone who (like me) wanted to know what the Kesselman Estimative Words are:
Almost certain: 86-99%
Highly likely: 71-85%
Likely: 56-70%
Chances a little better [or less] than even: 46-55%
Unlikely: 31-45%
Highly unlikely: 16-30%
Remote: 1-15%
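If you wanted to apply the list above mechanically (say, in the kind of annotation tool mentioned earlier), a lookup table is enough. A minimal sketch, assuming whole-number percentages; `estimative_word` is a hypothetical name, and 0% and 100% are rejected because the list does not cover them.

```python
# (low, high, phrase) bands copied from the list above, in percent.
KESSELMAN = [
    (86, 99, "Almost certain"),
    (71, 85, "Highly likely"),
    (56, 70, "Likely"),
    (46, 55, "Chances a little better [or less] than even"),
    (31, 45, "Unlikely"),
    (16, 30, "Highly unlikely"),
    (1, 15, "Remote"),
]


def estimative_word(percent: int) -> str:
    """Return the estimative phrase whose band contains `percent`."""
    for low, high, phrase in KESSELMAN:
        if low <= percent <= high:
            return phrase
    raise ValueError(f"{percent}% is outside the 1-99% range of the list")


print(estimative_word(90))
print(estimative_word(20))
```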
This seems like a solid improvement over X!Y notation. X!Y seems to not fit my brain in the same way that XのY seems to not fit my brain, and mentally substituting “’s” for “の” helps only partially.
A better question, I think, would be this: “When is it worth it to use this one weird trick to boost the clarity of a work?”
It seems worth it in nerdy circles (i.e. among people who’re already familiar with subscripting) for passages that are dense with jumping around in time as in your chosen example, but I’d expect these sorts of passages to be rare, regardless of the expected readership.
Also, it’s unclear why “on Facebook” deserves to be compressed into an evidential. At the very least, it isn’t immediately obvious what “FB” refers to, whereas a date is easier to figure out from context.
But if passages aren’t dense with that or other uses, then you wouldn’t need to use subscripting much, by definition....
Perhaps you meant, “assuming that it remains a unique convention, most readers will have to pay a one-time cost of comprehension/dislike as overhead, and only then can gain from it; so you’ll need them to read a lot of it to pay off, and such passages may be quite rare”? Definitely a problem. A bit less of one if I were to start using it systematically, though, since I could assume that many readers will have read one of my other writings using the convention and already paid the price.
Because it brings out the contrast: one is based on first-hand experience & observation, and the other is later socially-performative kvetching for an audience such as family or female acquaintances. The medium is the message, in this case.
I waffled on whether to make it ‘FB’ or ‘Facebook’. I thought “FB” as an abbreviation was sufficiently widely known at this point to make it natural. But maybe not, if even LWers are thrown by it.
Agreed.
Agreed so far…
You’ll need a bunch in a single passage. If you don’t need to disambiguate a large hairball of differently-timed people (like in My Best and Worst Mistake), then you probably shouldn’t bother in general. Put another way, you’re going to want to have a dense, if localized, cluster of people-times that need disambiguating for this to be a better idea than using parentheticals.
I’m struggling to see how this is an improvement over “on FB” or “on Facebook” for either the reader or the writer, assuming you don’t want to bury-but-still-mention the medium/audience.
Not without context or some other way to reduce the universe of things “FB” might refer to. “My wife complained on FB” is probably enough of a determiner most of the time for most people (unless I’m really underslept), but an “FB” subscript isn’t immediately obvious to people who aren’t used to that sort of thing.
Would you say that about citations? “Oh, you only use one source in this paragraph, so just omit the author/year/title. The reader can probably figure it out from mentions elsewhere if they really need to anyway.” That the use of subscripts is particularly clear when you have a hairball of references (in an example constructed to show benefits) doesn’t mean solitary uses are useless.
It’s a matter of emphasis. Yes, you can write it out longhand, much as you can write out any equation or number longhand as not 22/230 but “twenty-two divided by two-hundred-and-thirty” if necessary. Natural language is Turing-complete, so to speak: anything you do in a typographic way or a DSL like equations can be done as English (and of course, prior to the invention of various notations, people did write out equations like that, as painful as it is trying to imagine doing algebra while writing everything out without the benefit of even equal-signs). But you usually shouldn’t.
Is the mention of it being on Facebook in that example so important it must be called out like that? I didn’t think so. It seemed like the kind of snark a husband might make in passing. Writing it out feels like ‘explaining the joke’. Snark doesn’t work if you need to surround it in flashing neon lights with arrows pointing inward saying “I am being sarcastic and cynical and ironic here”. You can modify the example in your head to something which puts less emphasis on Facebook, if you feel strongly about it.
General:
?
Page 80 of the pdf, marked 71 in the text page numbers.
Content:
TL;DR:
“Much more rational” is a tall order. While there is supposedly empirical evidence for the weak version of the hypothesis, it’s hard to find and obtain, particularly anything substantial or recent. (See ‘At length’ for more on this. Or don’t, it’s mostly a dead end after Forms.)
In the post, this part rendered with the 2 as a subscript but the rest (020) as it appears when quoted here.
The first and the third were easier to read than the second.
At length:
The Wikipedia page on Linguistic Relativity has this to say on the Sapir-Whorf hypothesis:
The brief section Forms:
So what is this one source that there is empirical evidence?
This* (Supposedly retrieved** as of 2011, though it’s Google Books, so I’m guessing it hasn’t changed.)
It doesn’t allow direct text copying, so here is some of it retyped***:
According to this summary of the book:
So a dead end. Back to Wikipedia:
Empirical research section:
It’s not clear who Lucy is.
Other domains section:
...
Apparently Ruby was inspired by this, but the connection isn’t clear. While there are constructed languages for a few purposes, there are no links to studies on their effects.
That last source, Notation as a tool of thought, has a wayback link.
* https://books.google.com/books?id=2vmHpB2YvXsC&lpg=PP1&pg=PT69#v=onepage&q&f=false
The name of the book is Living Language: An Introduction to Linguistic Anthropology.
** That date is for the book as a whole, and this is Wikipedia.
*** The pages appear to contain images rather than text:
https://books.google.com/books/content?id=2vmHpB2YvXsC&pg=PT69&img=1&zoom=3&hl=en&sig=ACfU3U1tulPKjAg_KelXQ3O6ckJ-FkWxDg&w=1280
https://books.google.com/books/content?id=2vmHpB2YvXsC&pg=PT70&img=1&zoom=3&hl=en&sig=ACfU3U0HMCn2cbjKZQP854GXpS-mXdQhTw&w=1280
https://books.google.com/books/content?id=2vmHpB2YvXsC&pg=PT71&img=1&zoom=3&hl=en&sig=ACfU3U3KpjR6RmdrfV8bwSMTe8delOcDPA&w=1280