On the Role of Proto-Languages


I’ve been fascinated recently by historical linguistics, and in particular reconstructed proto-languages such as Proto-Indo-European.

But what even is a “proto-language”?

Well, if we go for the literal operationalization, Wikipedia defines a proto-language as:

a postulated ancestral language from which a number of attested languages are believed to have descended by evolution, forming a language family.
[…]
It is by definition a linguistic reconstruction formulated by applying the comparative method to a group of languages featuring similar characteristics.

So a proto-language is supposed to be the closest common ancestor of a language family (or subfamily), and it is reconstructed using the comparative method, which, as the name suggests, works by comparing the different daughter languages to reconstruct (mostly) sound structure (phonology) and word structure (morphology).

But they’re a weird kind of thing.

First, proto-languages are not attested. This means that we have no example of writing in any proto-language. When old dead languages like Sumerian get deciphered (definitely writing a blog post on that one soon!), we have literal texts written in these languages from which we can proceed. But that’s not true for any proto-language; instead, these are theoretical models of hypothetical parent languages.^[1]

Then, proto-languages are even less “languages” per se than most current languages. If you dig into the research on variations and dialects of any current language (let’s say English, even just American English), you quickly notice how many levels of idealization are required to reduce all of that to a single language. You need to idealize variations within one speaker, between speakers, changes happening over time… And with proto-languages, because of the methodology itself, additional layers of idealization are added.

Notably, the comparative method struggles to separate periods of a reconstructed language; so even if the language did exist, the proto-language mixes together features that might never have coexisted.

One of the books I was reading actually has a nice analogy for this:

The most helpful metaphor to explain this is the ‘constellation’ analogy. Constellations of stars in the night sky, such as The Plough or Orion, make sense to the observer as points on a sphere of a fixed radius around the earth. We see the constellations as two-dimensional, dot-to-dot pictures, on a curved plane. But in fact, the stars are not all equidistant from the earth: some lie much further away than others. Constellations are an illusion and have no existence in reality. In the same way, the asterisk-heavy ‘star-spangled grammar’ of reconstructed PIE may unite reconstructions which go back to different stages of the language. Some reconstructed forms may be much older than others, and the reconstruction of a datable lexical item for PIE does not mean that the spoken IE parent language must be as old (or as young) as the lexical form.

James Clackson, Indo-European Linguistics: An Introduction, p.16

To summarize, a proto-language is, as far as we know, only a theoretical hidden entity which is not directly visible in the historical record, and has even more idealization baked into it than colloquial discussions of existing languages such as “English” or “French”.

So the natural question for me is: what on Earth are they good for?

I’m not that interested in a reason that seems to be enough for some historical linguists: studying the proto-language as an end in itself. I find this fun, but that’s not enough to make me find proto-languages exciting and surprising — I want something more.

Thinking about this from my own frame, proto-languages make the most sense as compression moves of a language family. Specifically, I see them as reductions: the family is reduced to the proto-language plus a series of historical transformations (such as the sound laws).

So in a way, a proto-language is a posited initial state such that the daughter languages can be derived from it parsimoniously.
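To make the reduction concrete, here is a minimal toy sketch of "proto-language plus sound laws". Everything in it is an assumption made for illustration: the proto-forms, the two daughter languages, and the sound laws are invented and do not correspond to any real reconstruction, and real sound change is far messier than ordered regex substitutions.

```python
import re

# Hypothetical proto-lexicon: invented forms, NOT real reconstructions.
PROTO_LEXICON = {
    "water": "pata",
    "stone": "kapi",
    "fire": "tupa",
}

# Hypothetical sound laws: for each daughter, an ordered list of
# (regex pattern, replacement) pairs, applied one after the other.
SOUND_LAWS = {
    # p becomes f everywhere, then a word-final a is lost.
    "daughter_A": [(r"p", "f"), (r"a$", "")],
    # t is voiced to d between vowels, then k becomes h.
    "daughter_B": [(r"(?<=[aiu])t(?=[aiu])", "d"), (r"k", "h")],
}


def derive(proto_form, laws):
    """Apply each sound law in order to a single proto-form."""
    form = proto_form
    for pattern, replacement in laws:
        form = re.sub(pattern, replacement, form)
    return form


def derive_family(lexicon, sound_laws):
    """Derive every daughter lexicon from the proto-lexicon and the laws."""
    return {
        daughter: {gloss: derive(form, laws) for gloss, form in lexicon.items()}
        for daughter, laws in sound_laws.items()
    }


family = derive_family(PROTO_LEXICON, SOUND_LAWS)
for daughter, words in family.items():
    print(daughter, words)
```

In this toy, the whole family is stored as one lexicon plus a handful of rules, and the number of rules stays fixed as the lexicon grows, which is the sense in which the family is "reduced" to the proto-language.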

The first benefit of such a compression is to show that a reduction of this kind is possible at all. My understanding is that most debates about genetic relationships between languages hinge on the ability to reconstruct a sensible proto-language, that is, on showing that the similarities and regularities are better compressed through derivation from a common ancestor than through other phenomena such as borrowing.

Many have assumed that demonstrations of linguistic kinship rely mostly on observation of similarities among compared languages, but this is not sufficient, since similarities can be due to several things, accident (chance), borrowing, onomatopoeia, sound symbolism, nursery forms, and universals and typologically commonplace traits, as well as genetic relationship (inheritance from a common ancestor). This being the case, the burden of proof on anyone proposing a genetic relationship among languages is to show that the evidence presented in favor of the hypothesized relationship cannot as easily be explained by these other non-genetic factors.

Lyle Campbell and William J. Poser, Language Classification: History and Method, p.165

This is why Indo-European and Afroasiatic are recognized as language families, but not Indo-Pacific or Borean.^[2]

Another powerful use of reduction is to predict what happens in unseen (or unexplored) situations — this is the main use I highlighted in my previous post.

And although this is clearly not the main focus of research on proto-languages, when you dig a bit, you find specialized and technical examples of such predictions.^[3]

The most famous one is the laryngeal theory, started by de Saussure. My very vague understanding is that de Saussure studied the state of PIE (Proto-Indo-European) reconstruction at the time, and inferred that it could be further compressed by assuming special consonants (laryngeals) that get pronounced in very different ways depending on what other sounds surround them.^[4] Although these consonants are complex, the ways in which they behave can cleanly explain the different behaviours of the daughter languages: each one privileges one or another expression of them and forgets the rest of the complexity.

The problem was that at the time, there was no direct evidence for these consonants in any of the known Indo-European languages. So even if it was a nice compression, it was mostly considered a theoretical oddity.

Until, that is, the Hittite language was rediscovered, because it contained these sounds, or sounds very close to them.

This is the kind of impressive prediction you expect in physics, not in linguistics!^[5]

And while it’s the most famous and most popular of these examples, it doesn’t look like the only one. When I asked various linguists, I got a handful of examples:

Reconstruction of an earlier form of a diminutive suffix in Hopi, which was then confirmed by the rediscovery of an old dictionary of Hopi written in a phonetic alphabet from the Church of Latter-day Saints.^[6]
Reconstruction of which sounds did not rhyme in Old Chinese (which is basically a closer-to-being-attested-than-most proto-language) but ended up merging and rhyming in Middle Chinese. These then explain which modern rhymes never appear in old known rhyming dictionaries.^[7]
Compression and clarification of patterns in Japanese and Ryukyuan dialects after starting to include the latter in the reconstruction of Proto-Japanese.^[8]

The precision and distribution of these across many language families make me expect that, with the right specialized knowledge, I could unearth many more of these predictions from reduction by proto-languages.

More broadly, all of this thinking highlights why I find linguistics (in both its synchronic and diachronic forms) a really interesting wellspring of examples and ideas for studying epistemic regularities: it strikes a middle ground between the incredibly fruitful epistemic landscapes of physics and chemistry, and the barren ones of most social sciences like sociology, or some medical sciences like nutrition.

So studying it can both reveal powerful epistemic regularities through the successes of compression and prediction (and maybe even intervention and design with language planning and conlangs); and yet also show how various forms of irreducible complexity and noise limit the generality of these regularities, and introduce many complex factors that need to be handled together.

Another way to say this: stay tuned for more linguistics-inspired posts!

^[1] And whether they actually represent an existing language is a point of debate and contention. See here for an introduction to the discussion.
^[2] Note that even if we accept that the ability to reconstruct a proto-language is necessary to show a genetic relationship between languages, failing to do so doesn’t mean that the genetic hypothesis is necessarily wrong. For example, we might just not have enough evidence about the languages for the reconstruction to take place.
^[3] In this case, the prediction is about things we don’t already know, or we don’t already see in the current data, rather than about future data.
^[4] In my fragmentary understanding, it’s even more advanced than just reconstructing PIE: de Saussure apparently applied (what would become) internal reconstruction, which is another method internal to one language, trying to reconstruct a previous version of that language.
^[5] This example is the one that sparked my initial interest in historical linguistics for epistemology.
^[6] There is no clear summary of this that I’ve found. Instead, the analysis of the lost dictionary is found in this book, with a one-paragraph mention of the validated reconstruction on page 68.
^[7] See this reconstruction specifically. I’ve heard it does other impressive things too, but I haven’t been able to dig enough into the text to check them.
^[8] I get this from a detailed Reddit comment, but I expect that more details can be found in this book on Proto-Japanese (which I haven’t studied).