I tend to follow the linguist John McWhorter on historical trends in languages, in believing (controversially!) that undisrupted languages become weirder over time, and only gain learnability through pragmatic pressures (trade, slavery, conquest, etc.) that increase the number of a language’s second-language learners (who edit for ease of learning as they learn).
A huge number of phonemes? Probably it’s some language in the mountains with little tourism, trade, or conquest for the last 8,000 years. Every verb conjugates irregularly? Likely to be found in the middle of a desert. And so on.
The normal, undisrupted pattern is for every generation to make mistakes and play around, decorating the language with entropic silliness, and accidentally causing future children to “only really learn to speak fully properly” at older and older and older ages… until, around 11 or 12 or 13 or 14, puberty strikes and kids stop diligently learning, on trust, any random bullshit the older people say. English competency arrives around age 8 because English is a toy language created by waves and waves and waves of trade, conquest, and cultural admixture. We have a lot of room to get much weirder and stay within traditional human bounds.
((That is, we have a lot of room for English, left alone, to mutate, IF this broader theory is correct. It might not be.
A way to test the larger theory would be to anthropologically construct a way of predicting, from first principles, when puberty tends to start in human subpopulations (because we have strong suggestions that diet and social patterns can change it), then reconstruct the predicted age of puberty onset over historical timescales, then correlate that with the relatively easily measured “age until language mastery” for many modern languages.
That would confirm most of the theory. The other thing you’d need to track is the percentage of each language’s speakers who learned it as a second language. High rates of this should simplify a tongue and cut against the other process that adds complexity by default.))
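One rough way to picture that test, as a sketch only: treat each language as a data point with a reconstructed puberty-onset age, a measured age of language mastery, and a second-language-speaker share, then ask whether the first two still correlate once the third is held fixed. Partial correlation is my own choice of technique here, not something from McWhorter, and the function names `pearson` and `partial_corr` are made up for illustration. No real data is included because I don't have any.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs.

    Hypothetical usage: xs = puberty-onset ages, ys = mastery ages,
    zs = share of speakers who learned the language as a second language.
    """
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("zs fully determines xs or ys; nothing left to compare")
    return (r_xy - r_xz * r_yz) / denom
```

If the theory holds, `partial_corr` over many languages should come out clearly positive. A value near zero would suggest that second-language learning, rather than puberty timing, explains the spread in mastery ages.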
To show how weird English is: English is the only language descended from Proto-Indo-European that doesn’t think the moon is female (“la luna”) and spoons are male (“der Löffel”). I mean… maybe not those genders specifically in every language. But some gender in each language.
I just looked up Gujarati, which is also descended from Proto-Indo-European and moon (chandri (“ચંદ્રા”)) is feminine and ladle (chamcho (“ચમચો”)) is masculine… but teaspoon (chamchi (“ચમચી”)) is feminine(!)… so… yeah… that one retained gender and also has gender/semantic conflation! :-)
Except in English. The moon is a rock in English, not a girl. And a spoon is a tool, not a boy. Because English is a weird rare toy language (practically a creole, implying that it was a pidgin for many), that doesn’t force people to memorize reams of playful historical bullshit, in order to “sound like they speak it properly” :-)
“English” traces all the way back to a language (with gendered declined nouns and verb conjugation) spoken by Eurasian Charioteers in 7000BC or whatever and at each step most of the changes were all just “part of the stream of invective”.
...
Regarding word count specifically…
Something you find over and over and over again in languages is agglutinating grammar where entire sentences are just. One. Word. But not like that… rather: Asinglebigwordcanbeusedtocommunicateoneideafromamongavastarray.
These languages are also often irregular! (6) Like the language was already agglutinative 1000 years ago, (9) and then people spent the next ten centuries making it more pronounceable, and punny, and fun??? (16)
> These words are not normal either! (6) Like language was already coherent 1000 years ago, (8) and people spent the last decade trying to make it more sensible, and cool??? (14)
The above paragraph round trips through “Google’s understanding of Inuktut”, which (I think?) is a simplified language arising from systematizing and averaging out dialects starting from relatively normally complex languages like Inuktitut… and basically all of those polar languages are agglutinative, and have been at least for centuries.
I brought that one paragraph back to English to suggest roughly how much was lost by Google’s translation.
The parenthetic numbers show “words per clause” through the process (English --> Inuktut --> English):
6-->3-->6! 9-->6-->8, 16-->9-->14???
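For anyone who wants to check the English ends of that tally, here is a minimal sketch. It assumes a "word" is anything separated by whitespace and that each clause ends at its parenthetic "(n)" marker. The helper name `words_per_clause` is invented for this example, and the Inuktut middle step is left out because the translated text isn't shown here.

```python
import re

# A parenthetic count such as "(16)" that closes a clause.
MARKER = re.compile(r"\((\d+)\)")

ORIGINAL = (
    "These languages are also often irregular! (6) "
    "Like the language was already agglutinative 1000 years ago, (9) "
    "and then people spent the next ten centuries making it more "
    "pronounceable, and punny, and fun??? (16)"
)

ROUND_TRIP = (
    "These words are not normal either! (6) "
    "Like language was already coherent 1000 years ago, (8) "
    "and people spent the last decade trying to make it more "
    "sensible, and cool??? (14)"
)


def words_per_clause(text):
    """Return (claimed, counted) pairs, one per clause that ends in "(n)"."""
    # Splitting on a pattern with one capture group interleaves the results:
    # [clause, n, clause, n, ..., trailing text]
    pieces = MARKER.split(text)
    clauses = pieces[0::2]
    claimed = [int(n) for n in pieces[1::2]]
    # Count whitespace-separated tokens; punctuation stays attached to words.
    counted = [len(clause.split()) for clause in clauses[:len(claimed)]]
    return list(zip(claimed, counted))


if __name__ == "__main__":
    print(words_per_clause(ORIGINAL))
    print(words_per_clause(ROUND_TRIP))
```

Under those assumptions the markers in both paragraphs agree with the counted words, so the shrinkage in the second and third clauses comes from the round trip and not from miscounting.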
So here’s my (half silly) proposal: maybe English experienced catastrophic simplifications between ~600AD and ~1500AD and then became preternaturally frozen once it was captured in text by the rise of printing, literacy, industrialization, and so on. The starting point itself was relatively unnatural, I think.
So then, in recent history, maybe what we’re seeing is just a looooong and slooooow motion trend (that’ll take a millennium or three to complete at this rate (unless we abandon literacy or something, and free the language from the strictures of printing and mass education?)) where English is still slowly trying to become an agglutinative language with irregular morphology?
Like (here’s the deep crazy idea:) maybe that’s what every language ultimately wants to be, after >200 generations of accumulated youthful ignorance, cryptogenic wordplay, lazy mouths, and no writing?
For example: I just made up the word “cryptogenic” to mean “having a genesis in a desire to be hard to understand” (which I considered myself to have a right to do, since English has a productive morphology) but when I looked it up, other skilled speakers had already deployed it in other ways… Oxford thinks it means “(of a disease) of obscure or uncertain origin” and most of the usages are for “diseases not yet subjectively diagnosed by the doctor during the course of treatment (rather than diseases whose etiology is a known mystery to standard medical science)”. It gets used like “Knowing the cause of a cryptogenic stroke can help prevent recurrent stroke” (source is the metadata summary of this webpage).
Whereas I’m claiming that many words are cryptogenic in the sense that they started out, like “skibidi”, within youth culture because kids liked that grownups didn’t know what they meant. If “skibidi” catches on, and gains an intergenerationally re-usable meaning (maybe related to being scared in a fun way? or yet-another-adjective like hep? or whatever?) then it will have been partly possible because kids liked having their own words that “parents just don’t understand”.
This is hard for English, because it is written. And because many second language speakers learn English every year.
But one thing that English can do (despite enormous pressures to be learnable and written in a stable way) is boil itself down to stock phrases for entire sentences. Later, these stock phrases could eventually agglutinate into single words, maybe, or at least they might if global civilization and travel and communication collapse in a way that leaves literally any humans alive, but trapped in tiny local regions with low literacy for many generations… which is a very specific and unlikely possible future. (Prolly we either get wildly richer and become transhuman or else just all end up dead to predatory posthumans.)
Persian is ungendered too. They don’t even have gendered pronouns.
https://en.wikipedia.org/wiki/Persian_grammar
Thank you for the correction! I didn’t realize Persian descended from PIE too. Looking at the likely root cause of my ignorance, I learned that Kurdish and Pashto are also PIE descended. Pashto appears to have noun gender, but I’m getting hints that at least one dialect of Kurdish also might not?!
If Sorani doesn’t have gendered nouns then I’m going to predict (1) maybe Kurdish is really old and weird and interesting (like branching off way way long ago with more time to drift) and/or (2) there was some big trade/empire/mixing simplification that happened “more recently” with divergence later?
If neither of those is true, then my larger heuristic about “why English is weird” might have a deep abstract counterexample, and deserve lower credence.
Persian is a language of empire and social mixing, so its “similar simplification” doesn’t actually function as a strong counter-example to the broader thesis, but it is still great to be surprised :-)
This is interesting. I think English concentrates its weirdness in pronunciation, which is very irregular. Although adult native speakers don’t realize it, this presents a serious learning difficulty for non-native speakers and young English-speaking children. Studies show that English-speaking students need more years of learning to master their language (at least for reading) than French students do, who themselves need more years than young Italian, Spanish or Finnish students (Stanislas Dehaene, Reading in the Brain).
I think most of that is actually a weirdness in our orthography. To linguists, languages are fundamentally a thing that happens in the mouth and not on the page. In the mouth, the hardest thing is basically rhoticity… the “tongue curling back” thing often rendered with “r”. The Irish, Scottish, and American accents retain this weirdness, but a classic Boston, NYC, or southern British accent tends to drop it.
The Oxford English Dictionary gives two IPA transcriptions for “four”: the American /fɔr/ makes sense to me and has an “r” in it, but the British /fɔː/ has just totally given up on curling the tongue or trying to pretend in the dictionary that this is happening in human mouths.
That tongue curl is quite hard. Quite a few five year olds in rural Idaho (and maybe other regions where rhotic dialects are maintained) struggle with it, and are corrected by teachers and parents (and maybe made fun of by peers) for not speaking properly… for spontaneously adopting “a New York Accent” due to a very common childhood “speech impediment”. Many ESL speakers drop it, hence the city dialects dropping it, not just in practice in the mouth, but officially.
(“J” is a runner up for weirdness in the mouth, but I think that’s just because the voiced postalveolar affricate /dʒ/ is a pretty rare phoneme.)
English orthography is kind of a disaster, I agree. It attempts to shoehorn a German/Celtic/French/Norse pidgin-or-creole into the Latin letter system, and … yeah. Tough task. It was never going to be clean.
If I was going to offer a defense of the status quo here, I’d say that there is no flat/simple orthography to switch to.
Every accent would need its own separate “spelling reform” and their texts would be less mutually intelligible, and it would hurt science and the letters quite a lot, and also probably lead to faster drift into a world where “English” denotes a language family rather than a language.
Interestingly, Interslavic is an attempt to “design by hand” a similar thing for Slavic speakers to what English still has basically for free: common words with stable spellings and meanings, and huge tolerance for how they are pronounced. Once you see the overarching vision for “a written language system” with these properties as a desirable end point… since English is already at that desirable end point, why change it? <3
You’re right. I said “pronunciation,” but the problem is more exactly about the translation between graphemes and phonemes.
I don’t think children have any more difficulty learning to speak English than other languages. The difficulty comes in learning to spell in writing and, to a lesser extent, learning to pronounce written words when reading. Btw, there’s actually much more regularity in English spelling/pronunciation than may appear, and than is routinely taught. Much of the “weirdness” is the result of historical processes which are fairly regular in themselves, once you know the rules.