I have a few thoughts about designing new languages.
Backreferencing is often quite complicated. Words like “he”, “her”, “this” and “that” can often only be interpreted from context. In German, grammatical gender often carries information that helps make such backreferences clear. If I remember correctly, backreferencing was quite complicated in Lojban.
One of the bigger problems of Lojban is that it was designed around having words for a specific list of known concepts. Part of what a good language allows is making up new words to describe new concepts. All good science involves making up new terms to describe observed phenomena.
Ideally, you have a way to easily create new terms that are understood.
If you take English, for example, you have existing pairs like see/watch and hear/listen, but you can’t easily get the same distinction for a word like think. When learning to meditate, that distinction is useful: doing think/see is okay, but think/watch is to be avoided. I know a person who said that because Esperanto made it easy to coin a word for the new concept, he was able to have a conversation in which meditation started to make sense to him when he previously hadn’t gotten the point.
By the same token, English has student and teacher, which is similar to child and parent, but it has no easy way to say the equivalent of sibling for the first pair, though one exists for the second. There’s also no equivalent of cousin. You could design a language with a structure that makes it easy to extend all sorts of other contexts in a similar way.
Similar to those relations, I think that spatial concepts could be handled a lot better.
If a language is bad for what you want to talk about, you run into a lot of motte-and-bailey issues. It’s a sign of a good language when you are able to be precise and clarify what you mean. The way English overloads “to feel” makes it really hard to speak well about a lot of distinctions. I don’t have a good way to ask about feel!physical sensation, feel!emotion and feel!mood (and the see/watch distinction for each one...).
When it comes to keeping people from misunderstanding each other, it helps if a person who mishears a single phoneme doesn’t end up hearing another existing word that means something completely different. In computer science, there’s the concept of error-correcting codes, which make messages resilient against errors. A fully a priori language in particular can plan how to use the available combinations of phonemes, assigning words in a way that’s resilient to a few errors.
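To make the error-correcting idea concrete, here is a minimal sketch. It assumes that each letter stands for exactly one phoneme and that all words have the same length; the two word lists are invented for illustration and come from no actual language.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(lexicon):
    """Smallest pairwise distance between any two words in the lexicon."""
    return min(hamming(a, b) for a, b in combinations(lexicon, 2))


def confusable_pairs(lexicon, threshold=1):
    """Pairs of words that a few misheard phonemes could turn into each other."""
    return [(a, b) for a, b in combinations(lexicon, 2)
            if hamming(a, b) <= threshold]


# Hypothetical lexicons: one mishearing turns "pata" into "bata" or "pati".
fragile = ["pata", "bata", "pati"]
# Here every pair of words differs in at least three positions.
robust = ["pata", "biti", "kuso"]

print(min_distance(fragile), confusable_pairs(fragile))
print(min_distance(robust), confusable_pairs(robust))
```

As with error-correcting codes in general, a minimum distance of 2 lets a listener notice a single error, and a minimum distance of 3 lets them recover the intended word by picking the nearest valid one.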
Anaphora is super complicated, and I’ve thought long and hard about how to express it. Each loglang has its own ways of dealing with anaphors. Yes, you are correct that Lojban anaphora is poorly designed. There’s the ko’V series, the vo’V series, goi, the letteral series... it’s really bad.
Most people use a variant of the ko’V series. How it works is that you bind a variable to ko’a (or the others in the series), and then when you repeat “ko’a”, it recalls the bound variable. The extremely big issue with this is that it requires forethought. It’s fine when you’re writing, but when you’re speaking, you don’t necessarily know whether or not you’ll need to refer back to something you said before. You could simply repeat the words, and context plus good faith/Grice’s Maxims usually means the listener can safely assume you meant to refer to the same thing, but you didn’t state it explicitly. Very unloglangy.
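The binding behaviour described above can be pictured as a lookup table. This is only a toy model of the idea, not Lojban grammar: the class name `BoundPronouns` and its methods are hypothetical, and the referents are plain strings.

```python
class BoundPronouns:
    """Toy model: a pronoun only works if it was explicitly bound earlier."""

    def __init__(self):
        self.bindings = {}

    def bind(self, pronoun, referent):
        # The speaker has to decide in advance that a referent will be reused.
        self.bindings[pronoun] = referent

    def resolve(self, pronoun):
        if pronoun not in self.bindings:
            raise LookupError(f"{pronoun} was never bound")
        return self.bindings[pronoun]


speech = BoundPronouns()
speech.bind("ko'a", "the teacher")
print(speech.resolve("ko'a"))
```

The forethought problem shows up as the `LookupError`: anything the speaker did not bind at first mention cannot be recalled later.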
Toaq anaphora is also not good. The new Toaq anaphor system is such that all arguments are classified into several classes: animate entity (really Toaq? Animacy distinction?), inanimate entity, abstract entity, adjectives, clauses, LU-clauses, genitives, personal pronouns and demonstratives. Each class has its own pronoun, and each pronoun refers to the closest argument that fulfills its type. The issue is that if you want to talk about several things which belong to the same class, this type of anaphor becomes unwieldy. The plus side is that it requires no forethought.
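A minimal sketch of the "closest argument of the right class" rule, as described above, might look like this. It is a toy model under the assumption that the discourse is a flat list of (referent, class) pairs; it does not use real Toaq pronouns or grammar, and the function name `resolve_by_class` is made up.

```python
from typing import List, Optional, Tuple


def resolve_by_class(pronoun_class: str,
                     discourse: List[Tuple[str, str]]) -> Optional[str]:
    """Return the most recently mentioned referent of the given class."""
    for referent, referent_class in reversed(discourse):
        if referent_class == pronoun_class:
            return referent
    return None


# Mentions in order of appearance, each tagged with its class.
discourse = [
    ("the teacher", "animate"),
    ("the book", "inanimate"),
    ("the student", "animate"),
]

print(resolve_by_class("animate", discourse))
print(resolve_by_class("inanimate", discourse))
```

No binding step is needed, which is the upside. The downside is visible too: once "the student" has been mentioned, the animate pronoun can no longer reach "the teacher".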
I plan on having a variation of Toaq anaphors, which I’ll discuss in a later chapter.
Creating new words is something that all loglangs encourage. It’s more of an infrastructure issue—Lojban and Toaq both have community dictionaries that anyone can add to (Jbovlaste and Toadua respectively). People can then define new words to talk about what they want to talk about, as they wish. It also saves a lot of effort on the part of the language maker(s).
I distinguish between vagueness and ambiguity. Vagueness is when a word encloses a large volume in semantic space. This is totally fine, and most root words ought to be on the vague side. Ambiguity is when a word encloses disconnected volumes in semantic space. This is unacceptable and should be removed. Consider the vagueness of the word “animal” and the ambiguity of the word “set”.
By the same token, English has student and teacher, which is similar to child and parent, but it has no easy way to say the equivalent of sibling for the first pair, though one exists for the second.
Sorry, I don’t understand what you mean.
Yes, this is something that upset me with Lojban and Eberban and pleased me with Toaq. Lojban usually tries to make particle families have similar forms. This is bad because single-phoneme errors can cause misunderstanding, since particles in the same family would usually take the same places as each other. It’s best to have particles in the same family be phonetically far away, even if it makes it harder to learn. Phonetically close words should be semantically far apart, so that even if point errors occur, context can be sufficient to correct them.
In English, it’s not possible to easily construct a word that refers to “someone who has the same teacher as me” or “someone who reads the same blog as me”.
If you have a word pair like employee and boss, the nearest equivalent of sibling is coworker, but even that doesn’t specifically mean someone who has the same boss as you.
If you create a new language and just try to create words for important concepts like employee, boss, child, parent, student and teacher, which is roughly what Lojban did, you can’t reuse the same structure as easily as you could if you put more thought into identifying the relations that exist and how to systematize them.
If you have a language like English with words like see, watch, hear and listen, and need a term that relates to taste the way listen relates to hear, you can make up a new word. Making up the new word is relatively cheap. The problem is that your listener doesn’t automatically understand the new word. The speaker and listener have to make an effort to learn the new word and can’t just construct it on the fly and be understood.
Lojban made its own words for concepts like north and south instead of taking a more systematic approach. With a more systematic approach you could have something like “X degrees in reference system Y”, where north would be made up of two syllables. One syllable would mean something like “0 degrees on a plane” and the other “cardinal direction”. East is then the syllable for “90 degrees on a plane” plus the syllable for “cardinal direction”. Once you have such a system you can reuse it in different contexts. You can then afford to have words for more than just 0, 90, 180, and 270 degrees.
In aviation practice, people say “there’s another plane at 2 o’clock”, which is quite a complex way to reuse the concept of the clock to get more than just 4 distinctions of direction in a plane. Once you have a system that can be reused, it might become more natural to state your political position as “2 o’clock” on the political compass instead of just speaking one-dimensionally about being left- or right-wing.
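The "angle syllable plus reference-system syllable" scheme can be sketched as follows. Every syllable here is invented purely for illustration, and the 30-degree step (matching the twelve clock positions) is an assumption, not something any existing language specifies.

```python
# Hypothetical angle syllables: index k stands for k * 30 degrees.
ANGLE_SYLLABLES = ["ba", "de", "fi", "go", "ku", "la",
                   "me", "ni", "po", "ru", "sa", "te"]

# Hypothetical reference-system syllables.
FRAME_SYLLABLES = {
    "cardinal": "zo",   # 0 degrees = north
    "heading": "vi",    # 0 degrees = straight ahead of the speaker
    "political": "xu",  # 0 degrees = top of the political compass
}


def direction_word(degrees, frame):
    """Compose a two-syllable word: angle syllable + reference-system syllable."""
    if degrees % 30 != 0:
        raise ValueError("this sketch only covers multiples of 30 degrees")
    return ANGLE_SYLLABLES[(degrees % 360) // 30] + FRAME_SYLLABLES[frame]


def clock_to_degrees(hour):
    """Convert a clock position (1 to 12) into degrees, with 12 o'clock at 0."""
    if not 1 <= hour <= 12:
        raise ValueError("hour must be between 1 and 12")
    return (hour % 12) * 30


north = direction_word(0, "cardinal")
east = direction_word(90, "cardinal")
plane_at_two = direction_word(clock_to_degrees(2), "heading")
politics_at_two = direction_word(clock_to_degrees(2), "political")
print(north, east, plane_at_two, politics_at_two)
```

The point of the design is that adding a new reference system costs one new syllable, after which all twelve directions are available in it without any further vocabulary.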
If you do new science and that gives you a new 2D reference frame, an existing language that provides a powerful way to address individual points lets you think more easily about your new topic of investigation. The language helps you in a way that a language which has not systematized such a mechanism does not.
For a new language to be actually useful, one way is to provide better systematization that makes the language superior when talking about a specific problem domain.
Lojban is designed too much around the idea of wanting to translate what can already be easily expressed in English.
The extremely big issue with this is that it requires forethought. It’s fine when you’re writing, but when you’re speaking, you don’t necessarily know whether or not you’ll need to refer back to something you said before.
This is especially an issue if you have a conversation with someone and don’t know what
Creating new words is something that all loglangs encourage. It’s more of an infrastructure issue—Lojban and Toaq both have community dictionaries that anyone can add to (Jbovlaste and Toadua respectively). People can then define new words to talk about what they want to talk about, as they wish.
No language that actually gets used in practice has people consistently referring to dictionaries to deal with new words. If anyone doing knowledge production has to interface with a dictionary-maker to get his terms approved, that’s wildly impractical.
A good English speaker has access to a few hundred thousand words; you can’t easily learn that many words from a dictionary.
It’s best to have particles in the same family be phonetically far away, even if it makes it harder to learn.
It doesn’t have to make it harder to learn. Let’s say we have numbers:
1: fa
2: ge
3: hi
4: jo
5: ku
6: la
Now, what’s the name for 7? You can derive from the pattern that it’s ‘me’. If you forget the word for a single number you can easily reconstruct it if you understand the general pattern and at the same time you can’t confuse any of the numbers by mishearing a single phoneme.
If you drop Lojban’s idea of making your words derive from existing words, you can create patterns that help with learning related concepts while still keeping phonetic distance.
In English, it’s not possible to easily construct a word that refers to “someone who has the same teacher as me” or “someone who reads the same blog as me”.
I don’t see how it’s useful to make words (i.e. separate lexemes) for these concepts, when they’re better expressed as phrases. The relationship of “parent-child-sibling” (in the genetic sense) is more fundamental than “employee-boss” because the former is immutable. You cannot lose your genetic relation, whereas you can separate from your boss. I also think it’s good that “coworker” doesn’t imply having the same boss—there could be no boss (e.g. a startup with two co-founders). Whereas there cannot be a child without a parent.
I’m more concerned with removing ambiguity from words (in the sense that words that enclose non-continuous spaces in semantic space have to be separated) than I am with trying to figure out how to divide it exactly. Many natural languages make distinctions (and don’t make distinctions) differently than English does. In the same way, you can rederive the relation “has the same boss as me” via phrases using other words rather than creating a word.
No language that actually gets used in practice has people consistently referring to dictionaries to deal with new words. If anyone doing knowledge production has to interface with a dictionary-maker to get his terms approved, that’s wildly impractical.
This is a consequence of the languages being less mature than natural languages. Natlangs have had much more time to build up vocabulary.
Now, what’s the name for 7? You can derive from the pattern that it’s ‘me’.
I cannot see how it’s ‘me’. I can tell the pattern of the vowels: a e i o u. But how is it m?
Phrases take more effort than having words for things. In practice that usually results in people being vaguer about what they mean and in less conversational bandwidth.
Generally, when people are doing new things they need new words. In the poly community, for example, you have people talking about metamours (a metamour is someone who is in a relationship with the same person as you). While it’s possible to express that as a phrase, it’s something that’s important enough to have its own word. In English, a newly made word like this can’t be understood by people who haven’t heard it before.
If, however, you put effort into thinking through the primitives of your language, you can actually easily make words that are understood without having to be learned specifically.
There can be contexts where the ability to have a word for a person who has the same boss is important and contexts where it’s not important to have such a word. A language that makes it easy to have such words when needed is superior when it comes to speaking about new domains of knowledge.
It’s possible that a new language would be superior enough to existing languages in a new domain of knowledge that people prefer to write in it rather than in English.
I cannot see how it’s ‘me’. I can tell the pattern of the vowels: a e i o u. But how is it m?
If you follow the alphabet, m would be the next consonant. My main point here is that you can have structure that gives an order and makes learning easier without depending on the words being phonetically similar.
This is especially true if you reuse the structures.
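The number pattern from the example (consonants in alphabetical order starting at f, vowels cycling through a e i o u) can be written down as a small generator. The function names are hypothetical, and treating each letter as one phoneme is an assumption.

```python
import string

VOWELS = "aeiou"
CONSONANTS = [c for c in string.ascii_lowercase if c not in VOWELS]


def number_word(n, first_consonant="f"):
    """Word for n: the next consonant in the alphabet plus the next vowel in the cycle."""
    if n < 1:
        raise ValueError("n must be at least 1")
    start = CONSONANTS.index(first_consonant)
    consonant = CONSONANTS[(start + n - 1) % len(CONSONANTS)]
    vowel = VOWELS[(n - 1) % len(VOWELS)]
    return consonant + vowel


def one_phoneme_apart(words):
    """Pairs of equal-length words that differ in exactly one position."""
    return [(a, b) for i, a in enumerate(words) for b in words[i + 1:]
            if sum(x != y for x, y in zip(a, b)) == 1]


words = [number_word(n) for n in range(1, 8)]
print(words)
print(one_phoneme_apart(words))
```

Running the check shows a limit of this particular toy pattern: neighbouring numbers always differ in both phonemes, but words five steps apart share a vowel (fa/la, ge/me), so a real design would need a longer vowel cycle or an extra constraint to keep every pair two phonemes apart.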
If, however, you put effort into thinking through the primitives of your language, you can actually easily make words that are understood without having to be learned specifically.
I highly doubt this is true or possible to any meaningful degree. There have already been several conlangs that try this: Lojban is one, with its compounding system, and another is Toki Pona. While it’s definitely possible to have compounds whose meaning is related to their components, in each context a specific component is going to have to be interpreted in its own special way. Again, because of context. You’re going to have to learn something explicitly regardless.
I highly doubt this is true or possible to any meaningful degree.
I gave an example of my friend having an experience where Esperanto already allowed him to have a conversation about meditation that he couldn’t easily have had in English or German, which are the other languages he speaks.
Lojban put little effort into it, as evidenced by having words for individual cardinal directions instead of going for a more systematic approach.
When it comes to family relations and also to things like lover/metamour, you would model them mathematically as a graph plus a context. Systematizing a language allows you to have words for things like metamour that are immediately understood.
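One way to picture the "graph plus a context" model is a labelled graph in which a single generic operation derives sibling-like terms for any relation. This is a minimal sketch: the class `RelationGraph`, the method `co`, and the relation labels are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class RelationGraph:
    """Directed graph whose edges are labelled with a relation (the context)."""

    def __init__(self):
        self._targets = defaultdict(set)  # (relation, source) -> targets
        self._sources = defaultdict(set)  # (relation, target) -> sources

    def add(self, source, relation, target):
        """Record that `source` stands in `relation` to `target`."""
        self._targets[(relation, source)].add(target)
        self._sources[(relation, target)].add(source)

    def co(self, person, relation):
        """Everyone else who shares at least one target of `relation` with `person`."""
        peers = set()
        for target in self._targets[(relation, person)]:
            peers |= self._sources[(relation, target)]
        peers.discard(person)
        return peers


g = RelationGraph()
g.add("Ann", "child_of", "Eve")
g.add("Bob", "child_of", "Eve")
g.add("Cat", "student_of", "Mr. Lee")
g.add("Dan", "student_of", "Mr. Lee")
g.add("Ann", "partner_of", "Pat")
g.add("Sam", "partner_of", "Pat")

print(g.co("Ann", "child_of"))    # sibling
print(g.co("Cat", "student_of"))  # "sibling" in the student/teacher context
print(g.co("Ann", "partner_of"))  # metamour
```

The same `co` operation yields sibling, fellow student and metamour, which is the sense in which one systematic morpheme could cover all of them once the relation is named.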