Benjamin Kost comments on Open & Welcome Thread—August 2020

Benjamin Kost 7 Aug 2024 16:40 UTC
1 point
0
I don’t expect the jargon filter to work perfectly to explain any concept, but I do expect it to make concepts easier to understand because learning new vocabulary is a somewhat cognitively demanding process, and especially so for some people. Memory works differently for different people, and different people have different confidence levels in their vocabulary skills, so the jargon-heavy sentence you used above, while perfectly fine for communicating with people such as you and me, wouldn’t be good for getting someone less technically inclined to read about math or remember what that sentence means. It’s great that you gave me an example to work with though. I just went to Claude and used the process that I am talking about to give you an example and came back with this:

“Multiplying grids of numbers is a step-by-step process”

Can you see how that would be easier to understand at first glance if you were completely unfamiliar with linear algebra? It also doesn’t require memorizing new vocabulary. The way you put it requires an unfamiliar person to both learn a new concept and memorize new vocabulary at the same time. The way I put it doesn’t perfectly explain it to an unfamiliar person, but it gives them a rough idea that is easy to understand while not requiring that they take in any new vocabulary. Because it is less cognitively demanding, it will feel less daunting to the person you are trying to teach so as not to discourage them from trying to learn linear algebra.

I believe you also hit on something important when you mentioned jargon intended to confuse the reader. I suspect that is why a lot of jargon exists in the first place. Take binomial nomenclature for example. Why are biologists naming things using long words in a dead language? That only serves the purpose of making the information more daunting and less accessible to people with poor vocabulary memorization skills. That seems like elitism to me. It makes people who have capable vocabulary memorization skills feel smarter but is a terrible practice from a pedagogical and communication perspective. That said, I assume the majority of the problem is that the people creating these new words are just bad at naming and aren’t taking pedagogy or best communication practices into consideration, but elitism probably plays a role as well.

When I post in other places, I purposely dumb down my vocabulary because it is better communication practice. I am not going to bother on LW because it probably would be worse communication for my target audience here anyway and it is extra work for me. (For example, I might use the phrase “teaching strategy” instead of the word “pedagogy”.)
- Double 7 Aug 2024 21:39 UTC
  1 point
  0
  Parent
  The translation sentence about matrices does not have the same meaning as mine. Yes, matrices are “grids of numbers”, and yes there’s an algorithm (step by step process) for matrix multiplication, but that isn’t what linearity means.
  
  An operation A is linear iff A(x+y) = A(x) + A(y)
  
  https://orb.binghamton.edu/cgi/viewcontent.cgi?filename=4&article=1002&context=electrical_fac&type=additional#:~:text=Linear operators are functions on,into an entirely different vector.
  
  I asked a doctor friend why doctors use Latin. “To sound smarter than we are. And tradition.” So our words for medicine (and probably similar for biology) are in a local optimum, but not a global optimum. Tradition is a powerful force, and getting hospitals to change will be difficult. Software to help people read about medicine and other needlessly jargon-filled fields is a great idea.
  
  (Putting evolutionary taxonomy information in the name of a creature is a cool idea though, so binomial nomenclature has something going for it.)
  
  You don’t have to dumb down your ideas on LessWrong, but remember that communication is a difficult task that relies on effort from both parties (especially the author). You’ve been good so far. It’s just my job as your debate partner to ask many questions.
  - Benjamin Kost 8 Aug 2024 3:54 UTC
    1 point
    0
    Parent
    I’m glad you like the idea. That was a good catch that I didn’t capture the true meaning of linear very well. I was a little rushed before. That said, your definition isn’t correct either. Though it is true that linear functions have that property, that is merely the additivity property of a linear function which is just the distributive property of multiplication used on a polynomial. I also didn’t see where the linked text you provided even defines linearity or contains the additivity rule you listed. That was a linear algebra textbook chapter though, and I am still glad you showed me it because it reminded me of how I was great at math in college, but not at all because of the textbooks (which were very expensive!). I have rather good reading comprehension and college math textbooks might as well be written in another language. I learned the math 100% from the lectures and used the textbooks only to do the problems in the back and got an A in all 3 Calculus classes I took. I am pretty sure I could write a much easier to understand math textbook and I know it is possible because the software that teaches math isn’t nearly as confusingly worded as the textbooks.
    
    This is how I would keep it as simple as possible and capture more of the original meaning:
    
    Multiplying grids of numbers is a straight-line property process.
    
    That said, point taken regarding math jargon being very challenging to descriptively reword as I suspect it will get a lot harder as the concepts get more complex. The point in my process isn’t to perfectly define the word but to use a descriptive enough word replacement that one’s brain more easily grabs onto it than it does with, for example, Latin terms of absurd length for anatomy like “serratus posterior inferior” which is a muscle I had trouble with recently. Just off the top of my head, I would just call that the lower ribcage stabilizer instead. That gives one a much better idea of where it is and what it does and would be much easier to remember and accurately label on a diagram for a quiz. However, with such abstract concepts like math deals with, this will certainly be very challenging.
    - Double 9 Aug 2024 2:57 UTC
      1 point
      0
      Parent
      The “Definition of a Linear Operator” is at the top of page 2 of the linked text.
      My definition was missing that in order to be linear, A(cx) = cA(x). I mistakenly thought that this property was provable from the property I gave. Apparently it isn’t because of “Hamel bases and the axiom of choice” (ChatGPT tried explaining.)
      
      “Straight-line property process” is not a helpful description of linearity for beginners or for professionals. “Linearity” is exactly when A(cx) = cA(x) and A(x+y) = A(x) + A(y). Describing that in words would be cumbersome. Defining it every time you see it is also cumbersome. When people come across “legitimate jargon”, what they do (and need to do) is to learn a term when they need it to understand what they are reading and look up the definition if they forget.
      I fully support experimental schemes to remove “illegitimate jargon” like medical Latin, biology Latin, and political speak. Other jargon, like that in math and chemistry, is necessary for communication.
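
The two conditions quoted in this comment, A(cx) = cA(x) and A(x+y) = A(x) + A(y), can be spot-checked numerically. The following is only a minimal sketch: the matrix `A`, the vectors `x` and `y`, and the scalar `c` are hypothetical values chosen for illustration, and a passing check on a few inputs is evidence of linearity, not a proof.

```python
def apply(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def add(x, y):
    """Add two vectors entry by entry."""
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    """Multiply every entry of a vector by the scalar c."""
    return [c * a for a in x]

# Hypothetical example values, not taken from the thread.
A = [[2, 1], [0, 3]]
x = [1, 4]
y = [-2, 5]
c = 7

# Additivity: A(x + y) == A(x) + A(y)
additive = apply(A, add(x, y)) == add(apply(A, x), apply(A, y))
# Homogeneity: A(c x) == c A(x)
homogeneous = apply(A, scale(c, x)) == scale(c, apply(A, x))

print(additive, homogeneous)  # True True
```

Integers are used so that the equality comparisons are exact; with floating-point entries a tolerance would be needed.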
      - Benjamin Kost 10 Aug 2024 15:32 UTC
        1 point
        0
        Parent
        I don’t particularly agree about the math jargon. On the one hand, it might be annoying for people already familiar with the jargon to change the wording they use, but on the other hand, descriptive wording is easier to remember for people who are unfamiliar with a term and using an index to automatically replace the term on demand doesn’t necessarily affect anyone already familiar with the jargon. Perhaps this needs to be studied more, but this seems obvious to me. If “linearity” is exactly when A(cx) = cA(x) and A(x+y) = A(x) + A(y), there is no reason “straight-line property” can’t also mean exactly that, but straight-line property is easier to remember because it’s more descriptive of the concept of linearity.
        
        Also, I can see how the shorthand is useful, but you could just say “linearity is when a function has both the properties of homogeneity and additivity” and that would seem less daunting to many new learners to whom that shorthand reads like ancient Greek. I could make more descriptive replacement words for those concepts as well and it might make it even easier to understand the concept of linearity.
        Double 11 Aug 2024 3:12 UTC
        1 point
        0
        Parent
        The math symbols are far better at explaining linearity than “homogeneity and additivity” because in order to understand those words you need to either bring in the math symbols or say cumbersome sentences. “Straight line property” is just new jargon. “Linear” is already clearly an adjective, and “linearity” is that adjective turned into a noun. If you can’t understand the symbols, you can’t understand the concept (unless you learned a different set of symbols, but there’s no need for that).
        
        Some math notation is bad, and I support changing it. For example, f = O(g) is the notation I see most often for Big-O notation. This is awful because it uses ‘=’ for something other than equality! Better would be f ∈ O(g) with O(g) being the set of functions that grow slower than or as fast as g.
        Benjamin Kost 11 Aug 2024 6:00 UTC
        1 point
        0
        Parent
        I’m trying hard to understand your points here. I am not against mathematical notation as that would be crazy. I am against using it to explain what something is the first time when there is an easier way. Bear with me because I am not a math major, but I am pretty sure “a linear equation is an equation that draws a straight line when you graph it” is a good enough explanation for someone to understand the basic concept.
        
        To me, it seems like “A(cx) = cA(x) and A(x+y) = A(x) + A(y)” is only the technical definition because they are the only two properties that every linear equation imaginable absolutely has to have in common for certain. However, suppose I didn’t know that, and I wanted to be able to tell if an equation is linear. Easy. Just graph it, and if the graph makes a single straight line, it’s a linear equation. Suppose I didn’t want to or couldn’t graph it. I can still tell whether it is linear or not by whether or not the slope is constant using y=mx+b, or I could just simply look to see if the variables are all to the power of one and only multiplied by scalar constants. Either of those things can help me identify a linear equation, so why is it that we are stuck with A(cx) = cA(x) and A(x+y) = A(x) + A(y) as the definition? Give me some linear equations and I can solve them and graph them all day without knowing that. I know that for a fact because though I am certain that definition was in some of my math textbooks in college, I never read the textbooks and if my professors ever put that on the board, I didn’t remember it, and I certainly never used it for anything even though I’ve multiplied and divided matrices before and still didn’t need it then either. I only got A’s in those classes.
        
        That’s why I am having trouble understanding why that definition is so important and how it is too wordy to say “a function or equation with a constant slope that draws a single straight line on a graph”. The only reason I can think of is there must be some rare exception that has those same properties but is not a linear equation. Even so, I am fairly certain that homogeneity and additivity could be summed up as “one output per input” and “the distributive property of multiplication is true for the equation/function”. That’s still not that wordy. Let’s pretend for a second that a math professor, instead of using words to do the lecture, read the symbols phonetically and explained everything in shorthand on the board. Would more or fewer people pass the class, in your opinion?
        
        I am also wondering what your definition of jargon is. Jargon has 2 required elements:
        
        The key elements of jargon are:
        
        Specific to a particular context: Jargon is used within a specific industry, profession, or group and may not be easily understood by those outside of that context.
        
        Involves technical terms, acronyms, or phrases that are not part of everyday language.
        
        Straight Line Property doesn’t qualify for the second element which is why I like it. That said, linear isn’t the best example of jargon because it has the word “line” in it which at least gives the reader a clue what it means. I’m not trying to redefine words, I’m merely trying to rewrite them so that they use common language words that give a clue to what they mean because I am certain that leads to better memory retention for the layperson hearing it for the first time and is also less jarring to readers with poor vocabulary skills. This should apply equally to all jargon by the definition I gave. However, giving a clue may be very challenging for some jargon words that describe very abstract and arcane concepts that don’t map well to normal words which is what I initially thought your point was.
        
        The only downside I see to providing an option to automatically replace useful jargon on demand is that it might lead to a more permanent replacement of the words over time which would irritate people already familiar with the jargon. If your point is that it is not useful, then I would like to hear your counterargument to the point I made about memory retention and the jarring cognitive effect on people with poor vocabulary skills. The jarring effect is easily observable and it’s hard for me to imagine that word familiarity and embedded clues don’t help memory retention of vocabulary, but I am open to counter arguments.
        sunwillrise 11 Aug 2024 6:26 UTC
        8 points
        2
        Parent
        
        Either of those things can help me identify a linear equation, so why is it that we are stuck with A(cx) = cA(x) and A(x+y) = A(x) + A(y) as the definition?
        
        I’m not sure what you are referring to here. They certainly cannot always (or even usually) identify a linear equation. Those 2 things are going to be anywhere between useless and actively counterproductive in the vast majority of situations where you deal with potentially linear operations.
        
        Indeed, if A is an n × n matrix of rank anything other than n − 1, the solution space of Ax=0 is not going to be a straight line. It will be a subspace of dimension n − rank(A), which can be made up of a single point (if A is invertible), a plane, a hyperplane, the entire space, etc.
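
To make the dimension count in the paragraph above concrete, here is a small sketch that computes n minus rank(A) for a few 3 × 3 matrices. The helper names `rank` and `solution_dim` and the example matrices are assumptions introduced for illustration; the rank is found by plain Gaussian elimination over exact fractions rather than by any particular library routine.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def solution_dim(matrix):
    """Dimension of the solution space of Ax = 0: columns minus rank."""
    return len(matrix[0]) - rank(matrix)

# Hypothetical 3 x 3 examples covering each case mentioned above.
examples = {
    "invertible (single point)": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "rank 2 (a line)": [[1, 0, 0], [0, 1, 0], [0, 0, 0]],
    "rank 1 (a plane)": [[1, 2, 3], [2, 4, 6], [3, 6, 9]],
    "rank 0 (entire space)": [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
}
for label, m in examples.items():
    print(label, "->", solution_dim(m))
```

Only the rank 2 case, where the rank is n − 1, gives a one-dimensional solution space, that is, a straight line through the origin.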
        
        “A function or equation with a constant slope that draws a single straight line on a graph” only works if you have a function on the real line, which is often just… trivial to visualize, especially in comparison to situations where you have matrices (as in linear algebra). Or imagine you have an operation defined on the space of functions on an infinite set X, which takes two functions f and g and adds them pointwise. This is a linear operator that cannot be visualized in any (finite) number of dimensions.
        
        Bear with me because I am not a math major, but I am pretty sure “a linear equation is an equation that draws a straight line when you graph it” is a good enough explanation for someone to understand the basic concept.
        
        So this is not correct, due to the above, and an important part of introductory linear algebra courses at the undergraduate level is to take people away from the Calc 101-style “stare at the graph” thinking and to make them consider the operation itself.
        
        An object (the operation) is not the same as its representation (the drawing of its graph), and this is a critical point to understand as soon as possible when dealing with anything math-related (or really, anything rationality-related, as Eliezer has written about in the Sequences many times). Even the graph itself, in mathematical thinking, is crucially not the same as an actual drawing (it’s just the set of (x, f(x)), where x is in the domain).
        Benjamin Kost 12 Aug 2024 5:00 UTC
        3 points
        0
        Parent
        Thank you for taking the time to explain that. I never took linear algebra, only college algebra, trig, and calc 1, 2, and 3. In college algebra our professor had us adding, subtracting, multiplying, and dividing matrices and I don’t remember needing those formulas to determine they were linear, but it was a long time ago, so my memory could be wrong, or the prof just gave us linear ones and didn’t make us determine whether they were linear or not. I suspected there was a good chance that what I was saying was ignorant, but you never know until you put it out there and ask. I tried getting AI to explain it, but bots aren’t exactly math whizzes themselves either. Anyway, I now stand corrected.
        
        Regarding the graph vs the equation, that sounds like you are saying I was guilty of reification, but aren’t they both just abstractions and not real objects? Perhaps your point is that the equation produces the graph, but not the other way around?
        Double 18 Aug 2024 17:31 UTC
        1 point
        0
        Parent
        A linear operation is not the same as a linear function. Your description describes a linear function, not operation. f(x) = x+1 is a linear function but a nonlinear operation (you can see it doesn’t satisfy the criteria.)
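
The failure for f(x) = x+1 can be seen with a quick spot check. This is a sketch under stated assumptions: the helper `is_linear_operation` and the sample inputs are hypothetical, and the second function, g(x) = 3x, is added only as a contrasting case that passes.

```python
def is_linear_operation(f, samples, scalars):
    """Spot-check additivity and homogeneity on sample inputs.

    A single failure disproves linearity; passing on a finite
    sample is only evidence, not a proof.
    """
    for x in samples:
        for y in samples:
            if f(x + y) != f(x) + f(y):   # additivity
                return False
        for c in scalars:
            if f(c * x) != c * f(x):      # homogeneity
                return False
    return True

samples = [-2, 0, 1, 5]
scalars = [-1, 0, 3]

def shifted(x):
    return x + 1   # straight-line graph, but f(0) = 1

def scaled(x):
    return 3 * x   # straight-line graph through the origin

print(is_linear_operation(shifted, samples, scalars))  # False
print(is_linear_operation(scaled, samples, scalars))   # True
```

Both graphs are straight lines, but only the one through the origin satisfies the two criteria.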
        
        Linear operations are great because they can be represented as matrix multiplication and matrix multiplication is associative (and fast on computers).
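
A small sketch of what associativity buys here, assuming hypothetical 2 × 2 matrices `A`, `B`, `C` and a column vector `x`: applying B and then A to x gives the same result as applying the single precomputed matrix AB, so a chain of linear operations can be collapsed into one matrix.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical example values, not taken from the thread.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 5]]
x = [[7], [-1]]  # column vector written as a 2 x 1 matrix

# Associativity: (AB)C == A(BC)
grouped_left = matmul(matmul(A, B), C)
grouped_right = matmul(A, matmul(B, C))

# Applying B then A to x equals applying the single matrix AB.
step_by_step = matmul(A, matmul(B, x))
combined = matmul(matmul(A, B), x)

print(grouped_left == grouped_right, step_by_step == combined)  # True True
```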
        
        “some jargon words that describe very abstract and arcane concepts that don’t map well to normal words which is what I initially thought your point was.”
        
        Yep, that’s what I was getting at. Some jargon can’t just be replaced with non-jargon and retain its meaning. Sometimes people need to actually understand things. I like the idea of replacing pointless jargon (eg species names or medical terminology) but lots of jargon has a point.
        
        Link to great linear algebra videos: https://youtu.be/fNk_zzaMoSs?si=-Fi9icfamkBW04xE
        Benjamin Kost 19 Aug 2024 16:14 UTC
        1 point
        0
        Parent
        “Some jargon can’t just be replaced with non-jargon and retain its meaning.”
        
        I don’t understand this statement. It’s possible to have two different words with the same meaning but different names. If I rename a word, it doesn’t change the meaning, it just changes the name. My purpose here isn’t to change the meaning of words but to rename them so that they are easier to learn and remember.
        
        As far as jargon words go, “linearity” isn’t too bad because it is short and “line” is the root word anyway, so to your point, that one shouldn’t be renamed. Perhaps I jumped to meet your challenge too quickly on impulse. I would agree that some jargon words are fine the way they are because they are already more or less in the format I am looking for.
        
        However, suppose the word were “calimaricharnimom” instead of “linearity” to describe the very same concept. I’d still want to rename it to something shorter, easier to remember, easier to pronounce, and more descriptive of the idea it represents so that it would be easier to learn and retain which is the goal of the jargon index filter. All words that aren’t already in that format or somewhat close to it are fair game, regardless of how unique or abstract the concept they represent is. The very abstract ones will be challenging to rename in a way that gives the reader a clue, but not impossible to rename that way, and even if we assume it is impossible for some words, just making them shorter, more familiar looking, and easier to pronounce should help.
        
        All that said, this is an enormous project in itself because it would need to be done for every major language, not just English. It would need to be an LLM/human collaboration wiki project. Perhaps I should establish some guidelines for leaving certain jargon words alone for that project.
        Double 20 Aug 2024 13:31 UTC
        2 points
        0
        Parent
        Yes, it’s possible we were referring to different things by “jargon.” It would be nice to replace cumbersome technical terms with words that have the same meaning (and require a similar level of familiarity with the field to actually understand) but have a clue to their meaning in their structure.
        Benjamin Kost 20 Aug 2024 17:03 UTC
        1 point
        0
        Parent
        I think it’s not only nice, but a necessary step for reducing information asymmetry which is one of the greatest barriers to effective democratic governance. Designing jargon terms to benefit more challenged learners would carry vastly more benefit than designing them to please adept learners. It wouldn’t harm the adept learners in any significant way (especially since it’s optional), but it would significantly help the more challenged learners. Many of my ideas are designed to address the problem of information asymmetry by improving learning and increasing transparency.