I am not sure for how many people this is true, but my own bad-at-mathness is largely about being bad at reading really terse, dense, succinct text, because my mind is used to verbose text and thus to filtering out half of it or not really paying close attention.
I hate the living guts out of notation, Greek variables or single-letter variables. Even Bayes’ theorem is too terse, too succinct, too information-dense for me. I find it painful that in something like P(B|A) all three bloody letters mean a different thing. It is just too zipped. I would far prefer something more natural-language-like, such as Probability( If-True (Event1), Event2) (this looks like software code, and for a reason).
This is actually a virtue when writing programs: I am never the guy who uses single-letter variables; my programs are always like MarginPercentage = DivideWODivZeroError((SalesAmount - CostAmount), SalesAmount) * 100. Never too succinct, always clearly readable.
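To make that concrete, here is a minimal Python sketch of what I mean (the exact behaviour of DivideWODivZeroError, returning 0 when the denominator is zero, is just an assumption for illustration):

    # Minimal sketch: a safe-division helper with a deliberately verbose name.
    # Returning 0 on division by zero is an assumption made for illustration.
    def DivideWODivZeroError(numerator, denominator):
        if denominator == 0:
            return 0
        return numerator / denominator

    SalesAmount = 1200.0
    CostAmount = 900.0
    MarginPercentage = DivideWODivZeroError(SalesAmount - CostAmount, SalesAmount) * 100
    print(MarginPercentage)  # 25.0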
Let’s stick with Bayes’ theorem. My brain is screaming: don’t give me P, A, and B. Give me “proper words” like Probability, Event1, and Event2, so that my mind can read “Pro...”, then zone out and rest while reading “bability”, and turn back on again with the next word.
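Roughly something like this, a minimal Python sketch with purely illustrative names of my own:

    # Bayes' theorem spelled out with long names instead of P, A and B.
    # P(Event1 | Event2) = P(Event2 | Event1) * P(Event1) / P(Event2)
    def probability_of_event1_given_event2(probability_of_event2_given_event1,
                                           probability_of_event1,
                                           probability_of_event2):
        return (probability_of_event2_given_event1
                * probability_of_event1
                / probability_of_event2)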
This is basically an inability to focus really 100%: I need the “fillers”, the low information density of natural-language text, so my brain can zone out and rest for fractions of a second, and I find notation too dense and too terse when losing a single letter means not understanding the problem.
This is largely a redundancy problem. Natural language is redundant: you can say “probably” as “prolly” and people still understand it, so your mind can zone out during half of a text and you still get its meaning. Math notation is highly non-redundant; miss one single tiny itty-bitty letter and you don’t understand a proof.
So I guess I could be better at math if there were an inflated, more redundant, more natural-language-like version of it, without single-letter variables.
I guess programming fills that gap well.
I figure Scott does not like terse, dense notation either; however, he seems to be good at doing the work of inflating it into something more readable for himself.
I guess I am not reinventing warm water here. There is probably a reason why a programmer would more likely write Probability(If-True(Event1), Event2) than P(A|B): it is more understandable for many people. I guess it should be part of math education to learn to cope with the denser, terser, less redundant notation. I guess my teachers did not really manage to impart that to me.
Math notation is optimized for doing math, not learning math. Once you’ve internalized what P(A|B) is, you know what it means at a glance, and when you look at a large equation you’re more interested in the structure of the whole thing than in the identities of its constituents (because abstracting the details away and getting results based only on structure is what algebra is).
Perhaps math really needs multiple “programming languages”. One for teaching, one for higher level stuff...
People that do novel math often invent their own notation—it’s sort of like domain-specific languages written on top of LISP.
Teaching-oriented programming languages haven’t proven to be a popular idea; e.g., MIT and Berkeley moved their famous introductory CS courses from Scheme to Python (because Python is actually used, even though it is a much less elegant language).
I think historically math started with longer variables, but it wasn’t so convenient. Compare:
3x^2 + 4x + 1 = 0
with
three times the square of a value, and four times the value, and one, equals nothing
The latter may be easier to read, but the former is easier to divide by x+1.
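For example, in the symbolic form you can see almost at a glance that 3x^2 + 4x + 1 = (x + 1)(3x + 1), so dividing by x + 1 leaves 3x + 1; try carrying out that division on the verbal version.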
https://en.wikipedia.org/wiki/Variable_(mathematics)#Genesis_and_evolution_of_the_concept
Also, being very frugal with token length seems to be a thing going back to the 1960s; see Unix, e.g. “ls -l” instead of the far more human-eye-friendly “list -long”. I don’t exactly understand why, but apparently readability wasn’t really a priority until about, say, 1995, when more and more programmers said fsck Perl with its unreadably frugal letter soup and moved to stuff like Python, where things are expressed in actual words.
I guess there are good reasons behind it. I still don’t have to like it.
To get the link
https://en.wikipedia.org/wiki/Variable_(mathematics)#Genesis_and_evolution_of_the_concept
use the following code in your comment:
    [https://en.wikipedia.org/wiki/Variable\_(mathematics)#Genesis\_and\_evolution\_of\_the\_concept](https://en.wikipedia.org/wiki/Variable_(mathematics\)#Genesis_and_evolution_of_the_concept)
See Comment formatting/Escaping special symbols on the wiki for more details (I’ve backslash-escaped underscores _ in the text part of the link to avoid their turning surrounding texts into italics, and the closing round bracket in the URL part of the link to avoid its interpretation as the end of the URL).
Hm, interesting. I have an aversion to what I see as fluffy, low-info-density content, and I have a hard time pushing my brain into “high gear” so I can just skim through it. I do think I can shift gears, but it seems to take a few months to change my preferred reading mode.
It’s interesting to speculate what math notation would be like if there were competing math notation schemes the same way there are competing programming languages; arguably math notation is terrible when judged by the standards of programming languages. reddit thread
I think this gear thing may be a strong difference between STEM and humanities orientation. (I work more or less in STEM, but just to pay the bills; I am more of a hobby historian and suchlike at heart.) Taking everything literally and liking high-density content is more of a STEM thing, while skimming the fluff and focusing on intended meaning instead of literal meaning is more of a humanities thing.
This is why I tend to insist that programming should not be called software engineering. Programs are written primarily for people to read, and only secondarily for computers to execute, and thus programming floats somewhere in between STEM-type precision and humanities-type good, readable writing. In my experience programmers don’t exactly have the highly precise, literalist minds of, say, mechanical engineers, nor the “there are multiple viewpoints” overly fluffy humanities angle; they are somewhere in between, something more like a craft than engineering or philosophy.
BTW a sad reminder of how good Reddit used to be. Thanks.
(This is entirely off-topic, but is there a way to stop this fluctuation of subcultures? Large websites count as subcultures the same way musical styles do. Usually a subculture is started by more high-brow people, and as it gets popularized and more low-brow people move in, the starters move on to the next one. I strongly suspected that the idea of subreddits might be the killer feature that stops this. Alas, no: even /r/insightfulquestions is insightful only at a high-school-debating-club level. From another angle, it is not merely an IQ-based in- and out-migration; it is also age-based.)
I agree strongly. IMO, this is a significant part of why tutoring works. The tutor “translates” the terse explanations into easier natural language ones.
Additionally, I think this is partly why the mathematics programs I’ve been exposed to tend to involve a lot of people hanging around each other socially, more so than is common for other majors. I think even the mathematics majors mainly rely on translations from older students, at least until they progress past a certain point.
That resembles what happens to me when I have to ‘read some math’. I hated integrals and multivariate equations because solving them always took so long, and somewhere halfway through I invariably began thinking ‘okay, we did this and then broke off this piece to pretend it’s actually like that, though we’ll have to weld it back on someday… is there really no meaning to what we’ve already done, except that in some distant future we’ll write the final “=”?’
In high school physics, when they gave you an equation, you sure could use it as a kind of lever to turn the model Earth.
Most of the notation was developed for people doing math rather than learning it. If you’re doing a lengthy calculation, you probably want to cut down on the number of unnecessary symbols. With something like conditional probability, “P(A|B)” makes perfect sense if you’re going to be writing some variant of it many times over the course of many pages.
Or let’s take the Einstein summation convention. Mathematics already had a compact notation for writing sums, but even this proved to be too cumbersome for the tastes of people doing differential geometry in coordinates. This led Einstein to introduce a convention wherein a repeated index denoted summation.
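For example, under that convention the expression a_i b^i, with the index i appearing once as a subscript and once as a superscript, is simply read as the sum over i of a_i b^i (a_1 b^1 + a_2 b^2 + a_3 b^3 in three dimensions), with no summation sign written at all.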
As you pointed out, this tendency toward brevity leads to some very dense writing that can be difficult to interpret if you’re not already used to it. It’s good for working, bad for pedagogy.
Compare shorthand writing. It’s great for writing faster, but I can’t imagine reading a novel written in shorthand.
I’m sympathetic, something very similar is true of me. Thanks for making your thoughts explicit—it’ll be helpful for me when I try to explain the situation to others.
I have ADHD and read a lot (and read more as a kid), so this definitely is interesting to me. Then again, I also like compression at the aesthetic level, yet find it quite difficult to learn/use. I too have to translate Bayes’ Theorem! (I found the version I like best is the one where the letters are “O”, “H”, and “E”, for observation, hypothesis, and evidence.)
People do think of conditional probabilities that way; one just has to be careful not to conflate it with p(IF event1 THEN event2), which is the probability of a disjunction and quite different from p(event2 | event1). Conditioning is kind of weird; Simpson’s paradox is a paradox because we are really bad at processing what conditioning actually means.
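A toy example: roll a fair die, with event1 = “the roll is even” and event2 = “the roll is a six”. Then p(event2 | event1) = 1/3, while p(IF event1 THEN event2) = p(NOT event1 OR event2) = p(odd or six) = 4/6 = 2/3, so the two really do come apart.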
I find that what helps for me is re-writing maths as I’m learning it.
When I glance at an equation or formula (especially an unfamiliar one), I usually can’t take it in because my mind is trying to grasp it all at once. I have to force myself to scan it slowly, either by re-writing it, by writing out its definition, or by holding a ruler under it and scanning one symbol at a time.
Then again, I’m currently studying for a postgraduate degree in maths, and I’m not someone who has ever considered themselves ‘bad at math’.
I myself think that a large part of why I am bad at math is that I have spent very little time really trying to do math, because this effect makes it genuinely mentally painful to do.
This makes me sad, because even without accomplishment, I feel as if my reasoning ability is “merely above average”, and I have no apparent way of leveraging that besides hacking it into making me seem more verbally intelligent (a lame result, in my opinion).
However I’m not a programmer either, yet.
I almost immediately start glazing over when I see some math in a text and it takes real effort to make myself pull it apart.
I find that I almost always understand an algorithm when implemented in a programming language with much less effort than if I’m reading it in some mathematical notation.
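As a minimal sketch of the kind of thing I mean, compare the sample variance as a sigma formula with the same thing as a few lines of code:

    # Sample variance: in notation, (1 / (n - 1)) * sum over i of (x_i - mean)^2;
    # as code, just a mean and a loop over the squared deviations.
    def sample_variance(values):
        n = len(values)
        mean = sum(values) / n
        squared_deviations = [(x - mean) ** 2 for x in values]
        return sum(squared_deviations) / (n - 1)

    print(sample_variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))  # ~4.57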