Some thoughts, partly cached and partly inspired by the above:
Modern algebraic notation (single-letter variables and the like) is very useful for symbol manipulation. Imagine if you had to write statements like “The area of the square whose side is the hypotenuse is equal to the sum of the areas of the squares whose sides are the two legs” instead of “a^2+b^2=c^2”, and then had to derive, e.g., the quadratic formula that way. Like doing long division with Roman numerals, as someone put it.
Note that, in the above example, “a^2+b^2=c^2” requires context explaining what a, b, and c are. Shorter notation is to some extent necessarily more ambiguous: fewer symbols means fewer distinct combinations of them, so context is the only way to make those combinations refer uniquely to a larger set of referents.
Pedagogy presumably optimizes for readability and minimal ambiguity for young ignorant minds, as well as for the gradeability of the written output they produce. Therefore, it’d probably be biased against introducing ambiguous notation. Not that teachers never teach it: obviously they do teach algebraic notation, since its merits are too great to avoid. But there might be cases where a notation would be useful to the serious student and the pedagogues choose a more cumbersome one anyway, because their interests differ.
Published papers, meanwhile, probably optimize for readability and rigor. I do believe it’s common for them to introduce complicated definitions so as to make certain statements short… But also, I think paper writers have a strong incentive to prove their result rigorously but a weaker incentive to explain the mental models (and notations for playing with them) that led to their insight, if they’re different from the shortest-path-to-victory proof (which I think is not rare). Bill Thurston (Fields medalist) wrote about this, e.g.:
“We mathematicians need to put far greater effort into communicating mathematical ideas. To accomplish this, we need to pay much more attention to communicating not just our definitions, theorems, and proofs, but also our ways of thinking. We need to appreciate the value of different ways of thinking about the same mathematical structure.
We need to focus far more energy on understanding and explaining the basic mental infrastructure of mathematics—with consequently less energy on the most recent results. This entails developing mathematical language that is effective for the radical purpose of conveying ideas to people who don’t already know them.”
Therefore, what you receive, in education and in papers you read, probably has less in the way of “finding the optimal notation for symbol manipulation” than would best serve you.
To compensate for this, you should perhaps err on the side of asking “Is there some more convenient way to write this, even if it doesn’t generalize?” For example, on a small scale, when I’ve found myself writing out a series of “f(x,y) = z” statements (with constant f), I might turn them into “x,y z”; I also sometimes use spacing rather than parentheses to indicate order of operations. In programming, this might correspond to having a bunch of aliases for commonly used commands (“gs” for “git status”; “gg” for a customized recursive-grep pipeline), or, on the more extreme side, writing a domain-specific language and then using that language to solve your problem.
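As a rough illustration of the “x,y z” shorthand, here is a minimal Python sketch. It assumes integer values and one statement per line; the names parse_shorthand and expand are hypothetical, made up for this example. The first function reads the compact notes, and the second writes them back out in full for a reader who lacks the context.

```python
# Hypothetical shorthand: each line "x,y z" stands for the statement f(x, y) = z.
# Assumption for this sketch: x, y and z are all integers.

def parse_shorthand(text):
    """Turn lines like '3,4 7' into a dict such as {(3, 4): 7}."""
    table = {}
    for line in text.strip().splitlines():
        args, result = line.split()          # "3,4" and "7"
        x, y = (int(a) for a in args.split(","))
        table[(x, y)] = int(result)
    return table

def expand(table, name="f"):
    """Write each entry back out in the long form 'f(x,y) = z'."""
    return [f"{name}({x},{y}) = {z}" for (x, y), z in table.items()]

notes = """
1,2 3
2,2 4
3,4 7
"""

table = parse_shorthand(notes)
for statement in expand(table):
    print(statement)
```

The expand step is the cleanup job in miniature: the shorthand stays private and quick to write, and the long form is only generated when someone else needs to read it.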
Of course, there are drawbacks. It may be more difficult to show a random person your work if it looks like the above; if you get pulled away and forget the context, you might be unable to decipher your own scribblings; when you need to write up your results, you will have a bigger cleanup job to do. However, each of these can be solved, and one might remark that “taking longer to explain your insights” is a better problem to have than “having fewer insights to explain”.