Ravi Vakil’s advice for potential PhD students includes this bit on “tendrils to be backfilled”, which has stuck with me ever since I read it as a metaphor for deepening understanding over time:
Here’s a phenomenon I was surprised to find: you’ll go to talks, and hear various words, whose definitions you’re not so sure about. At some point you’ll be able to make a sentence using those words; you won’t know what the words mean, but you’ll know the sentence is correct. You’ll also be able to ask a question using those words. You still won’t know what the words mean, but you’ll know the question is interesting, and you’ll want to know the answer. Then later on, you’ll learn what the words mean more precisely, and your sense of how they fit together will make that learning much easier.
The reason for this phenomenon is that mathematics is so rich and infinite that it is impossible to learn it systematically, and if you wait to master one topic before moving on to the next, you’ll never get anywhere. Instead, you’ll have tendrils of knowledge extending far from your comfort zone. Then you can later backfill from these tendrils, and extend your comfort zone; this is much easier to do than learning “forwards”. (Caution: this backfilling is necessary. There can be a temptation to learn lots of fancy words and to use them in fancy sentences without being able to say precisely what you mean. You should feel free to do that, but you should always feel a pang of guilt when you do.)
I don’t think “mathematics [being] so rich and infinite that it is impossible to learn it systematically” is the only reason (or maybe it subsumes the next point; I’m not sure exactly what Vakil meant). I think the other reason is what Bill Thurston pointed out in “On proof and progress in mathematics”:
Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.
In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked.
In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.
Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics?
Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.
People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it.
The classic MathOverflow thread on thinking and explaining, which Thurston himself started, has many memorable examples of what he meant above by “One-on-one, people use wide channels of communication that go far beyond formal mathematical language”. One category of examples that I suspect would especially resonate with the LW crowd is this “adversarial perspective” described by Terry Tao:
One specific mental image that I can communicate easily with collaborators, but not always to more general audiences, is to think of quantifiers in game theoretic terms. Do we need to show that for every epsilon there exists a delta? Then imagine that you have a bag of deltas in your hand, but you can wait until your opponent (or some malicious force of nature) produces an epsilon to bother you, at which point you can reach into your bag and find the right delta to deal with the problem. Somehow, anthropomorphising the “enemy” (as well as one’s “allies”) can focus one’s thoughts quite well. This intuition also combines well with probabilistic methods, in which case in addition to you and the adversary, there is also a Random player who spits out mathematical quantities in a way that is neither maximally helpful nor maximally adverse to your cause, but just some randomly chosen quantity in between. The trick is then to harness this randomness to let you evade and confuse your adversary.
Is there a quantity in one’s PDE or dynamical system that one can bound, but not otherwise estimate very well? Then imagine that it is controlled by an adversary or by Murphy’s law, and will always push things in the most unfavorable direction for whatever you are trying to accomplish. Sometimes this will make that term “win” the game, in which case one either gives up (or starts hunting for negative results), or looks for additional ways to “tame” or “constrain” that troublesome term, for instance by exploiting some conservation law structure of the PDE.
It’s a pity this sort of understanding is harder to convey via text or in lectures than it is one-on-one.
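For what it’s worth, here is a minimal worked instance of Tao’s “bag of deltas” picture. It is my own toy example, not something from the thread: I’m assuming the function f(x) = 3x and an arbitrary point a, chosen only because the winning strategy is easy to write down.

```latex
% Claim (toy example, assumed for illustration): f(x) = 3x is continuous at a.
\[
  \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x :\quad
  |x - a| < \delta \;\Longrightarrow\; |f(x) - f(a)| < \varepsilon .
\]

% Read as a game, in the order of the quantifiers:
%   1. The adversary moves first and picks any \varepsilon > 0.
%   2. You answer from your bag of deltas with \delta = \varepsilon / 3.
%   3. The adversary picks any x with |x - a| < \delta.
% You win if |f(x) - f(a)| < \varepsilon, and with this strategy you always do:
\[
  |f(x) - f(a)| = |3x - 3a| = 3\,|x - a| < 3\delta = \varepsilon .
\]
```

The point of the game reading is the move order: you only have to choose δ after seeing ε, so your strategy is allowed to depend on it, whereas the adversary’s x has to work against a δ that is already fixed.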