More later, but just a brief remark – I think that one issue is that the top ~200 mathematicians are of such high intellectual caliber that they’ve plucked all of the low-hanging fruit, and that as a result mathematicians outside of that group have a really hard time doing research that’s both interesting and original. (The standard that I have in mind here is high, but I think that as one gains perspective one starts to see that superficially original research is often much less so than it looks.) I know many brilliant people who have done such research only once over an entire career.
I am so confused as to why your standard seems to be so absurdly high to me.
Is it because my particular subfield is unusually full of low-hanging fruit? Or because so few of those ~200 top mathematicians work in it?
Is it because I don’t see how “superficially original” all of the work done in my field is? I lack perspective?
Anyway, this is really weird.
I am so confused as to why your standard seems to be so absurdly high to me.
The way in which I operationalize the originality / interest of research is “in 50 years, what will the best mathematicians think about it?” I think that this perspective is unusual amongst mathematicians as a group, but not among the greatest ones. I’d be interested in how it jibes with your own.
Anyway, I think that if one adopts this perspective and takes a careful look at current research using Bayesian reasoning, one is led to the conclusion that almost all of it will be considered to be irrelevant (confidence ~80%).
When I was in grad school, I observed people proving lots of theorems in low dimensional topology that were sort of interesting to me, but it’s also my best guess that most of them will be viewed in hindsight as similar to how advanced Euclidean geometry theorems are today – along the lines of “that’s sort of pretty, but not really worthy of serious attention.”
Is it because I don’t see how “superficially original” all of the work done in my field is? I lack perspective?
How old are you?
When I started grad school, I was blown away by how much the professors could do.
A few years out of grad school, I saw that a lot of the theorems were things that experts knew could be proved using certain techniques, and that proving them was in some sense a matter of the researchers dotting their i’s and crossing their t’s.
And in situations where something seemed strikingly original, the basic idea often turned out to be due to somebody other than the author of a paper (not to say that the author plagiarized – on the contrary, the author almost always acknowledged the source of the idea – but a lot of times people don’t read the fine print well enough to notice).
For example, the Wikipedia page on Paul Vojta reads
In formulating a number of striking conjectures, he pointed out the possible existence of parallels between the Nevanlinna theory of complex analysis, and diophantine analysis. This was a novel contribution to the circle of ideas around the Mordell conjecture and abc conjecture, suggesting something of large importance to the integer solutions (affine space) aspect of diophantine equations. It has been taken up in his own work, and that of others.
I had the chance to speak with Vojta and ask how he discovered these things, and he said that his advisor Barry Mazur suggested that he investigate possible parallels between Nevanlinna theory and diophantine analysis.
Similarly, even though Andrew Wiles’ work on Fermat’s Last Theorem does seem to be regarded by experts as highly original, the conceptual framework that he used had been developed by Barry Mazur, and I would guess (weakly – I don’t have an inside view – just extrapolating based on things that I’ve heard) that people with deep knowledge of the field would say that Mazur’s contribution to the solution of Fermat’s Last Theorem was more substantial than that of Wiles.
The way in which I operationalize the originality / interest of research is “in 50 years, what will the best mathematicians think about it?”
Eegads. How do you even imagine what those people will be like?
I think that this perspective is unusual amongst mathematicians as a group, but not among the greatest ones. I’d be interested in how it jibes with your own.
Sure, I don’t think anyone I know really thinks of their work that way.
Is it because I don’t see how “superficially original” all of the work done in my field is? I lack perspective?
How old are you?
29, graduating in a few months.
A few years out of grad school, I saw that a lot of the theorems were things that experts knew could be proved using certain techniques, and that proving them was in some sense a matter of the researchers dotting their i’s and crossing their t’s.
Yeah, sure, that’s the vast majority of everything I’ve done so far, and some fraction of the work my subfield puts out.
The people two or three levels above me, though, they’re putting out genuinely new stuff on the order of once every three to five years. Maybe not “the best mathematicians 50 years from now think this is amazing” stuff, but I think the tools will still be in use in the generation after mine. Similar to the way most of my toolbox was invented in the ’70s and ’80s.
I don’t understand your concept of originality. It has to be created in a vacuum to be original?
In the counterfactual where Vojta doesn’t exist, does Mazur go on to write similar papers? Is that the problem?
Eegads. How do you even imagine what those people will be like?
Well, if, e.g. you’re working on a special case of an unsolved problem using an ad hoc method with applicability that’s clearly limited to that case, and you think that the problem will probably be solved in full generality with a more illuminating solution within the next 50 years, then you have good reason to believe that work along these lines has no lasting significance.
Sure, I don’t think anyone I know really thinks of their work that way.
Not consciously, but there’s a difference between doing research that you think could contribute substantially to human knowledge and research that you know won’t. I think that a lot of mathematicians’ work falls into the latter category.
This is a long conversation, but I think that there’s a major issue of the publish-or-perish system (together with social pressures to be respectful of one’s colleagues) leading to doublethink. On an explicit level, people think that their own research and their colleagues’ research is interesting, because they’re trying to make the best of the situation, but there’s a large element of belief-in-belief: they don’t actually enjoy doing their work or hearing about their colleagues’ work in seminars. Even when people do enjoy their work, they often don’t know what they’re missing out on by not working on things that they find most interesting on an emotional level.
The people two or three levels above me, though, they’re putting out genuinely new stuff on the order of once every three to five years. Maybe not “the best mathematicians 50 years from now think this is amazing” stuff, but I think the tools will still be in use in the generation after mine. Similar to the way most of my toolbox was invented in the ’70s and ’80s.
This sounds roughly similar to what I myself believe – the differences may be semantic. I think that work can be valuable even if people don’t find it amazing. I also think that there are people outside of the top 200 mathematicians who do really interesting work of lasting historical value – just that it doesn’t happen very often. (Weil said that you can tell that somebody is a really good mathematician if he or she has made two really good discoveries, and that Mordell is a counterexample.) It’s also possible that I’d consider the people who you have in mind to be in the top 200 mathematicians even if they aren’t considered to be so broadly.
I don’t understand your concept of originality. It has to be created in a vacuum to be original?
It’s hard to convey effect sizes in words. The standard that I have in mind is “producing knowledge that significantly changes experts’ Bayesian priors” (whether it be about what mathematical facts are true, or which methods are useful in a given context, or what the best perspective on a given topic is). By “significantly changes” I mean something like “uncovers something that some experts would find surprising.”
In the counterfactual where Vojta doesn’t exist, does Mazur go on to write similar papers? Is that the problem?
I don’t have enough subject matter knowledge to know how much Vojta added beyond what Mazur suggested (it could be that upon learning more I would consider his marginal contributions to be really huge). I guess in bringing up those examples I didn’t so much mean “Vojta and Wiles didn’t do original work – it had already essentially been done by Mazur” as much as “the original contributions in math are more densely concentrated in a smaller number of people than one would guess from the outside,” which in turn bears on the question of how someone should assess his or her prospects for doing genuinely original work in a given field.
I guess in bringing up those examples I didn’t so much mean “Vojta and Wiles didn’t do original work – it had already essentially been done by Mazur” as much as “the original contributions in math are more densely concentrated in a smaller number of people than one would guess from the outside,” which in turn bears on the question of how someone should assess his or her prospects for doing genuinely original work in a given field.
I agree with your assessment of things here, but I do think it’s worth taking a moment to honor people who take correct speculation and turn it into a full proof. This is useful cognitive specialization of labor, and I don’t think it makes much sense to value originality over usefulness.
Well, if, e.g. you’re working on a special case of an unsolved problem using an ad hoc method with applicability that’s clearly limited to that case, and you think that the problem will probably be solved in full generality with a more illuminating solution within the next 50 years, then you have good reason to believe that work along these lines has no lasting significance.
It is really hard to tell when an ad hoc method will turn out many years later to be a special case of some broader technique. It may also be that the special case will still need to be done if some later method uses it for bootstrapping.
“the original contributions in math are more densely concentrated in a smaller number of people than one would guess from the outside”
I’m not sure about this at all. Have you tried talking to people who aren’t already in academia about this? As far as I can tell, they think that there are a tiny number of very smart people who are mathematicians and are surprised to find out how many there are.
It is really hard to tell when an ad hoc method will turn out many years later to be a special case of some broader technique. It may also be that the special case will still need to be done if some later method uses it for bootstrapping.
There are questions of quantitative effect sizes. Feel free to give some examples that you find compelling.
I’m not sure about this at all. Have you tried talking to people who aren’t already in academia about this? As far as I can tell, they think that there are a tiny number of very smart people who are mathematicians and are surprised to find out how many there are.
By “from the outside” I mean “from the outside of a field” (except to the extent that you’re able to extrapolate from your own field).
Yes, and I’m not sure how to measure that.
Feel free to give some examples that you find compelling.
Fermat’s Last Theorem. The proof assumed that p >= 11, and so the ad hoc cases from the 19th century were necessary to round it out. Moreover, the attempt to extend those ad hoc methods led to the entire branch of algebraic number theory.
Primes in arithmetic progressions: much of what Tao and Greenberg did here took earlier methods that were somewhat ad hoc and extended them in a deep, systematic way. In fact, one can see a large fraction of modern work that touches on sieves as taking essentially ad hoc sieve techniques and generalizing them.
The proof assumed that p >= 11, and so the ad hoc cases from the 19th century were necessary to round it out. Moreover, the attempt to extend those ad hoc methods led to the entire branch of algebraic number theory.
I don’t recall Wiles’ proof assuming that p >= 11 – can you give a reference? I can’t find one quickly.
The n = 3 and 4 cases were proved by Euler and Fermat. It’s prima facie evident that Euler’s proof (which introduced a new number system with no historical analog) points to the existence of an entire field of math. I find this less so of Fermat’s proof as he stated it, but Fermat is also famous for the obscurity of his writings.
I don’t know the history around the n = 5 and n = 7 cases, and so don’t know whether they were important to the development of algebraic number theory, but exploring them is a natural extension of the exploration of new kinds of number systems that Euler had initiated.
They were subsumed by Kummer’s work, which I understand to have been motivated more by a desire to understand algebraic number fields and reciprocity laws than by Fermat’s last theorem in particular. For this, he developed the theory of ideal numbers, which is very general.
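For concreteness, here is a standard modern reconstruction (not Euler’s own notation, and only a sketch) of the number system his n = 3 argument introduced. The proof factors

\[
x^3 + y^3 = (x + y)\left(x^2 - xy + y^2\right),
\qquad
x^2 - xy + y^2 = \left(x - \tfrac{y}{2}\right)^2 + 3\left(\tfrac{y}{2}\right)^2,
\]

and treats quantities of the form $a^2 + 3b^2$ as norms of “numbers” $a + b\sqrt{-3}$. The descent then turns on when such a quantity can be a perfect cube, which, on the usual reading, implicitly assumes unique-factorization-style properties of this new number system; it is exactly this kind of issue that Kummer’s ideal numbers later handled in general.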
Primes in arithmetic progressions: much of what Tao and Greenberg did here took earlier methods that were somewhat ad hoc and extended them in a deep, systematic way. In fact, one can see a large fraction of modern work that touches on sieves as taking essentially ad hoc sieve techniques and generalizing them.
Ben Green, not Greenberg :-).
Sure, but the ultimate significance of the work remains to be seen. Of course, tastes vary, and there’s an element of subjectivity, but I think that we can agree that even if there’s a case for the proof being something that people will find interesting in 50 years, the prior in favor of it is much weaker than the prior in favor of this being the case for, e.g., the Gross-Zagier formula.
I don’t recall Wiles’ proof assuming that p >= 11 – can you give a reference? I can’t find one quickly.
I think this is in the original paper showing that modularity implies FLT, but I’m on vacation and don’t have a copy available to check. Does this suffice as a reference?
Ben Green, not Greenberg
Yes, thank you.
They were subsumed by Kummer’s work, which I understand to have been motivated more by a desire to understand algebraic number fields and reciprocity laws than by Fermat’s last theorem in particular. For this, he developed the theory of ideal numbers, which is very general.
Sure, but Kummer was aware of the literature before him, and almost certainly used those earlier results to guide him.
Sure, but the ultimate significance of the work remains to be seen. Of course, tastes vary, and there’s an element of subjectivity, but I think that we can agree that even if there’s a case for the proof being something that people will find interesting in 50 years, the prior in favor of it is much weaker than the prior in favor of this being the case for, e.g., the Gross-Zagier formula.
Agreement there may depend very strongly on how you unpack “much weaker”, but I’d be inclined to agree with at least “weaker”, without the “much”.
The way in which I operationalize the originality / interest of research is “in 50 years, what will the best mathematicians think about it?”
How good do you consider past mathematicians to have been at judging what would be of interest 50 years down the line?
Do you think that 50 years ago mathematicians understood the significance of all findings made at the time that turned out to be significant?
“in 50 years, what will the best mathematicians think about it?”
How do you make a priori judgments on who the best mathematicians are going to be? In your opinion, what qualities/achievements would put someone in the group of best mathematicians?
Anyway, I think that if one adopts this perspective and takes a careful look at current research using Bayesian reasoning, one is led to the conclusion that almost all of it will be considered to be irrelevant (confidence ~80%).
How different would your deductions be if you were living in a different time period? How much does that depend on the areas in mathematics that you are considering in that reasoning?