I think a lot of this is factual knowledge. There are five publicly available questions from the FrontierMath dataset. Look at the last of these, which is supposed to be the easiest. The solution given is basically “apply the Weil conjectures”. These were long-standing conjectures, a focal point of lots of research in algebraic geometry in the 20th century. I couldn’t have solved the problem this way, since I wouldn’t have recalled the statement. Many grad students would immediately know what to do, and there are many books discussing this, but there are also many mathematicians in other areas who just don’t know this.
In order to apply the Weil conjectures, you have to recognize that they are relevant, know what they say, and do some routine calculation. As I suggested, the Weil conjectures are a very natural subject to have a problem about. If you know anything about the Weil conjectures, you know that they are about counting points of varieties over a finite field, which is straightforwardly what the problems asks. Further, this is the simplest case, that of a curve, which is e.g. what you’d see as an example in an introduction to the subject.
Regarding the calculation, parts of it are easier if you can run some code, but basically at this point you’ve following a routine pattern. There are definitely many examples of someone working out what the Weil conjectures say for some curve in the training set.
Further, asking Claude a bit, it looks like 518±6⋅59+1 are particularly common cases here. So, if you skip some of the calculation and guess, or if you make a mistake, you have a decent chance of getting the right answer by luck. You still need the sign on the middle term, but that’s just one bit of information. I don’t understand this well enough to know if there’s a shortcut here without guessing.
Overall, I feel that the benchmark has been misrepresented. If this problem is representative, it seems to test broad factual knowledge of advanced mathematics more than problem-solving ability. Of course, this question is marked as the easiest of the listed ones. Daniel Litt says something like this about some other problems as well, but I don’t really understand how routine he’s saying that they are, are I haven’t tried to understand the solutions myself.
Not a genius. The point isn’t that I can do the problems, it’s that I can see how to get the solution instantly, without thinking, at least in these examples. It’s basically a test of “have you read and understood X.” Still immensely impressive that the AI can do it!
I think a lot of this is factual knowledge. There are five publicly available questions from the FrontierMath dataset. Look at the last of these, which is supposed to be the easiest. The solution given is basically “apply the Weil conjectures”. These were long-standing conjectures, a focal point of lots of research in algebraic geometry in the 20th century. I couldn’t have solved the problem this way, since I wouldn’t have recalled the statement. Many grad students would immediately know what to do, and there are many books discussing this, but there are also many mathematicians in other areas who just don’t know this.
In order to apply the Weil conjectures, you have to recognize that they are relevant, know what they say, and do some routine calculation. As I suggested, the Weil conjectures are a very natural subject to have a problem about. If you know anything about the Weil conjectures, you know that they are about counting points of varieties over a finite field, which is straightforwardly what the problems asks. Further, this is the simplest case, that of a curve, which is e.g. what you’d see as an example in an introduction to the subject.
Regarding the calculation, parts of it are easier if you can run some code, but basically at this point you’ve following a routine pattern. There are definitely many examples of someone working out what the Weil conjectures say for some curve in the training set.
Further, asking Claude a bit, it looks like 518±6⋅59+1 are particularly common cases here. So, if you skip some of the calculation and guess, or if you make a mistake, you have a decent chance of getting the right answer by luck. You still need the sign on the middle term, but that’s just one bit of information. I don’t understand this well enough to know if there’s a shortcut here without guessing.
Overall, I feel that the benchmark has been misrepresented. If this problem is representative, it seems to test broad factual knowledge of advanced mathematics more than problem-solving ability. Of course, this question is marked as the easiest of the listed ones. Daniel Litt says something like this about some other problems as well, but I don’t really understand how routine he’s saying that they are, are I haven’t tried to understand the solutions myself.
Pulling a quote from the tweet replies (https://x.com/littmath/status/1870560016543138191):