I think it’s worth distinguishing how hard it is for a Lean programmer to write the solution, how hard it is to solve the math problem in the first place, and how hard it is to write down an ML algorithm that spits out the right Lean tactics.
Like, even if something can be written in a compact form, there might be only a dozen combinations of ~10 tokens that give a correct solution like nlinarith (b - a), …, where by “token” I count “nlinarith”, “sq_nonneg”, “b”, “-”, “a”, etc. The actual search space for something of length 10 is probably ~(grammar size)^10, i.e. anywhere from 10^10 to 10^20 candidate sequences for a grammar of size 10-100. (Note: I don’t know how traditional solvers perform on statements of that size; it may not be that hard for them.)
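For concreteness, here is a minimal sketch of the kind of one-line proof being discussed (my own toy example, not the actual model output): a single nlinarith call, with sq_nonneg (a - b) passed as a hint term.

```lean
import Mathlib.Tactic

-- Illustrative only: a toy inequality closed by one nlinarith call.
-- The hint sq_nonneg (a - b), i.e. 0 ≤ (a - b)^2, is exactly the kind
-- of short token sequence the search has to guess.
example (a b : ℝ) : 2 * a * b ≤ a ^ 2 + b ^ 2 := by
  nlinarith [sq_nonneg (a - b)]
```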
I agree that traditional methods work well for algebraic problems where proofs are short, and that an AI doing search with nlinarith seems “dumb”, but the real question here is whether OAI has found a method to solve such problems at scale.
As you said, the one-liner is not really convincing, but the multi-step solution, introducing an intermediate lemma in the middle, seems like a general construction for solving all algebraic problems, and even more. (Though they do mention how the infinite action space and the absence of self-play limit scaling in general.)
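A hedged sketch of the multi-step shape I mean (again my own toy example, not OpenAI’s output): the prover states intermediate facts with have, discharges them, and then uses them to close the original goal.

```lean
import Mathlib.Tactic

-- Illustrative only: a multi-step proof that introduces intermediate
-- lemmas mid-proof before closing the goal with a linear combination.
example (a b : ℝ) : 0 ≤ a ^ 2 * b ^ 2 + (a - b) ^ 2 := by
  have h1 : 0 ≤ a ^ 2 * b ^ 2 := mul_nonneg (sq_nonneg a) (sq_nonneg b)
  have h2 : 0 ≤ (a - b) ^ 2 := sq_nonneg (a - b)
  linarith
```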
I do agree with the general impression that it’s not a huge breakthrough. To me, it’s mostly an update like “look, two years after GPT-f, it’s still hard, but we can now prove theorems that require multiple steps with transformers!”.
I think we’re basically in agreement here (and I think your summary of the results is fair; I was mostly pushing back on a too-hyped tone coming from OpenAI, not from you).