Current theorem provers don’t have a “sense of direction”.
From the description of Polya’s Mathematics and Plausible Reasoning: Vol. II: Patterns of Plausible Inference:
This is a guide to the practical art of plausible reasoning, particularly in mathematics but also in every field of human activity. Using mathematics as the example par excellence, Professor Polya shows how even that most rigorous deductive discipline is heavily dependent on techniques of guessing, inductive reasoning, and reasoning by analogy. In solving a problem, the answer must be guessed at before a proof can even begin, and guesses are usually made from a knowledge of facts, experience, and hunches. The truly creative mathematician must be a good guesser first and a good prover afterward; many important theorems have been guessed but not proved until much later. In the same way, solutions to problems can be guessed, and a good guesser is much more likely to find a correct solution...
Current theorem provers search for a proof, and even contain some ad hoc heuristics for exploring their search trees. (For example, Paulson’s book ML for the Working Programmer ends with a chapter on theorem proving and offers “We have a general heuristic: never apply universal:left or existential:right to a goal if a different rule can usefully be applied.”) However, Polya’s notion of being a good guesser has been formalised as Bayesian reasoning. If a theorem prover, facing a fork in its search tree, uses Bayes’ Theorem to pick the most likely branch first, it may find much deeper results.
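To make “pick the most likely branch first” concrete, here is a minimal sketch of what a Bayes-guided fork could look like. Everything in it (the rule names, the goal feature, the priors and the likelihoods) is invented for illustration and not taken from any existing prover; it only shows the shape of the idea: combine a prior over which rules tend to close goals with a likelihood based on a feature of the current goal, and expand branches in order of the resulting posterior.

```python
# Hypothetical illustration: rank the branches at a fork by a crude posterior
# P(rule leads to a proof | goal feature), computed with Bayes' Theorem from
# hand-supplied priors and likelihoods, and expand the most promising first.

# Prior: how often each rule has closed a goal in past proofs (made-up numbers).
PRIOR = {"conj_intro": 0.4, "impl_intro": 0.35, "forall_left": 0.25}

# Likelihood: P(observed goal feature | rule was the right choice), also made up.
LIKELIHOOD = {
    ("conj_intro", "goal_is_conjunction"): 0.9,
    ("impl_intro", "goal_is_conjunction"): 0.1,
    ("forall_left", "goal_is_conjunction"): 0.2,
}

def posterior(rule, feature):
    """Bayes' Theorem: P(rule | feature) is proportional to P(feature | rule) * P(rule)."""
    unnorm = {r: LIKELIHOOD.get((r, feature), 0.05) * PRIOR[r] for r in PRIOR}
    total = sum(unnorm.values())
    return unnorm[rule] / total

def best_first_order(rules, feature):
    """Order the branches at a fork, most probable first."""
    return sorted(rules, key=lambda r: posterior(r, feature), reverse=True)

print(best_first_order(["forall_left", "impl_intro", "conj_intro"],
                       "goal_is_conjunction"))
# ['conj_intro', 'forall_left', 'impl_intro']
```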
I don’t think there is currently much overlap between those working on theorem proving and those working on Bayesian statistics. There is perhaps even a clash of temperaments between those who seek absolute certainty and those who seek an edge in the messy process of muddling through. Nevertheless I foresee great possibilities of using Bayesian reasoning to guide the internal search of theorem provers, thus giving them a sense of direction.
Reasoning “X hasn’t been progressing recently, therefore X is unlikely to progress in the near future” is good reasoning if that is really all that we know. But it is also weak reasoning, which we should be quick to discard if we see a relevant change looming. The next ten or twenty years might see the incorporation of Bayesian reasoning as a guiding principle for tree-search heuristics inside theorem provers, leading to large advances.
What I question is whether the Bayesian reasoning algorithms are at all computationally feasible for solving nontrivial problems. See the final two paragraphs of my comment here about computational complexity. Do you have evidence in the other direction?
No. I think one typically has to come up with a brutally truncated approximation to actual Bayesian reasoning. For example, if you have n propositions, instead of considering all 2^n basic conjunctions, one’s first idea is to assume that they are all independent. Typically that is a total failure; the independence assumption abolishes the very interactions that were of interest. So one might let proposition n depend on proposition n-1 and reinvent Markov models.
I don’t see much hope of being able to anticipate which, if any, crude approximations to Bayesian reasoning are going to work well enough. One just has to try them and see. I don’t think that my comment goes any deeper than saying that there are lots of nearly practical things due to be tried soon, so I expect one or two pleasant surprises.
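For scale on the 2^n point above, here is a tiny back-of-the-envelope sketch (my own illustration, binary propositions, parameter counts only) of how many probabilities each modelling assumption needs:

```python
# Number of probabilities needed to specify a distribution over n binary
# propositions, under each modelling assumption mentioned above.

def full_joint_params(n):
    # One probability per basic conjunction, minus one for normalisation.
    return 2 ** n - 1

def independent_params(n):
    # One marginal probability per proposition.
    return n

def markov_chain_params(n):
    # P(x_1), plus P(x_k | x_{k-1}) for each of the n-1 later propositions
    # (two conditional probabilities each).
    return 1 + 2 * (n - 1)

for n in (10, 20, 30):
    print(n, full_joint_params(n), independent_params(n), markov_chain_params(n))

# n=10:        1023 vs 10 vs 19
# n=20:     1048575 vs 20 vs 39
# n=30:  1073741823 vs 30 vs 59
```

The exponential column is what makes exact Bayesian reasoning infeasible; the other two columns are what the truncation buys, at the cost of throwing away most of the interactions.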
Bayesian probability theory tells you how to process evidence as well as possible. If you know what counts as evidence of a correct path, you can make it into an ad hoc heuristic more easily than into part of a Bayesian update. It seems the real insight required is to figure out what counts as evidence of a correct path.
You’ve put your finger on a weakness of my optimistic vision. If the guesses are calling it right 90% of the time, they significantly extend the feasible depth of search. But 60:40? Meh! There is a lot of room for the insights to fail to be sharp enough, which turns the Bayesian stuff into CPU-cycle-wasting overhead.
Ok, thanks for the clarification
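To put rough numbers on the 90% versus 60:40 contrast above, here is a toy model (my own simplification, not anything claimed in the thread): treat a proof as a single correct path through d binary forks, assume the guesser puts the correct branch first with probability p at each fork, and ask how often the search walks straight to the proof without backtracking.

```python
import math

def straight_shot(p, d):
    # Probability of reaching the proof with no backtracking:
    # the guess must be right at each of the d forks.
    return p ** d

for p in (0.5, 0.6, 0.9):
    # Depth at which the no-backtracking success rate falls to 50%.
    d_half = math.log(0.5) / math.log(p)
    print(f"p={p}: straight shot at depth 20 = {straight_shot(p, 20):.1e}, "
          f"50% success out to depth {d_half:.1f}")

# p=0.5 (coin flip): 9.5e-07 at depth 20, 50% success only out to depth 1.0
# p=0.6:             3.7e-05 at depth 20, 50% success only out to depth 1.4
# p=0.9:             1.2e-01 at depth 20, 50% success out to depth 6.6
```

On this crude measure a 90% guesser buys several extra levels of reliably searchable depth, while a 60:40 guesser is barely distinguishable from flipping a coin, which is exactly the worry about the overhead not paying for itself.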