I read a comment in this thread by Armok_GoB, and it reminded me of some machine-learning angles you could take on this problem. Forgive me if I make a fool of myself on this, I’m fairly rusty. Here’s my first guess as to how I’d solve the following:
open problem: the tradeoff of searching for an exact solution versus having a good approximation
Take a bunch of proven statements, and look at half of them. Generate a bunch of possible heuristics, and score them based on how well they predict the other half of the proven statements given the first half as proven. Keep track of how long it takes to apply a heuristic. Use the weighted combination of heuristics that worked best on known data, given various run-time constraints.
With a table of heuristic combinations and their historical effectiveness and computational time, and the expected value of having accurate information, you can quickly compute the expected value of running the heuristics. Then compare it against the expected computation time to see if it’s worth running.
Finally, you can update the heuristics themselves whenever you decide to add more proofs. You can also check short run-time heuristics with longer run-time ones. Things that work better than you expected, you should expect to work better.
Oh, and the value-of-information calculation I mentioned earlier can be used to pick up some cheap computational cycles as well—if it turns out that whether or not the billionth digit of pi is “3” is worth $3.50, you can simply decide to not care about that question.
And to be rigorous, here are the hand-waving parts of this plan:
Generate heuristics. How? I mean, you could simply write every program that takes a list of proofs, starting at the simplest, and start checking them. That seems very inefficient, though. There may be machine learning techniques for this that I simply have not been exposed to.
Given a list of heuristics, how do you determine how well they work? I’m pretty sure this is a known-solved problem, but I can’t remember the exact solution. If I remember right it’s something along the lines of log-difference, where getting something wrong is worth more penalty points the more certain you are about it.
Given a list heuristics, how do you find the best weighted combinations under a run-time constraint? This is a gigantic mess of linear algebra.
And another problem with it that I just found is that there’s no room for meta-heuristics. If the proofs come in two distinguishable groups that are separately amenable to two different heuristics, then it’s a really good idea to separate out these two groups and applying the better approach for that group. My approach seems like it’d be likely to miss this sort of insight.
I read a comment in this thread by Armok_GoB, and it reminded me of some machine-learning angles you could take on this problem. Forgive me if I make a fool of myself on this, I’m fairly rusty. Here’s my first guess as to how I’d solve the following:
Take a bunch of proven statements, and look at half of them. Generate a bunch of possible heuristics, and score them based on how well they predict the other half of the proven statements given the first half as proven. Keep track of how long it takes to apply a heuristic. Use the weighted combination of heuristics that worked best on known data, given various run-time constraints.
With a table of heuristic combinations and their historical effectiveness and computational time, and the expected value of having accurate information, you can quickly compute the expected value of running the heuristics. Then compare it against the expected computation time to see if it’s worth running.
Finally, you can update the heuristics themselves whenever you decide to add more proofs. You can also check short run-time heuristics with longer run-time ones. Things that work better than you expected, you should expect to work better.
Oh, and the value-of-information calculation I mentioned earlier can be used to pick up some cheap computational cycles as well—if it turns out that whether or not the billionth digit of pi is “3” is worth $3.50, you can simply decide to not care about that question.
And to be rigorous, here are the hand-waving parts of this plan:
Generate heuristics. How? I mean, you could simply write every program that takes a list of proofs, starting at the simplest, and start checking them. That seems very inefficient, though. There may be machine learning techniques for this that I simply have not been exposed to.
Given a list of heuristics, how do you determine how well they work? I’m pretty sure this is a known-solved problem, but I can’t remember the exact solution. If I remember right it’s something along the lines of log-difference, where getting something wrong is worth more penalty points the more certain you are about it.
Given a list heuristics, how do you find the best weighted combinations under a run-time constraint? This is a gigantic mess of linear algebra.
And another problem with it that I just found is that there’s no room for meta-heuristics. If the proofs come in two distinguishable groups that are separately amenable to two different heuristics, then it’s a really good idea to separate out these two groups and applying the better approach for that group. My approach seems like it’d be likely to miss this sort of insight.