Solutions and Open Problems
Followup To: Approaching Logical Probability
Last time, we required our robot to only assign logical probability of 0 or 1 to statements where it’s checked the proof. This flowed from our desire to have a robot that comes to conclusions in limited time. It’s also important that this abstract definition has to take into account the pool of statements that our actual robot actually checks. However, this restriction doesn’t give us a consistent way to assign numbers to unproven statements—to be consistent we have to put limits on our application of the usual rules of probability.
• Integrating this approach into a larger framework of decision-making. This builds off of making tradeoffs with your computing time and handling infinities. Basically, we want our robot to make decisions in limited time, not just output logical probabilities in limited time, and making decisions requires considering your possible actions and the utility of outcomes, which are allowed to be really complicated and require approximation. And then, we need to somehow direct computational resources into different tasks to make the best decision.
• Integrating this approach with second-order arithmetic. If you look at MIRI’s paper that uses a probability distribution over logical statements, their approach is quite different—for one, they don’t allow for any limits on the robot’s resources. And for another, there are all sorts of properties that are important when considering second-order arithmetic that we haven’t needed yet. For example, what happens when we ask for P(P(this statement)<0.3)?
Thank you for reading the Logical Uncertainty sequence. I hope that things which were not obvious now seem obvious. If you want your logical probability distribution to have certain nice properties, it is a good idea to only slightly depart from the original desiderata of probability, and build up from there. Jumping straight to an answer is not necessary, and is probably a bad idea anyhow.
End of the sequence Logical Uncertainty
Previous Post: Approaching Logical Probability
- Approaching Logical Probability by 27 Feb 2014 7:44 UTC; 13 points) (
- 15 Mar 2014 7:00 UTC; 10 points) 's comment on What are you working on? March / April 2014 by (
I read a comment in this thread by Armok_GoB, and it reminded me of some machine-learning angles you could take on this problem. Forgive me if I make a fool of myself on this, I’m fairly rusty. Here’s my first guess as to how I’d solve the following:
Take a bunch of proven statements, and look at half of them. Generate a bunch of possible heuristics, and score them based on how well they predict the other half of the proven statements given the first half as proven. Keep track of how long it takes to apply a heuristic. Use the weighted combination of heuristics that worked best on known data, given various run-time constraints.
With a table of heuristic combinations and their historical effectiveness and computational time, and the expected value of having accurate information, you can quickly compute the expected value of running the heuristics. Then compare it against the expected computation time to see if it’s worth running.
Finally, you can update the heuristics themselves whenever you decide to add more proofs. You can also check short run-time heuristics with longer run-time ones. Things that work better than you expected, you should expect to work better.
Oh, and the value-of-information calculation I mentioned earlier can be used to pick up some cheap computational cycles as well—if it turns out that whether or not the billionth digit of pi is “3” is worth $3.50, you can simply decide to not care about that question.
And to be rigorous, here are the hand-waving parts of this plan:
Generate heuristics. How? I mean, you could simply write every program that takes a list of proofs, starting at the simplest, and start checking them. That seems very inefficient, though. There may be machine learning techniques for this that I simply have not been exposed to.
Given a list of heuristics, how do you determine how well they work? I’m pretty sure this is a known-solved problem, but I can’t remember the exact solution. If I remember right it’s something along the lines of log-difference, where getting something wrong is worth more penalty points the more certain you are about it.
Given a list heuristics, how do you find the best weighted combinations under a run-time constraint? This is a gigantic mess of linear algebra.
And another problem with it that I just found is that there’s no room for meta-heuristics. If the proofs come in two distinguishable groups that are separately amenable to two different heuristics, then it’s a really good idea to separate out these two groups and applying the better approach for that group. My approach seems like it’d be likely to miss this sort of insight.
The “unfortunate accident” may be that you simply didn’t have enough time to apply the rule. If you have limited time, you have to stop somewhere. There doesn’t have to be anything magical between steps N and N+1, it could merely mean that the timer went off at step N.
Just to expand, this example looks deceptively suspicious to humans because the proof is so obvious, even to us, and can be done in a single line using that one rule. For simple arithmetic like this, a robot would be able to prove it immediately after being asked whether 99+99=198 with hardly any time or effort. The procedure is so simple that, in practice, it would be efficient to delay generating the proof until it’s asked for. Many human readers will mentally prove it while reading the sentence without even noticing that it took any work to do so. It’s so fast that there’s no need to guess the answer probabilistically beforehand.
So I’m not sure this example is good for an intuitive illustration that “some unfortunate accident” is going on. It’s very strange to run out of time in a procedure that takes so little time to complete, so it’s hard to reason about the appropriateness of guessing at circumstance.
Maybe a better example would be, having already proved Pythagoras’ theorem, to prove that the body diagonal of a unit cube has length sqrt(3). This might take most humans a little while to think about, and if someone ran a stopwatch and said you have five seconds to decide whether it’s true or not, the idea of guessing P=0.5 for something you don’t have time to prove might be easier for people to understand. =)
Yup, that’s the idea :)
While obviously not rigorous enough for something serious, one obvious hack is to do the “0.5 unless proven” thing, and then have a long list of special case dumb heuristics with different weights that update that without any proofs involved at all. The list of heuristics could be gotten from some unsafe source like the programer or another AI or mechanical turk, and then the weights learned by first guessing and then proving to see if it was right, with heuristics that are to bad kicked out entirely.
At this stage, if I had to guess, I’d guess that the next output would be ‘4’, doubling every time.
That’s a good one too. Or ” 121212″ or “12222222”. But not so much “1,2,sofa”.
I have some rambling thoughts on the subject. I just hope they aren’t too stupid or obvious ;-)
Let’s take as a framework the aforementioned example of the last digit of the zillionth prime. We’ll say that the agent will be rewarded if it gets it right, on, shall we say, a log scoring rule. This means that the agent is incentivised to give the best (most accurate) probabilities it can, given the information it has. The more unreasonably confident it is, the more it loses, and the same with underconfidence.
By the way, for now I will assume the agent fully knows the scoring rule it will be judges by. It is quite possible that this assumption raises problems of its own, but I will ignore them for now.
So, the agent starts with a prior over the possible answers (a uniform prior?), and starts updating itself. But it wants to figure out how long it will spend doing so, before it should give up and hand in for grading its “good enough” answer. This is the main problem we are trying to solve here.
In the degenerate case in which it has nothing else in the universe other than this to give it utility, I actually think it is the correct answer to work forever (or as long as it can before physically falling apart) on the answer. But we shall make the opposite assumption. Let’s call the amount of utility lost to the agent as an opportunity cost in a given unit of time by the name C. (We shall also make the assumption that the agent knows what C is, at least approximately. This is perhaps a slightly more dangerous assumption, but we shall accept it for now.)
So, the agent want to work for as many units of time as it can before the marginal amount of extra utility it would earn from the scoring rule from the work of a unit time is less than C.
The only problem left is figuring out that margin. But, by the assumption that the agent knows the scoring rule, it knows the derivative of the scoring function as well. At any given point in time, it can figure out the amount of change to the potential utility it would get from the change to the probabilities it assigns. Thus, if the agent knows approximately the range in which it may update in the next step, it can figure out whether or not the next stage is worthwhile.
In other words, once it is close enough to the answer that it predicts that a marginal update would move it closer to the answer by an amount that gives less than C utility, it can quit, and not perform the next step.
This makes sense, right? I do suspect that this is the direction to drive at in the solution to this problem.