Bounded Solomonoff Induction: Three Difficulty Settings
I am starting another (probably) long sequence of posts on asymptotic logical uncertainty. Here, we will move past the Benford test to a more general goal. The purpose of this series will be to investigate the following question:

What is the correct resource-bounded analogue of Solomonoff induction?
This series is a continuation of the Asymptotic Logical Uncertainty series, but I will start fresh with the notation and try to keep it self-contained.
Consider an environment, which is a Turing machine E. E takes as input a single natural number n, and outputs a bit E(n).
Our goal is to construct a Turing machine M, which takes as input an environment E and a natural number n, and outputs a probability M(n). This probability is meant to represent the probability that E(n) is 1.
Let A(n) be a fast-growing function. We restrict M to run in time O(A(n)^2) on input n. In all of our proposals, M will sample and simulate Turing machines which run in time O(A(n)) on input n. We will judge M by comparing it to other Turing machines which run in O(A(n)) time on input n.
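To make the setup concrete, here is a minimal Python sketch of one way the sampling-and-simulating setup could look. Everything here (the generator convention standing in for stepping a Turing machine, the particular A, the voting rule) is an illustrative assumption, not the construction the series will actually use.

```python
def run_with_budget(machine, n, budget):
    # Step `machine` (a generator function standing in for a Turing
    # machine) on input n for at most `budget` steps; return its output
    # bit, or None if it does not halt in time.
    gen = machine(n)
    try:
        for _ in range(budget):
            next(gen)
    except StopIteration as stop:
        return stop.value
    return None


def A(n):
    return 2 ** (n + 1)  # stand-in for a fast-growing time bound


def M(candidates, n, history):
    # Toy predictor: among candidate machines that halt within A(n)
    # steps and agree with the bits of E observed so far, average their
    # guesses for bit n.  A real M would also sample the candidates
    # rather than enumerate them.
    votes = []
    for c in candidates:
        if all(run_with_budget(c, i, A(i)) == b for i, b in enumerate(history)):
            out = run_with_budget(c, n, A(n))
            if out is not None:
                votes.append(out)
    return sum(votes) / len(votes) if votes else 0.5
```

With candidates that always output 1 or always output 0, M assigns probability 1 once the observed history rules the zero machine out, and falls back to averaging when the history is empty.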
We will score M based on how close its probabilities are to the true environment, E. I will talk later about different ways we can score M, but first, I want to describe three different difficulty settings for this problem:
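As one concrete example of such a score (a hypothetical choice here, not necessarily the one adopted later in the series), we could charge M its cumulative log loss against the true bits:

```python
import math

def log_loss(predictions, bits):
    # Cumulative log loss of M's probabilities against E's true bits:
    # one natural way to measure "how close" M's probabilities are.
    # Lower is better; a perfectly confident correct M scores 0.
    total = 0.0
    for p, b in zip(predictions, bits):
        total -= math.log(p if b == 1 else 1.0 - p)
    return total
```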
Level 1: E(n) is guaranteed to output in time at most A(n+1).
Note that we assume that A(n+1) >> A(n)^2, so this does not make the question completely trivial. However, this assumption does allow M to compute E(i) for all i < n when asked to assign a probability to bit n. As we will soon see, this assumption makes the problem much easier. Unfortunately, it also makes the problem much less important, as it is not a very realistic assumption to make.
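Under the Level 1 assumption, this recovery step can be written out directly. The sketch below is illustrative (an A chosen so that A(n+1) >> A(n)^2, generator functions standing in for machines, and a deliberately naive predictor): M first reconstructs the entire history E(0), ..., E(n-1) and only then predicts bit n.

```python
def A(n):
    # Illustrative bound satisfying A(n+1) >> A(n)^2: here A(n) = 2^(3^n).
    return 2 ** (3 ** n)


def level1_history(E, n):
    # Level 1 guarantees E(i) halts within A(i+1) steps, so M can recover
    # every earlier bit of the environment before predicting bit n.
    history = []
    for i in range(n):
        gen = E(i)  # E is a generator function standing in for a machine
        steps = 0
        while True:
            try:
                next(gen)
            except StopIteration as stop:
                history.append(stop.value)
                break
            steps += 1
            if steps > A(i + 1):
                raise RuntimeError("E violated the Level 1 guarantee")
    return history


def frequency_predictor(E, n):
    # A deliberately crude M: predict the empirical frequency of 1s so far.
    history = level1_history(E, n)
    return sum(history) / n if n else 0.5
```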
Bounded Solomonoff induction is almost completely solved at this difficulty level by the “Subsequence Induction” algorithm, which I will present in detail later in the series. There may be some further work in extending the existing algorithm so that M not only does better than anything that runs in time A(n), but is also well calibrated and does better than any A(n)-time transformation that can be applied to its output.
Level 2: E(n) is guaranteed to halt on every input, but in no pre-specified time bound.
It is perhaps surprising that this difficulty level is so much harder than level 1. We still eventually get to learn every bit of the true environment; there is just a much larger gap between the bits we can compute and the bit we are expected to predict at any given time. As we will see, one of the most valuable desired properties is attainable at level 1, but provably impossible at level 2. Most of the work here will be in closing the gap between impossibility results and powerful learning algorithms, to pin down the boundary of possibility. There is some hope of being able to extend algorithms from level 1 to level 2, but so far these extensions have not been able to preserve all the desirable properties.
I plan on explaining more about exactly what makes level 2 so much more difficult than level 1, but in the meantime, let me say that the primary difference is that level 2 is vulnerable to things like the problem described here.
Level 3: E is not guaranteed to halt on all inputs, but we only judge M on its predictions in cases where E halts.
As of right now, it is very unclear how much more difficult level 3 is than level 2. This is mostly because some suggested algorithms for level 2 transfer very nicely to level 3, while others do not transfer at all. Until we solve level 2, we will not know which case we are in. For example, when we first passed the Benford test, we were only trying to pass it at level 2, but the algorithm we came up with turned out to generalize to level 3 very easily.
At level 3, we can take E to be the environment which outputs 1 on those n that encode provable sentences, outputs 0 on those n that encode disprovable sentences, and does not halt on n that encode independent sentences. This is the environment we care most about. However, I would be very happy just to solve the problem at level 2, as that gets us most of the benefit. Some have argued that the distinction between level 2 and level 3 does not matter at all.
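The provability environment can be modeled with a small sketch. Here the predicate `has_proof(n, k)` is a hypothetical stand-in for "sentence n has a proof of length k" (a real environment would run an actual proof checker); the machine dovetails proof and disproof search by length, and so never halts on independent sentences, which is exactly the Level 3 behavior.

```python
def halted_output(machine, n, budget):
    # Step the machine at most `budget` times; None means "still running",
    # which is exactly the case Level 3 never scores M on.
    gen = machine(n)
    try:
        for _ in range(budget):
            next(gen)
    except StopIteration as stop:
        return stop.value
    return None


def make_logic_env(has_proof, has_disproof):
    # Toy model of the Level 3 environment: search proof lengths
    # k = 0, 1, 2, ... and halt with 1 on a proof, 0 on a disproof.
    def E(n):
        k = 0
        while True:
            yield  # one unit of dovetailed search
            if has_proof(n, k):
                return 1
            if has_disproof(n, k):
                return 0
            k += 1
    return E
```

With toy predicates making sentences n ≡ 0 (mod 3) "provable", n ≡ 1 (mod 3) "disprovable", and the rest "independent", the environment halts with the right bit in the first two cases and runs forever in the third.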