there are bit-strings for which Solomonoff Induction performs at worse-than-chance level forever!
This reminds me… at high school I tried to figure out the most unnatural sequence of bits. Defined as a sequence of ones and zeroes such that if I show you any prefix, and you try to figure out the next digit, you will be wrong.
I figured out that the first digit would be 1, because the simplest possible sequence is “00000000...”. The next digit would be zero, because the simplest sequence starting with 1 is “11111111...”. The third digit would also be zero, because the simplest sequence starting with 10 is “10101010...”.
After that I wasn’t sure anymore, because it seemed to me that if I continue using the same kind of reasoning, at some moment “this kind of reasoning” will itself become the simplest explanation for the generated data, and therefore I should stop doing it at some moment—but when exactly? Probably at the point where “he is breaking all patterns on purpose” becomes a more likely explanation that any specific pattern.
We’re going to use Solomonoff induction, or (if you want it to be computable) an approximation like AIXI-tl, so we’ll need a prior over all Turing machines. Let’s go with the speed prior for now.
At each bit, choose the bit which has lower probability according to this predictor.
This sequence is entirely deterministic, but can’t be predicted without self-reference.
This reminds me… at high school I tried to figure out the most unnatural sequence of bits. Defined as a sequence of ones and zeroes such that if I show you any prefix, and you try to figure out the next digit, you will be wrong.
I figured out that the first digit would be 1, because the simplest possible sequence is “00000000...”. The next digit would be zero, because the simplest sequence starting with 1 is “11111111...”. The third digit would also be zero, because the simplest sequence starting with 10 is “10101010...”.
After that I wasn’t sure anymore, because it seemed to me that if I continue using the same kind of reasoning, at some moment “this kind of reasoning” will itself become the simplest explanation for the generated data, and therefore I should stop doing it at some moment—but when exactly? Probably at the point where “he is breaking all patterns on purpose” becomes a more likely explanation that any specific pattern.
...sorry if this does not make sense at all.
It does make sense, and there’s a way to do it!
We’re going to use Solomonoff induction, or (if you want it to be computable) an approximation like AIXI-tl, so we’ll need a prior over all Turing machines. Let’s go with the speed prior for now.
At each bit, choose the bit which has lower probability according to this predictor.
This sequence is entirely deterministic, but can’t be predicted without self-reference.