An anti-inductive sequence

I was thinking about what it would mean for a sequence of bits to be “anti-inductive”. The concept is probably already known (as a rule of thumb, if I can think of it, someone probably wrote a paper on it 50 years ago), but I haven’t heard of it.

*

Some sequences are predictable and can be compressed. These two concepts are deeply related, because if you can successfully predict the next part of the sequence, you don’t need to actually write it down; hence compression. A completely random sequence of bits cannot be compressed or predicted.
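As a toy illustration of that link (my own sketch, not part of the original argument): if a predictor guesses each next bit from the bits before it, we only need to store where it was wrong, i.e. the XOR of each actual bit with its predicted bit. A good predictor turns the sequence into mostly zeroes, which compresses easily; a truly random sequence gains nothing, because the residuals are just as random as the input.

```python
def residuals(bits, predict):
    """XOR each bit with the predictor's guess from the preceding bits.

    `predict` is any function mapping a prefix (list of 0/1) to a guessed
    next bit. The transform is invertible: given `predict` and the
    residuals, the original sequence can be rebuilt bit by bit.
    """
    out, prefix = [], []
    for b in bits:
        out.append(b ^ predict(prefix))
        prefix.append(b)
    return out

def repeat_last(prefix):
    """Toy predictor: guess that the next bit repeats the previous one."""
    return prefix[-1] if prefix else 0

print(residuals([0, 0, 0, 0, 1, 1, 1, 1], repeat_last))  # [0, 0, 0, 0, 1, 0, 0, 0]
```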

There is a simple mathematical proof that some sequences cannot be compressed, although it doesn’t say which ones. For any natural number N, there are more sequences of length exactly N than sequences of length smaller than N. Therefore no program can map every sequence of length N to a distinct shorter sequence; at least one length-N sequence is left without a shorter encoding.
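Concretely, the pigeonhole count behind this argument is nothing deeper than counting binary strings:

$$
\#\{\text{strings of length exactly } N\} = 2^N,
\qquad
\#\{\text{strings of length} < N\} = \sum_{k=0}^{N-1} 2^k = 2^N - 1.
$$

Since $2^N > 2^N - 1$, any compressor that only outputs shorter strings must map two different length-$N$ inputs to the same output, and then decompression cannot tell them apart.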

*

Things get more complicated if we consider the caveat that although random sequences in general cannot be compressed, true randomness means that sometimes we accidentally get a sequence that can be compressed—for example, with probability 1/2ᴺ we get a sequence of N zeroes, and it would sound silly to argue that we can’t compress that!

The solution to this paradox is that if we decide to compress only some selected sequences, then we need to add an extra bit of information specifying whether a given sequence was compressed or not. Otherwise, if we saw a sequence of bits saying (in binary) “a sequence of a thousand zeroes”, we wouldn’t know whether the intended value is this very sequence of bits taken literally, or the sequence of a thousand zeroes. One bit doesn’t seem like much, but actually most sequences cannot be compressed, so the cost of adding one extra bit to each of them outweighs the occasional space we would save by compressing the ones we can.
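A minimal sketch of this flag-bit bookkeeping (my own illustration, which only special-cases the all-zero sequence): the first bit says whether the rest should be read literally or as a compressed description.

```python
def encode(bits):
    """Flag bit '1': the rest is the length of an all-zero sequence, in binary.
    Flag bit '0': the rest is the sequence itself, written out literally."""
    if bits and all(b == 0 for b in bits):
        return "1" + format(len(bits), "b")
    return "0" + "".join(str(b) for b in bits)

def decode(code):
    flag, rest = code[0], code[1:]
    return [0] * int(rest, 2) if flag == "1" else [int(c) for c in rest]

print(len(encode([0] * 1000)))  # 11 bits instead of 1000
print(len(encode([1, 0, 1])))   # 4 bits instead of 3: the flag bit is pure overhead
```

Every sequence that is not all zeroes pays the one-bit overhead, and those vastly outnumber the all-zero ones, so the scheme loses on average, exactly as described above.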

But still, if I needed a random sequence of bits to use e.g. as a password for something important… and by a miracle I generated a sequence of N zeroes… I would probably feel quite uncomfortable using it, and would simply generate a new one. Is this a good security practice, or not? Because from a certain perspective, by removing the “insufficiently random” sequences from the pool of possible choices, I am reducing the size of the pool, which… kinda makes it easier to guess the password?

Something similar actually happened to me once. I used a mobile application that asked me to create a 4-digit PIN. So I typed 4 random digits, and the application said: “nope, you cannot use the same digit multiple times in the PIN, that is not secure”. So I typed 4 random digits again, and the application said: “nope, you cannot use an ascending sequence of digits, that is not secure”. So I typed 4 random digits yet again, and this time the application was finally satisfied. But it felt funny that the more “security checks” the application uses, the more limited the choice of possible PINs becomes. There are only 10000 possible combinations of 4 digits to start with; I wonder how far an overzealous security department could reduce that number. In a hypothetical extreme case, we would be left with only one possible choice of PIN—certainly the most secure one that no one could possibly guess! The holy grail of information security.
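Out of curiosity, here is a rough count of how much those two rules alone shrink the pool. This is my own back-of-the-envelope script, and it guesses at the app’s exact meaning: I read “the same digit multiple times” as banning any repeated digit, and “ascending sequence” as strictly consecutive digits like 1234.

```python
# Count 4-digit PINs that survive the two stated rules,
# under the guessed interpretation described above.
pins = [f"{n:04d}" for n in range(10000)]

def has_repeated_digit(pin):
    return len(set(pin)) < 4

def is_ascending_run(pin):
    # strictly consecutive digits, e.g. "1234" or "3456"
    return all(int(pin[i + 1]) == int(pin[i]) + 1 for i in range(3))

allowed = [p for p in pins if not has_repeated_digit(p) and not is_ascending_run(p)]
print(len(allowed))  # 5033 of the original 10000
```

Under that reading, the two rules already cut the keyspace roughly in half, and every further “that is not secure” rule can only shrink it more.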

*

Okay, back to the sequences of bits. Imagine that we are trying to create not just any random sequence, but the single most random, most unpredictable sequence ever! Suppose the length of the sequence is not specified in advance, so we just keep generating bits one by one, for as long as necessary.

What we could do is create a predictor—an algorithm that looks at the previously generated bits, tries to find all possible patterns in them and predict the most likely following bit—and then actually output the opposite. Keep doing this for every bit.

(To make the output fully deterministic, we add an extra rule: if the predictor assigns exactly 50% probability to each of the next bit being 0 or 1, we output 0.)
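Here is a minimal sketch of that loop, assuming we already have some computable stand-in for the predictor. The real predictor described below is not computable, so I plug in a trivial “majority bit so far” rule just to show the shape of the construction:

```python
def anti_inductive(prob_next_is_one, length):
    """Generate `length` bits, each one the opposite of whatever the
    predictor considers more likely; an exact 50% tie resolves to 0."""
    bits = []
    for _ in range(length):
        p1 = prob_next_is_one(bits)  # predictor's P(next bit = 1) given the prefix
        if p1 > 0.5:
            bits.append(0)           # predictor expects 1, so output 0
        elif p1 < 0.5:
            bits.append(1)           # predictor expects 0, so output 1
        else:
            bits.append(0)           # exact tie: output 0 by convention
    return bits

def majority_predictor(bits):
    """Stand-in predictor: the majority bit seen so far is the likeliest next bit."""
    return 0.5 if not bits else sum(bits) / len(bits)

print(anti_inductive(majority_predictor, 8))  # [0, 1, 0, 1, 0, 1, 0, 1]
```

Against such a weak predictor the output settles into the very predictable 01010101, which a slightly smarter predictor would catch immediately; that is exactly why we will want the strongest predictor available.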

What would the generated sequence look like?

  • The first bit… we have no information yet; it could be anything. Therefore, the first bit is 0.

  • The second bit… all we know so far is that there was a 0 in the sequence, so it is more likely that the next bit will be a 0, too. Therefore, the second bit is 1.

  • Now we have 01; it could be an alternating sequence, so the next bit would be 0? Therefore the third bit is 1.

  • After 011… well, I am not sure; I think I could argue it both ways?

We need a more formal description of the algorithm, because the reasoning gets more complicated at each step as we keep trying to defy expectations.

The standard solution for the predictor would be to use the Solomonoff prior: take all possible algorithms that generate sequences of bits, and assign to each of them an initial probability of 1/2^(length of the algorithm). After each observed bit, remove the algorithms that were falsified by the observed data, and then compare the total probability of the remaining algorithms that predict that 0 will be next with the total probability of the remaining algorithms that predict that 1 will be next.
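Written as a formula (my notation, not anything the construction depends on): if $x_1 \dots x_n$ are the bits generated so far and $|p|$ is the length of program $p$, the predictor’s probability for the next bit is

$$
P(x_{n+1} = b \mid x_1 \dots x_n) \;\propto\; \sum_{\substack{p \text{ consistent with } x_1 \dots x_n \\ p \text{ predicts } b \text{ next}}} 2^{-|p|},
$$

and the anti-inductive sequence outputs the opposite of the more probable bit, defaulting to 0 on an exact tie.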

This solution is (1) uncomputable, and (2) dependent on the exact details of the language used to write the algorithms. But from a mathematical perspective (once we specify the language for writing the algorithms), it is a formal definition of the most unpredictable sequence.

Yes, it sounds ironic that we have a formal definition of something “unpredictable”, but the definition is uncomputable, which means that no algorithm can actually predict the sequence; we just have its mathematical definition.

So I guess we won’t be able to find the next bits of the sequence. At least, not arbitrarily many of them. Perhaps we could still work out two or three more, but each of them would require more complicated reasoning.