The concept of Kolmogorov Sufficient Statistic might be the missing piece. (cf Elements of information theory section 14.12)
We want the shortest program that describes a sequence of bits. A particularly interpretable type of such programs is “the sequence is in the set X generated by program p, and among those it is the n’th element” Example “the sequence is in the set of sequences of length 1000 with 104 ones, generated by (insert program here), of which it is the n~10^144′th element”.
We therefore define f(String, n) to be the size of the smallest set containing String which is generated by a program of length n. (Or alternatively where a program of length n can test for membership of the set)
If you plot the logarithm of f(String,n) you will often see bands where the line has slope −1, corresponding to using the extra bit to hardcode one more bit of the index. In this case the longer programs aren’t describing any more structure than the program where the slope started being −1. We call such a program a Kolmogorov minimal statistic.
The relevance is that for a biased coin with each flip independent the Kolmogorov minimal statistic is the bias. And it is often more natural to think about the Kolmogorov minimal statistics.
The concept of Kolmogorov Sufficient Statistic might be the missing piece. (cf Elements of information theory section 14.12)
We want the shortest program that describes a sequence of bits. A particularly interpretable type of such programs is “the sequence is in the set X generated by program p, and among those it is the n’th element”
Example “the sequence is in the set of sequences of length 1000 with 104 ones, generated by (insert program here), of which it is the n~10^144′th element”.
We therefore define f(String, n) to be the size of the smallest set containing String which is generated by a program of length n. (Or alternatively where a program of length n can test for membership of the set)
If you plot the logarithm of f(String,n) you will often see bands where the line has slope −1, corresponding to using the extra bit to hardcode one more bit of the index. In this case the longer programs aren’t describing any more structure than the program where the slope started being −1. We call such a program a Kolmogorov minimal statistic.
The relevance is that for a biased coin with each flip independent the Kolmogorov minimal statistic is the bias. And it is often more natural to think about the Kolmogorov minimal statistics.