I’d guess that cross-entropy/alt-complexity differs from K-complexity by at most a constant? I don’t have a proof here or even a full argument, but the rough intuition is that, if the same function can be represented k different ways using a program of length n, then we should be able to specify the function using n - log(k) bits rather than n bits (modulo a constant to handle the extra overhead of the scheme). Similarly, if we need to specify one of k possible incompressible n-bit strings and we don’t care which one, that should require n - log(k) bits on average.
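To make that counting argument concrete (my own sketch, not from the post; I’m writing U for a prefix-free universal machine, |p| for the length of a program p, and treating alt-complexity as the cross-entropy of the total program weight): if f has k distinct programs of length n, then

```latex
-\log_2 \sum_{p \,:\, U(p)=f} 2^{-|p|}
\;\le\; -\log_2\!\left(k \cdot 2^{-n}\right)
\;=\; n - \log_2 k ,
```

so weighting all programs for f, instead of only the shortest one, saves at most about log(k) bits relative to K-complexity; the matching bound in the other direction, up to an additive constant, is what the coding theorem mentioned downthread provides.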
That said, good post; in hindsight it totally makes sense that this alternative complexity measure is the more natural thing to think about.
I also have the cached belief/intuition that the gap is at most an additive constant, and I think that’s probably why people in the field don’t sweat the difference, given that everything is only defined up to additive constants anyway.
Agree that alt-complexity is more natural from a wide variety of perspectives. But if the additive constant gap is true then I don’t think it’s a big deal technically (though it may still be significant pedagogically, and it’s weird that it’s not easy to find a reference for this claim if it’s needed for K-complexity to be a reasonable definition).
There is an easy-to-find reference for this claim: the coding theorem, Theorem 4.3.3 in Li and Vitányi’s “An Introduction to Kolmogorov Complexity and Its Applications”, which was introduced to me as the most well-known book on algorithmic information theory.
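For anyone without the book at hand, the statement is roughly as follows (my paraphrase; see Li and Vitányi for the exact conditions on the universal prefix machine U and the universal semimeasure m):

```latex
K(x) \;=\; -\log_2 \mathbf{m}(x) + O(1),
\qquad \text{where} \quad
\mathbf{m}(x) \;=\; \sum_{p \,:\, U(p)=x} 2^{-|p|}
\text{ (up to a multiplicative constant),}
```

i.e. prefix K-complexity and the cross-entropy of the universal distribution agree up to an additive constant.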
The coding theorem is a different claim once you get into the nuances. I may go into this point in a future post.
The post now exists.