It is also completely unique, being the shortest possible string. It is so lean that not a single redundant detail remains—otherwise, it would not be the shortest string.
I don’t think this is necessarily true. (Though I am not sure about it.) I imagine there could be two different compression strategies that both happen to produce a result of the same length, but cannot be merged.
To use your example, “AAABB” can be compressed to either “3A2B” or “3ABB”, both containing 4 characters. Knowing that “2B” and “BB” represent the same thing doesn’t allow you to exploit this “redundancy” to further reduce it to one character.
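Here is a rough sketch of that claim, assuming a toy description language in which a single digit followed by a character means "repeat that character", and anything else is a literal. The names `decode` and `minimal_encodings` are made up for illustration, and the brute-force search is only feasible for very short strings.

```python
from itertools import product
from typing import List, Optional


def decode(encoded: str) -> Optional[str]:
    """Decode a toy run-length string (assumed scheme, single-digit counts)."""
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch.isdigit():
            if i + 1 >= len(encoded):
                return None  # a count with nothing to repeat is invalid
            out.append(encoded[i + 1] * int(ch))
            i += 2
        else:
            out.append(ch)  # literal character
            i += 1
    return "".join(out)


def minimal_encodings(target: str, alphabet: str = "AB0123456789") -> List[str]:
    """Brute-force every shortest string over `alphabet` that decodes to `target`."""
    for length in range(1, len(target) + 1):
        hits = [
            "".join(chars)
            for chars in product(alphabet, repeat=length)
            if decode("".join(chars)) == target
        ]
        if hits:
            return sorted(hits)
    return []


print(minimal_encodings("AAABB"))  # ['3A2B', '3ABB']
```

In this toy language nothing of length 3 or less decodes to "AAABB", and exactly two strings of length 4 do, so the shortest description is not unique here.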
Also, some parts of my body and mind are more important than others—the exact shape of all my hairs at this moment is a lot of data (not easy to compress, because there is a lot of randomness involved), and in a second the shape will be different anyway, and even if you cut my hair short it would still be “me” (at least I do not experience existential horror whenever I get a haircut). Also not sure if gut flora should be included.
I guess my point is that even relatively useless things can require many bits of information, and you don’t actually need them. Some lossy compression would suffice, but if you overdo it, you get The Fly.
I imagine there could be two different compression strategies that both happen to produce a result of the same length, but cannot be merged.
I think this is correct, but I think of this as being similar to chirality—multiple symmetric versions of the same essential information. I think it also probably depends on the description language you use, so maybe in one language something might have multiple versions, but in another it wouldn’t?
Yes, if there is no deep underlying reason why the two minimal descriptions should be the same length, and it “just happened”, I would assume that with a slightly different description language it would not happen.
Even the “3A2B” vs “3ABB” example would stop working if encoding a number used a different number of bits than encoding a character.
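A minimal sketch of that last point, assuming a hypothetical cost model in which digits and letters may take different numbers of bits. The function name `cost_in_bits` and the specific bit widths are illustrative assumptions, not part of any real encoding.

```python
def cost_in_bits(encoded: str, digit_bits: int, letter_bits: int) -> int:
    """Total size of an encoded string under an assumed per-symbol bit cost."""
    return sum(digit_bits if ch.isdigit() else letter_bits for ch in encoded)


# Same cost per symbol: the two descriptions tie.
print(cost_in_bits("3A2B", 8, 8), cost_in_bits("3ABB", 8, 8))  # 32 32

# Digits cheaper than letters (assumed 4 vs 8 bits): the tie is broken.
print(cost_in_bits("3A2B", 4, 8), cost_in_bits("3ABB", 4, 8))  # 24 28
```

So under this assumed model the tie only exists when a count and a character cost the same; once they differ, one of the two descriptions becomes strictly shorter.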