You lay it out very nicely. But I’d quibble that as long as your nth-order Markov chain isn’t exceptionally small and fully deterministic, there might be room for more explanation. Maybe there’s no explanation and the data is genuinely random, but what if it’s a binary-encoded Russian poem? Exhausting all the short, self-contained theories doesn’t mean the work of science is done. You also need to exhaust all analogies with everything in the world whose complexity is already “paid for”, and then look at that in turn, and so on.
The first thing I did, before even reading the article, was check that it wasn’t ASCII or UTF-8 (or at least, if it was, it wasn’t byte-aligned). That’s definitely on the short list of things technical folks will instinctively check, along with common “magic bytes” at the start of the maybe-a-file.
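For anyone who wants to repeat that kind of check, here is a rough Python sketch. It assumes the data arrives as a string of '0'/'1' characters, and the names (`MAGIC`, `sniff_magic`, `bits_to_bytes`, `looks_like_text`, `text_offsets`) are made up for illustration. The magic-byte table is a tiny sample of well-known signatures, nowhere near a complete list, and passing the text check only means "worth a closer look", not "this is definitely text".

```python
# A few well-known file signatures ("magic bytes"); deliberately incomplete.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"%PDF-": "PDF",
    b"PK\x03\x04": "ZIP",
    b"\x1f\x8b": "gzip",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def sniff_magic(data: bytes):
    """Return the name of a known format if data starts with its signature."""
    for prefix, name in MAGIC.items():
        if data.startswith(prefix):
            return name
    return None


def bits_to_bytes(bits: str, offset: int = 0) -> bytes:
    """Pack a '0'/'1' string into bytes, skipping `offset` leading bits.

    Trailing bits that do not fill a whole byte are dropped.
    """
    usable = bits[offset:]
    usable = usable[: len(usable) - len(usable) % 8]
    return bytes(int(usable[i:i + 8], 2) for i in range(0, len(usable), 8))


def looks_like_text(data: bytes) -> bool:
    """True if data is non-empty, valid UTF-8, and made of printable characters."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return bool(text) and all(ch.isprintable() or ch in "\n\r\t" for ch in text)


def text_offsets(bits: str):
    """Try all 8 bit offsets, in case the text is not byte-aligned."""
    return [off for off in range(8) if looks_like_text(bits_to_bytes(bits, off))]
```

Since ASCII is a subset of UTF-8, the one decode covers both. Running `text_offsets` over the bitstring and `sniff_magic` over `bits_to_bytes(bits)` takes a few seconds and rules out the boring explanations before reaching for anything statistical.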
Yeah, but I couldn’t get that program to run on my desktop.
I agree. I definitely would have run through common encodings before going to Markov chains.