Hutter’s contest only goes back to 2006, and is very much not a generalizable progress curve as it’s based on a static dump of Wikipedia. You’d want to also use Calgary & Canterbury Corpus results. (Ideally you’d dredge up compressors from the early history of computing, like basic Unix utilities from the ’60s or ’70s, but then you have the difficulty of actually running them—same issue that bars really good evaluation of “Proebsting’s Law”.)
Compression is closely related to forecasting—which in turn is a large part of intelligence (my estimate is 80% forecasting vs 20% pruning and evaluation). So your thesis bears on the idea of an intelligence explosion.
Well, creatures able to compress their sense data to 0.22 of its original size might drive to extinction creatures who can only manage a compression ratio of 0.23 - in an evolutionary contest. ‘Small’ differences in modeling ability—as measured by compression ratios—could thus have large effects on the world.
However compression (and AI) are hard problems and run rapidly into diminishing returns—at least if you measure them in this way.
Well, creatures able to compress their sense data to 0.22 of its original size might drive to extinction creatures
who can only manage a compression ratio of 0.23 - in an evolutionary contest.
Unless you have a model that exactly describes how a given message was generated, its Shannon entropy is not known but estimated… and typically estimated based on the current state of the art in compression algorithms. So unless I misunderstood, this seems like a circular argument.
I highly recommend Thomas and Cover’s book, a very readable intro on info theory. The point is we don’t need to know the distribution from which the bits came from to do very well in the limit. (There are gains to be had in the region before “in the limit,” but these gains will track the kinds of gains you get in statistics if you want to move beyond asymptotic theory).
I’m not as familiar with Shannon’s work as I’d like to be, but don’t Shannon’s bounds limit compression efficiency and not compression speed? This is a speed test, not an efficiency test.
Create a compressed version (self-extracting archive) of the 100MB file enwik8 of less than about 16MB. More precisely:
[etc]
There are restrictions on runspeed, but they seem to be there to limit brute forcing all possible approaches. There is also talk of “compressing intelligently.” I am not even sure what a pure speed test of compression would mean—the identity function is pretty fast!
There are linear time implementations of compression schemes that reach the Shannon bound, so any runtime improvement of an asymptotically optimal encoder will not be dramatic either...
This is my point—the algorithm challenge is to do a given compression task quickly, and Shannon has no limits I’m aware of on how fast that can be done. So there’s no problem of approaching physical limits the way there would be for, say, “Encode this file in the smallest space possible”.
Edit: The stuff below the line either wasn’t there when I replied or I didn’t see it. You’re right though, if linear-time implementations already exist, then you’re basically hard-bounded.
Figures for data compression are relevant—and should be obtainable.
Hutter’s contest only goes back to 2006, and is very much not a generalizable progress curve as it’s based on a static dump of Wikipedia. You’d want to also use Calgary & Canterbury Corpus results. (Ideally you’d dredge up compressors from the early history of computing, like basic Unix utilities from the ’60s or ’70s, but then you have the difficulty of actually running them—same issue that bars really good evaluation of “Proebsting’s Law”.)
Shannon’s bounds mean that you won’t see crazy progress in general in data compression.
Compression is closely related to forecasting—which in turn is a large part of intelligence (my estimate is 80% forecasting vs 20% pruning and evaluation). So your thesis bears on the idea of an intelligence explosion.
I guess it’s very weak negative evidence (like the fact that NP-complete problems exist, and lots of AI problems are NP-complete).
The steelman of a pro-explosion argument is probably very sophisticated and easily avoids these issues.
Well, creatures able to compress their sense data to 0.22 of its original size might drive to extinction creatures who can only manage a compression ratio of 0.23 - in an evolutionary contest. ‘Small’ differences in modeling ability—as measured by compression ratios—could thus have large effects on the world.
However compression (and AI) are hard problems and run rapidly into diminishing returns—at least if you measure them in this way.
Sounds pretty speculative.
Unless you have a model that exactly describes how a given message was generated, its Shannon entropy is not known but estimated… and typically estimated based on the current state of the art in compression algorithms. So unless I misunderstood, this seems like a circular argument.
You need to read about universal coding, e.g. start here:
http://en.wikipedia.org/wiki/Universal_code_(data_compression%29
I highly recommend Thomas and Cover’s book, a very readable intro on info theory. The point is we don’t need to know the distribution from which the bits came from to do very well in the limit. (There are gains to be had in the region before “in the limit,” but these gains will track the kinds of gains you get in statistics if you want to move beyond asymptotic theory).
I’m not as familiar with Shannon’s work as I’d like to be, but don’t Shannon’s bounds limit compression efficiency and not compression speed? This is a speed test, not an efficiency test.
???
From the url:
Create a compressed version (self-extracting archive) of the 100MB file enwik8 of less than about 16MB. More precisely:
[etc]
There are restrictions on runspeed, but they seem to be there to limit brute forcing all possible approaches. There is also talk of “compressing intelligently.” I am not even sure what a pure speed test of compression would mean—the identity function is pretty fast!
There are linear time implementations of compression schemes that reach the Shannon bound, so any runtime improvement of an asymptotically optimal encoder will not be dramatic either...
This is my point—the algorithm challenge is to do a given compression task quickly, and Shannon has no limits I’m aware of on how fast that can be done. So there’s no problem of approaching physical limits the way there would be for, say, “Encode this file in the smallest space possible”.
Edit: The stuff below the line either wasn’t there when I replied or I didn’t see it. You’re right though, if linear-time implementations already exist, then you’re basically hard-bounded.
To be sure, constant factors matter a lot! But I think in this case the curve would be pretty boring.