Unless you have a model that exactly describes how a given message was generated, its Shannon entropy is not known but only estimated… and typically estimated from what the current state-of-the-art compression algorithms achieve. So unless I've misunderstood, this seems like a circular argument.
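To make that concrete, here is roughly what "estimating entropy with a compressor" looks like in practice: compress the message with an off-the-shelf compressor and treat the compressed size as an upper bound on the entropy. A minimal Python sketch (zlib and the example inputs are arbitrary illustrative choices, not anything from the discussion):

    import os
    import zlib

    def compressed_bits_per_symbol(message: bytes) -> float:
        """Estimate bits/byte by compressing with a general-purpose compressor.

        This is only an upper bound on the entropy rate: a better compressor
        would report a smaller number, which is exactly the circularity
        being pointed out above.
        """
        compressed = zlib.compress(message, 9)
        return 8.0 * len(compressed) / len(message)

    if __name__ == "__main__":
        repetitive = b"abcabc" * 1500       # highly structured message
        random_ish = os.urandom(9000)       # incompressible bytes
        print("repetitive:", compressed_bits_per_symbol(repetitive), "bits/byte")
        print("random:    ", compressed_bits_per_symbol(random_ish), "bits/byte")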
I highly recommend Cover and Thomas's book, a very readable intro to information theory. The point is that we don't need to know the distribution the bits came from in order to do very well in the limit: universal codes approach the source's entropy rate regardless. (There are gains to be had in the regime before "in the limit," but those gains track the kinds of gains you get in statistics if you want to move beyond asymptotic theory.)
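To sketch why that works (this example is mine, not the commenter's): LZ78 incremental parsing, one of the universal schemes analyzed in Cover and Thomas, uses no model of the source at all, yet its code length of roughly c(n) * (log2 c(n) + log2 |alphabet|) bits, where c(n) is the number of parsed phrases, converges to n times the entropy rate for any stationary ergodic source. A small Python sketch with a made-up i.i.d. binary source (true entropy rate about 0.469 bits/symbol; convergence is slow, so the finite-n estimate will sit above that):

    import math
    import random

    def lz78_phrases(seq: str) -> int:
        """Number of phrases in the LZ78 incremental parse of seq."""
        seen = set()
        phrase = ""
        count = 0
        for symbol in seq:
            phrase += symbol
            if phrase not in seen:
                seen.add(phrase)
                count += 1
                phrase = ""
        if phrase:               # trailing partial phrase
            count += 1
        return count

    def lz78_bits_per_symbol(seq: str, alphabet_size: int = 2) -> float:
        """Rough LZ78 code length per symbol: each phrase costs about
        log2(#phrases) bits for the back-pointer plus log2(alphabet_size)
        bits for its final symbol. No source distribution is used."""
        c = lz78_phrases(seq)
        return c * (math.log2(max(c, 2)) + math.log2(alphabet_size)) / len(seq)

    if __name__ == "__main__":
        random.seed(0)
        # Hypothetical source: i.i.d. bits with P(1) = 0.1.
        bits = "".join("1" if random.random() < 0.1 else "0" for _ in range(200000))
        true_rate = -(0.1 * math.log2(0.1) + 0.9 * math.log2(0.9))
        print("LZ78 estimate:", lz78_bits_per_symbol(bits), "bits/symbol")
        print("entropy rate: ", true_rate, "bits/symbol")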
You need to read about universal coding, e.g. start here:
http://en.wikipedia.org/wiki/Universal_code_(data_compression)
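In the narrower sense that article uses (roughly: a prefix code for the positive integers whose expected length stays within a constant factor of the optimum for any monotonically decreasing distribution), the classic first example is Elias gamma coding. A minimal sketch, again just illustrative:

    def elias_gamma_encode(n: int) -> str:
        """Elias gamma code: (bit-length of n minus one) zeros, then n in binary."""
        if n < 1:
            raise ValueError("Elias gamma is defined for positive integers")
        binary = bin(n)[2:]
        return "0" * (len(binary) - 1) + binary

    def elias_gamma_decode(bits: str) -> int:
        """Decode a single Elias-gamma codeword from a bit string."""
        zeros = 0
        while bits[zeros] == "0":
            zeros += 1
        return int(bits[zeros:2 * zeros + 1], 2)

    if __name__ == "__main__":
        for n in (1, 2, 5, 17):
            code = elias_gamma_encode(n)
            assert elias_gamma_decode(code) == n
            print(n, "->", code)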