1. Set up a camera next to a highway, and record the stream of passing cars. To compress the resulting data, you will need to develop a computational understanding of the visual appearance of automobiles. You will need theories of hubcaps, windshields, license plates, car categories, and so on.
How does that work? The data would be ridiculously complex. If you want to sacrifice quality, you could just convert everything to 3D models and compress that, but that would have a lot of error compared to the original video. If you were to compress just the video, knowledge about cars wouldn’t do any good, because there are far too many outliers: the pixel values. There might be a complex way to do it, assume the general shapes of objects and limit the possible variation of pixels in those areas, but even then there are too many exceptions for it to be useful. And how do you compress faces? You might be able to create a method that can distinguish a face, or a specific face, from something else, but then it’s not compression, it’s object recognition. The same goes for your bird example, which might be even more complex than the car example. And if you decide not to overfit, this sentence could be compressed as “it’s all made of letters and the most common letter is e, so therefore it is all E’s.”
As the OP says, “the key difference is the emphasis on large-scale compression”.
Over a long period of time, something built from 3D models eventually compresses a lot better than recorded video, for the same reason video games are made with 3D models rather than by recording every possible game scenario as a precomputed animation.
But there is always going to be error. Games don’t look realistic precisely because of this method. In the video, you could have a spot on the window where it instantly turns blue for no reason. What I’m trying to say is that there’s too much static, too many outliers. If you were to take a thousand pictures of someone, no two pictures would be the same, even though they appear to be.
Along with what rwallace said, it might be a good idea to point out that the total message length actually breaks down into three parts:
1) Specification of the encoding: in info-theoretical terms, this maps more common, long strings (observations) to shorter symbols. It plays the role of the theory, in that it implicitly says what observations are most likely. This part tells you how to “decompress” parts 2 and 3 into their original form.
2) The compressed string: this lists all the observed data, using the shortcuts given in 1). Common occurrences are easily described by short strings.
3) The “exceptions” to the theory—the observations that aren’t easily compressed into 2). The longer strings are reserved for these, since they occur the least.
So yes, even if you can’t reach perfection, any strong regularity found in the data will allow you to compress it, even if it’s a rule with part-3 exceptions.
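To make the three parts concrete, here is a toy sketch in Python; the dataset, the rule, and the per-item bit costs are all invented for illustration, not taken from any real coder:

```python
# Toy dataset: almost every value follows the rule y = 2*x,
# with a handful of exceptions.
data = {x: 2 * x for x in range(100)}
data[17], data[42], data[63] = 5, 99, 7   # the exceptions

# Part 1: the encoding / theory ("y = 2*x"), charged a flat,
# made-up cost of 64 bits.
theory_bits = 64

# Part 2: the compressed string. A point that obeys the rule needs
# no y-value at all, only a 1-bit "conforms to the rule" flag.
compressed_bits = len(data) * 1

# Part 3: the exceptions, stored raw: 7 bits for the index,
# 8 bits for the value.
exceptions = [x for x, y in data.items() if y != 2 * x]
exception_bits = len(exceptions) * (7 + 8)

total = theory_bits + compressed_bits + exception_bits
raw = len(data) * 8   # baseline: send every y-value raw at 8 bits each
print(f"theory={theory_bits}, data={compressed_bits}, "
      f"exceptions={exception_bits}, total={total}, raw={raw}")
```

With these made-up costs the three-part total comes to 209 bits against 800 bits for sending every value raw, so the theory pays for itself even though it has a few exceptions.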
Daniel’s Reusability hypothesis is correct: any insight you gain into a particular domain can be equivalently expressed as a way to compress descriptions of that domain.
However, let me again register my annoyance with this series, not on grounds that it’s wrong (so far), but that it’s just an exposition of the well-accepted minimum message length formalism. It may not be accepted by scientists, but it’s accepted here, and sciences do implicitly use it when they can’t do controlled experiments (which are not strictly necessary for science anyway), such as in astronomy and geology.
First of all, compression doesn’t have to be limited to just commonly occurring sets. For example, if I create a fractal, you don’t have to find commonly used colors or pixels that occur near each other; you can just write a program which generates the exact same fractal. When you do this you add another element to the system: time. How much time do you spend trying to compress something, or, conversely, decompressing it? And no compression system can guarantee that every observed instance can be compressed to less than its original size. In fact, there are always going to be cases where compressing a file actually increases its size, possibly significantly, but at the very least by a few bits.
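As a rough illustration of the fractal point (a sketch, not a real codec, and the sizes quoted are approximate), a few lines of Python can regenerate a Sierpinski-style bitmap on demand, trading decompression time for storage:

```python
# A short program standing in for a much larger bitmap: the classic
# Sierpinski pattern, where pixel (x, y) is set iff (x & y) == 0.
SIZE = 64

def render() -> str:
    rows = []
    for y in range(SIZE):
        rows.append("".join("#" if (x & y) == 0 else "." for x in range(SIZE)))
    return "\n".join(rows)

image = render()
program_size = 300           # rough size of this source snippet, in bytes
print(f"a ~{program_size}-byte program regenerates {len(image)} bytes of pixels")
```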
First of all, compression doesn’t have to be limited to just commonly occurring sets. For example, if I create a fractal, you don’t have to find commonly used colors or pixels that occur near each other; you can just write a program which generates the exact same fractal.
That kind of regularity can still be explained in the language of commonly occurring sets: for a fractal, you reserve a symbol for the overall structure, which is common (in that something like it occurs frequently), and another symbol to instantiate it at a particular position and scale. Time is handled similarly.
Compression amounts to saying: “This is the structure of the data. Now, here’s the remaining data to fill in any uncertainty not resolved by its structure.” The structure tells you which long strings you can substitute with short symbols.
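Here is a minimal, purely illustrative sketch of “substitute long strings with short symbols”, a toy dictionary coder whose single-entry table plays the role of the structure:

```python
# Toy dictionary coder: one long, frequently repeated string gets a
# short reserved symbol; everything else passes through unchanged.
text = "the quick brown fox " * 50 + "jumps over the lazy dog"

structure = {"\x01": "the quick brown fox "}   # the "theory": one common long string
encoded = text.replace(structure["\x01"], "\x01")

decoded = encoded
for symbol, long_string in structure.items():
    decoded = decoded.replace(symbol, long_string)
assert decoded == text

print(f"original: {len(text)} chars, encoded: {len(encoded)} chars "
      f"(plus a small table describing the structure)")
```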
And no compression system can guarantee that every observed instance can be compressed to less than its original size.
True, compression only attempts to guarantee that, for the function generating the data, you save, on average, by using a certain compression scheme. If the generating function has no regularity, then you won’t be able to compress, even on average.
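That is easy to check empirically; for example, a quick sketch using Python’s standard zlib module (exact byte counts will vary between runs):

```python
import os
import zlib

# A counting argument says no lossless scheme can shrink every input:
# random data typically comes out slightly larger, regular data shrinks.
random_bytes = os.urandom(10_000)    # no regularity to exploit
regular_bytes = b"abcd" * 2_500      # highly regular, same length

print("random :", len(random_bytes), "->", len(zlib.compress(random_bytes)))
print("regular:", len(regular_bytes), "->", len(zlib.compress(regular_bytes)))
```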
That kind of regularity can still be explained in the language of commonly occurring sets: for a fractal, you reserve a symbol for the overall structure, which is common (in that something like it occurs frequently), and another symbol to instantiate it at a particular position and scale. Time is handled similarly.
Yes, but if you have a symbol for every possible fractal, it defeats the purpose. The point was to use time as a memory resource. This also works in reverse: with caching and other methods, you can use memory as if it were time. Both, however, only work up to a point.
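The memory-for-time direction is just memoization; a standard (and deliberately generic) example using Python’s functools.lru_cache:

```python
from functools import lru_cache

# Trading memory for time: cache previously computed results so they never
# have to be recomputed. Without the cache this recursion is exponential
# in n; with it, each value is computed once and stored.
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))           # fast, because of the cache
print(fib.cache_info())   # shows how much memory bought that speed
```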
Yes, but if your compressed data contains both a 3-D model and a stream of outlier pixels (to be applied as delta to the result of rendering the model), this is still going to be smaller than if you didn’t have a 3-D model and just supplied the video as a pixel stream.
Mind you, if what you want is machine vision, I think in practice you will do better to work on machine vision directly than on video compression. But the compression approach is valid as a matter of mathematical principle.
Right—the delta correction has lower entropy than the raw pixel stream. Of course, you have to use a good 3-D model!
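A synthetic sketch of that claim, with both the “rendered” frame and the sensor noise invented for illustration: subtracting the model’s prediction leaves a low-entropy delta that a general-purpose compressor handles noticeably better than the raw frame.

```python
import random
import zlib

random.seed(0)
WIDTH = 4_000

# "Rendered" prediction from the 3-D model: a smooth gradient of pixel values.
rendered = [(i // 16) % 256 for i in range(WIDTH)]
# "Actual" camera frame: the prediction plus small sensor noise (the outliers).
actual = [(p + random.randint(-2, 2)) % 256 for p in rendered]
# Delta correction: what is left once the model's prediction is subtracted.
delta = [(a - r) % 256 for a, r in zip(actual, rendered)]

raw_size = len(zlib.compress(bytes(actual)))
delta_size = len(zlib.compress(bytes(delta)))
print(f"raw frame: {raw_size} B compressed, delta vs. model: {delta_size} B")
```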
Mind you, if what you want is machine vision, I think in practice you will do better to work on machine vision directly than on video compression.
Are you familiar with the field of machine vision? If not, two quick points: 1) the evaluation metrics it employs are truly terrible. 2) Almost all computer vision tasks can be reformulated as specialized image compression tricks (e.g. the stereo correspondence problem can be reformulated as a way of compressing the second image in a stereo pair by predicting the pixels using the first image and a disparity function).
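For what it’s worth, here is a toy one-dimensional version of that stereo reformulation, with a synthetic scanline pair and a made-up disparity: search for the disparity under which the left scanline best predicts the right one, then store only the disparity and the residual.

```python
import random
import zlib

random.seed(1)
N = 4_000
TRUE_DISPARITY = 7   # made up for the synthetic pair

# Synthetic stereo pair: the right scanline is the left one shifted by a
# fixed disparity, plus a little noise.
left = [random.randrange(256) for _ in range(N + 32)]
right = [(left[i + TRUE_DISPARITY] + random.randint(-1, 1)) % 256 for i in range(N)]

def residual(d):
    """What remains of the right scanline after predicting it from the left."""
    return [(right[i] - left[i + d]) % 256 for i in range(N)]

# Correspondence as compression: pick the disparity whose residual codes shortest...
best = min(range(32), key=lambda d: len(zlib.compress(bytes(residual(d)))))

# ...then transmit only that disparity plus the (low-entropy) residual.
raw = len(zlib.compress(bytes(right)))
coded = len(zlib.compress(bytes(residual(best)))) + 1   # +1 byte for the disparity
print(f"recovered disparity {best}; right scanline raw {raw} B vs predicted {coded} B")
```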
I’ll admit I’m not particularly familiar with machine vision. I was extrapolating from other areas such as natural language processing, where trying to formulate problems in terms of compression isn’t particularly helpful, but if an expert on machine vision says it is helpful in that field, fair enough.
natural language processing, where trying to formulate problems in terms of compression isn’t particularly helpful
Why do you think that? Statistical natural language processing has been very successful, both for recognition and translation of text, and for speech recognition.
Statistical natural language processing, yes, but that’s not the same thing as text compression (again, there is a mathematical sense in which one could consider them related in principle, but doing text compression and doing natural language processing were still separate activities, last I heard).
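The mathematical sense in question is Shannon’s: a statistical model assigns probabilities, and -log2 p is a code length, so in principle a better language model is a better text compressor. A small sketch of that identity, using a toy text and a unigram character model (the 27-symbol alphabet is an arbitrary choice):

```python
import math
from collections import Counter

text = (
    "statistical language models assign probabilities to text and any "
    "probability model yields a code length of minus log two p per symbol"
)

# Uniform model: treat all 27 symbols (26 letters plus space) as equally likely.
uniform_bits = len(text) * math.log2(27)

# Unigram model: use the observed character frequencies as the probabilities.
counts = Counter(text)
total = len(text)
unigram_bits = -sum(c * math.log2(c / total) for c in counts.values())

print(f"uniform model: {uniform_bits:.0f} bits")
print(f"unigram model: {unigram_bits:.0f} bits (a better model gives a shorter code)")
```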