Right—the delta correction has lower entropy than the raw pixel stream. Of course, you have to use a good 3-D model!
Mind you, if what you want is machine vision, I think in practice you will do better to work on machine vision directly than on video compression.
Are you familiar with the field of machine vision? If not, two quick points: 1) the evaluation metrics it employs are truly terrible. 2) Almost all computer vision tasks can be reformulated as specialized image compression tricks (e.g. the stereo correspondence problem can be reformulated as a way of compressing the second image in a stereo pair by predicting the pixels using the first image and a disparity function).
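To make the stereo example concrete, here is a minimal sketch on synthetic data, not a real stereo algorithm. The assumptions are mine: a single global disparity rather than a per-pixel disparity function, wrap-around at the image border, and empirical zeroth-order entropy as a rough stand-in for compressed size. All names (`entropy_bits`, `residuals`, `TRUE_D`) are hypothetical.

```python
import math
import random
from collections import Counter


def entropy_bits(values):
    """Empirical zeroth-order entropy, in bits per symbol."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def residuals(left, right, d):
    """Right image minus its prediction: the left image shifted by d columns."""
    width = len(left[0])
    return [
        r_row[x] - l_row[(x + d) % width]  # wrap-around is a simplification
        for l_row, r_row in zip(left, right)
        for x in range(width)
    ]


# Synthetic stereo pair: the right view is the left view shifted by TRUE_D
# columns, plus a little sensor noise.
rng = random.Random(0)
W, H, TRUE_D = 64, 64, 3
left = [[rng.randint(1, 254) for _ in range(W)] for _ in range(H)]
right = [
    [left[y][(x + TRUE_D) % W] + rng.choice((-1, 0, 1)) for x in range(W)]
    for y in range(H)
]

# Cost of coding the right image on its own.
raw_bits = entropy_bits([p for row in right for p in row])

# "Solving correspondence" here means picking the disparity whose
# residual is cheapest to code.
best_d = min(range(8), key=lambda d: entropy_bits(residuals(left, right, d)))
residual_bits = entropy_bits(residuals(left, right, best_d))

print(f"raw: {raw_bits:.2f} bits/pixel")
print(f"residual at d={best_d}: {residual_bits:.2f} bits/pixel")
```

In this toy setup the disparity that minimizes the residual's entropy is the one used to generate the pair, so the compression objective and the correspondence objective pick the same answer. Real images would need per-pixel disparities and a better entropy model.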
I’ll admit I’m not particularly familiar with machine vision. I was extrapolating from other areas such as natural language processing, where trying to formulate problems in terms of compression isn’t particularly helpful, but if an expert on machine vision says it is helpful in that field, fair enough.
"natural language processing, where trying to formulate problems in terms of compression isn’t particularly helpful"
Why do you think that? Statistical natural language processing has been very successful, both for recognition and translation of text, and for speech recognition.
Statistical natural language processing, yes, but that’s not the same thing as text compression. Again, there is a mathematical sense in which one could consider them related in principle, but doing text compression and doing natural language processing were still separate activities, last I heard.