The way I interpret it is that it is possible to find an algorithm to compress a set of data points in a way that is also good at predicting other data points, not yet observed. In yet other words, a good approximation is, for some reason, sometimes also a good extrapolation.
Well, yes, and the reason isn’t mysterious.
In order to compress a stream of data you need to discover some structure in it. If there is no structure (e.g. if the stream is truly random), then no compression is possible. And if the structure you found is “really there” and not an artifact of your structure-searching techniques, then it is just as useful for extrapolation and prediction.
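To make that concrete, here is a rough sketch using Python's standard `zlib` as a stand-in for "a compressor". The helper name `compression_ratio` and the period-7 test stream are my own illustrative choices, not anything from the discussion above. It compares a stream with obvious structure against bytes from the OS random source.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more structure found)."""
    return len(zlib.compress(data, 9)) / len(data)


# Illustrative assumption: a stream that repeats with period 7.
structured = bytes(i % 7 for i in range(100_000))

# Bytes from the OS random source: no structure for the compressor to find.
random_bytes = os.urandom(100_000)

print(f"structured: {compression_ratio(structured):.3f}")
print(f"random:     {compression_ratio(random_bytes):.3f}")
```

The structured stream should shrink to a small fraction of its size, while the random one should not shrink at all. The same regularity that lets the compressor shrink the first stream (it repeats every 7 bytes) is exactly what you would use to guess byte 100,001.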