Predicting a string front-to-back can be easier than predicting it back-to-front. Crutchfield has a very natural measure of this asymmetry, called causal irreversibility.
In short, given a data stream, Crutchfield constructs a minimal (but maximally predictive) forward model S+, which predicts the future given the past (i.e. the next token given the context), and a minimal maximally predictive backward (retrodictive) model S−, which predicts the past given the future (i.e. the previous token given the 'future' context).
The remarkable thing is that these two models need not be the same size, as shown by a simple example (the 'random insertion process') whose forward model has 3 states and whose backward model has 4 states.
The causal irreversibility is, roughly speaking, the difference between the sizes of the forward and backward models.
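The idea of grouping contexts by what they predict can be sketched crudely in code. The function below is my own toy approximation, not Crutchfield's actual construction: it clusters length-k contexts of a sample string by their empirical next-symbol distributions (a stand-in for forward causal states), then does the same on the reversed string (a stand-in for backward states), so the two state counts can be compared. The tolerance-based merging and the period-2 example process are illustrative choices, not anything from the literature.

```python
from collections import defaultdict

def predictive_states(seq, k=1, tol=0.05):
    """Crudely cluster length-k contexts by their empirical
    next-symbol distributions -- a rough stand-in for causal states."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(seq) - k):
        ctx = seq[i:i + k]
        counts[ctx][seq[i + k]] += 1
    # Normalize counts into conditional distributions P(next | context).
    dists = {}
    for ctx, c in counts.items():
        total = sum(c.values())
        dists[ctx] = {s: n / total for s, n in c.items()}
    # Greedily merge contexts whose distributions agree within tol.
    states = []
    for ctx, dist in dists.items():
        for state in states:
            rep = dists[state[0]]
            symbols = set(dist) | set(rep)
            if all(abs(dist.get(s, 0) - rep.get(s, 0)) < tol for s in symbols):
                state.append(ctx)
                break
        else:
            states.append([ctx])
    return states

# Sanity check on a period-2 process "abab...": this process is
# reversible, so forward and backward state counts coincide (2 each).
seq = "ab" * 500
fwd = predictive_states(seq)
bwd = predictive_states(seq[::-1])
print(len(fwd), len(bwd))  # → 2 2
```

For a causally irreversible process like the random insertion process, the analogous (exact) construction yields different forward and backward state counts, which is what the measure above is getting at.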
See this paper for more details.