If the idea that time stems from the second law is true, and we apply the principle of eliminating variables that are redundant because they make no difference, we can collapse the notions of time and entropy into one thing. Under these assumptions, in a universe where entropy is decreasing (relative to our external notion of ‘time’), the internal ‘time’ is in fact running backward.
As some other commenters have also noted, it seems to me that the conditional dependence between different points in a universe is in some way equivalent to increasing entropy.
Let’s assume that the laws of the universe described by the LMR picture are in fact time-symmetric, and that the number of states each point can be in is too large to describe exactly (just as in our actual universe, as far as we know). In that case, we can only describe our conditional knowledge of M2 given the states of M1 and R1,2 using very rough descriptions, not the fully detailed descriptions of the exact states. It seems to me that this can only be done usefully if there is some kind of structure in the states of M1 and M2 (a.k.a. low entropy) that matches our coarse description. Saying that the L or M part of the universe is in a low-entropy state is equivalent to saying that some of the possible states are much more common for the nodes in that part than other states.

Our coarse predictor will necessarily make wrong predictions on some input states. Since the actual laws are time-symmetric, if the input states to our predictor were distributed uniformly over all possible states, our predictions would fail equally often predicting from left to right as from right to left. Only if the states we can predict correctly occur more often on the left than on the right will there be an inequality in the number of correct predictions.
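This point can be made concrete with a toy model (my own construction, purely illustrative): take a handful of states, make the dynamics an involution so the laws are exactly time-symmetric (the same map carries a state one step forward or one step backward), and use a coarse predictor that only sees a two-cell coarse-graining of the states.

```python
# Toy sketch: 8 microstates, dynamics F chosen to be an involution
# (F[F[x]] == x for all x), so the "laws" are exactly time-symmetric.
STATES = list(range(8))
F = {0: 1, 1: 0, 2: 7, 7: 2, 3: 4, 4: 3, 5: 6, 6: 5}

# Coarse description: we only track whether a state is "low" (<4) or
# "high" (>=4); the rough predictor guesses the cell stays the same.
def cell(x):
    return "low" if x < 4 else "high"

def accuracy(inputs):
    """Fraction of inputs whose coarse prediction of F[x] comes out right.
    Because F is its own inverse, this is also the backward accuracy."""
    hits = sum(cell(x) == cell(F[x]) for x in inputs)
    return hits / len(inputs)

# Inputs spread uniformly over all states: prediction quality is the
# same in both temporal directions.
print(accuracy(STATES))                      # 0.5

# Low-entropy "left" inputs, concentrated on the predictable states
# {0, 1, 5, 6}: now left-to-right predictions come out better, purely
# because of the input distribution, not the laws.
left_inputs = [0, 0, 1, 1, 5, 5, 6, 6, 2, 3]
print(accuracy(left_inputs))                 # 0.8
```

The asymmetry in correct predictions here comes entirely from where the inputs are concentrated; the dynamics themselves treat the two directions identically.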
...except that I now seem to have concluded that time always flows in the opposite direction of what Eliezer’s conditional dependence indicates, so I’m not sure how to interpret that. Maybe it is because I am assuming time-symmetric laws while Eliezer is using time-asymmetric probabilistic laws. However, it still seems correct to me that, in the case of time-symmetric underlying laws and a coarse (incomplete) predictor, predictions can only be better in one direction than the other if there is a difference in how often we see correctly predicted input relative to incorrectly predicted input, and therefore a difference in entropy.
Extrapolating from Eliezer’s line of reasoning, you would probably find that although you remember ss0 + ss0 = ssss0, when you try to derive ss0 + ss0 from the Peano axioms you discover it ends up as sss0, and starting from ss0 + ss0 = ssss0 quickly leads you to a contradiction.
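For reference, the uncorrupted derivation really does give ssss0; in the thought experiment it is the reasoner, not the axioms, that is corrupted. A minimal sketch (my own illustration) of Peano addition on successor strings, using the two defining equations a + 0 = a and a + s(b) = s(a + b):

```python
# Numerals are strings of 's' ending in '0', e.g. "ss0" is two.
def add(a, b):
    if b == "0":
        return a          # a + 0 = a
    # b has the form "s" + rest, so a + s(rest) = s(a + rest)
    return "s" + add(a, b[1:])

print(add("ss0", "ss0"))  # ssss0
```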