When the new memory state M_t is generated by a Bayesian update from the previous one M_{t-1} and the new observation O_t, it’s a sufficient statistic of these information sources for the world state W_t, so that M_t keeps all the information about the world that was remembered or observed:
I(W_t; M_t) = I(W_t; (M_{t-1}, O_t))
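As this is all the information available, no other way of updating can retain more: by the data processing inequality, a memory computed only from M_{t-1} and O_t cannot know more about W_t than they do.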
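The equality above can be checked numerically on a small discrete model. The sketch below is only an illustration under assumed numbers: the binary world, the likelihood tables `p_mem` and `p_obs`, and the helper names are all hypothetical and not taken from the text. It compares three update rules: keeping the full pair (M_{t-1}, O_t), storing only the Bayesian posterior, and a forgetful rule that keeps only O_t.

```python
from collections import defaultdict
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits, from a dict {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical toy model: a binary world W, with the old memory and the new
# observation conditionally independent given W (assumed numbers).
prior = {0: 0.5, 1: 0.5}
p_mem = {0: {"a": 0.8, "b": 0.2}, 1: {"a": 0.2, "b": 0.8}}  # P(M_{t-1} | W)
p_obs = {0: {"x": 0.8, "y": 0.2}, 1: {"x": 0.2, "y": 0.8}}  # P(O_t | W)

# Joint distribution over (w, m, o).
joint = {(w, m, o): prior[w] * p_mem[w][m] * p_obs[w][o]
         for w in prior for m in ("a", "b") for o in ("x", "y")}

def posterior(m, o):
    """Bayesian update: the new memory is the posterior P(W | m, o)."""
    weights = {w: joint[(w, m, o)] for w in prior}
    total = sum(weights.values())
    # Round so that equal posteriors map to the same memory state.
    return tuple(round(weights[w] / total, 12) for w in sorted(prior))

def info_about_world(update):
    """I(W; M_t) in bits when M_t = update(M_{t-1}, O_t)."""
    pairs = defaultdict(float)
    for (w, m, o), p in joint.items():
        pairs[(w, update(m, o))] += p
    return mutual_information(pairs)

full = info_about_world(lambda m, o: (m, o))   # keep everything
bayes = info_about_world(posterior)            # keep only the posterior
forgetful = info_about_world(lambda m, o: o)   # discard the old memory

print(f"I(W; (M_prev, O)) = {full:.4f} bits")
print(f"I(W; posterior)   = {bayes:.4f} bits")
print(f"I(W; O only)      = {forgetful:.4f} bits")
```

In this model the pairs ("a", "y") and ("b", "x") yield the same posterior, so the Bayesian memory has three states rather than four, yet it carries the same information about W as the full pair. The forgetful rule carries strictly less.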
The amount of information gained by a Bayesian update is
I(W_t; M_t) - I(W_t; M_{t-1}) = I(W_t; O_t | M_{t-1})
and because the observation only depends on the world