A Cartesian agent’s hypotheses are all about the agent’s internal states, and different possible causes for those states, so the idea of ‘world-states that don’t include the agent’ can’t be directly represented.
A sequence predictor’s predictions are all about the agent’s input tape states*, and different possible causes for those states. The hypotheses are programs that implement entire models of the Universe, and these can definitely directly represent world-states which don’t include the agent.
* More realistically, the states of the registers where the sensor data is placed.
ETA: I wonder if this intuition is caused by the fact that I am a practicing Bayesian statistician, so the distinction between posterior distributions and posterior predictive distributions is more salient to me.
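To make the posterior vs. posterior predictive distinction concrete, here is a minimal toy sketch. It is not Solomonoff induction: the four hypotheses, their names, and the uniform prior are all made up for illustration, and each "world model" is reduced to a function giving the probability that the input bit at step `t` is 1. The point is only that the posterior is a distribution over world models, while the posterior predictive is a distribution over the next input-tape symbol.

```python
from fractions import Fraction

# Hypothetical toy "world models" (assumed for illustration only).
# Each maps a time step t to the probability that the input bit at t is 1.
HYPOTHESES = {
    "all_zeros": lambda t: Fraction(0),
    "all_ones": lambda t: Fraction(1),
    "alternating": lambda t: Fraction(t % 2),
    "fair_coin": lambda t: Fraction(1, 2),
}


def posterior(observed, prior=None):
    """Distribution over hypotheses (world models) given the observed bits."""
    if prior is None:
        # Assumed uniform prior; a real sequence predictor would weight by program length.
        prior = {name: Fraction(1, len(HYPOTHESES)) for name in HYPOTHESES}
    weights = {}
    for name, model in HYPOTHESES.items():
        likelihood = Fraction(1)
        for t, bit in enumerate(observed):
            p_one = model(t)
            likelihood *= p_one if bit == 1 else 1 - p_one
        weights[name] = prior[name] * likelihood
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observations have zero probability under every hypothesis")
    return {name: w / total for name, w in weights.items()}


def posterior_predictive(observed):
    """Probability that the NEXT input bit is 1, averaging over the posterior."""
    post = posterior(observed)
    t = len(observed)
    return sum(post[name] * HYPOTHESES[name](t) for name in HYPOTHESES)
```

With observations `[0, 1, 0]`, `posterior` puts 8/9 on `"alternating"` and 1/9 on `"fair_coin"`, which is a statement about which world model is running. `posterior_predictive` returns 17/18, which is a statement only about the next symbol on the input tape.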