For sufficiently rich Z, that means that the summary must include a full model of the environment.
Is this a theorem you’ve proven somewhere?
I have it in a notebook, might make a post soonish.
I ask because I already have a result that says this in MDPs: you can compute all optimal value functions iff you know the environment dynamics up to isomorphism.
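To illustrate only the easy direction of that claim (known dynamics give you the optimal value function for any reward), here is a minimal value-iteration sketch. It assumes a finite MDP, state-based rewards, and a discount factor below 1. The function name `optimal_values` and the toy two-state environment are hypothetical, made up for illustration. It does not show the converse direction (recovering the dynamics from the optimal value functions).

```python
def optimal_values(P, R, gamma=0.5, tol=1e-10):
    """Value iteration on a finite MDP (illustrative sketch).

    P[s][a] is a list of (next_state, probability) pairs (the dynamics).
    R[s] is the reward received in state s (assumed state-based).
    """
    n = len(P)
    V = [0.0] * n
    while True:
        # Bellman optimality backup for every state.
        new_V = [
            R[s] + gamma * max(
                sum(p * V[t] for t, p in P[s][a]) for a in range(len(P[s]))
            )
            for s in range(n)
        ]
        # Stop once successive iterates agree to within tol.
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V


# Hypothetical two-state deterministic environment:
# action 0 stays in place, action 1 moves to the other state.
P = [
    [[(0, 1.0)], [(1, 1.0)]],  # from state 0
    [[(1, 1.0)], [(0, 1.0)]],  # from state 1
]

# Same dynamics, two different reward functions.
V_a = optimal_values(P, R=[0.0, 1.0])
V_b = optimal_values(P, R=[1.0, 0.0])
```

With the dynamics fixed, swapping in a different reward function just means rerunning the same procedure.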
(John made a post, I’ll just post this here so others can find it: https://www.lesswrong.com/posts/Dx9LoqsEh3gHNJMDk/fixing-the-good-regulator-theorem)