I also didn’t understand that. I was thinking of it more like AlphaStar in the sense that your prior is that you’re going to continue using your current (probabilistic) policy for all the steps involved in what you’re thinking about.
(But not like AlphaStar in that the brain is more likely to do a rollout of one or a few steps using clever hierarchical abstract representations of plans, rather than a dozens-of-steps rollout in a simple one-step-at-a-time way.)
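To make that contrast a bit more concrete, here's a toy Python sketch. Everything in it (the action set, the policy weights, the dynamics, the 30-step horizon) is made up purely for illustration; the point is just "keep running your current stochastic policy and see where you land" versus summarizing the whole plan in one abstract step.

```python
import random

ACTIONS = ["left", "right", "stay"]

def policy(state):
    # Current (probabilistic) policy: a fixed toy distribution over
    # actions. A real policy would depend on state; this one doesn't.
    return random.choices(ACTIONS, weights=[0.2, 0.6, 0.2])[0]

def step(state, action):
    # Toy 1-D dynamics: "right" drifts toward larger positions.
    return state + {"left": -1, "right": +1, "stay": 0}[action]

def rollout(state, n_steps):
    # AlphaStar-ish evaluation: the prior is that you keep following
    # your current policy for every remaining step, one step at a time.
    for _ in range(n_steps):
        state = step(state, policy(state))
    return state

# Dozens-of-steps, one-primitive-action-at-a-time rollout:
print(rollout(0, 30))

def abstract_rollout(state, n_steps=30):
    # The alternative gestured at above: one "macro" step that
    # summarizes the expected effect of the whole plan, instead of
    # simulating every primitive step.
    expected_drift_per_step = 0.6 * 1 + 0.2 * (-1)  # E[step] under policy
    return state + n_steps * expected_drift_per_step

print(abstract_rollout(0))
```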
See my answer to Gurkenglas.
My understanding of planning-by-inference (aka active inference?) is that it's not so much like AlphaStar. More to say here, but I'm out of time at the moment.
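For what it's worth, the textbook version of planning-as-inference (condition on succeeding, then infer which actions you must have taken) looks something like the toy sketch below. This is the generic formulation with made-up numbers, not a claim about what the brain does:

```python
ACTIONS = ["left", "right", "stay"]
PRIOR = {a: 1 / 3 for a in ACTIONS}  # uniform prior over actions

def p_success_given(action):
    # Toy likelihood: probability the goal is reached given `action`.
    return {"left": 0.1, "right": 0.8, "stay": 0.3}[action]

# Posterior over actions, conditioning on success:
#   p(a | success) ∝ p(success | a) * p(a)
unnorm = {a: p_success_given(a) * PRIOR[a] for a in ACTIONS}
z = sum(unnorm.values())
posterior = {a: w / z for a, w in unnorm.items()}

# "Planning" here means acting from this posterior, rather than rolling
# out the current policy forward as in the AlphaStar-style picture.
print(posterior)
```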