I think there is some disagreement here, at least in how I am using model-based / model-free RL (I am not sure exactly how you are using it). Model-based RL, at least to me, is not just about explicitly having some kind of model, which I think we both agree exists in cortex, but about the action selection system actually using that model to do explicit rollouts for planning. I do not think the basal ganglia (BG) does this, while I think the PFC has some meta-learned ability to do so. In this sense, the BG is ‘model-free’ while the cortex is ‘model-based’.
Huh. I’d agree that’s an important distinction, but a model can also be leveraged for learning. The way I’d normally use the term, actor-critic architectures fall on a spectrum of “modeliness” depending on how “modely” the critic is, even if the actor is a non-recursive, non-modely architecture. I think this is relevant to shard theory because the best arguments about shards involve inner alignment failure in models that are model-free in my stricter sense.
So, I agree, and I think we are getting at the same thing (though I am not completely sure what you are pointing at). The way to get a model-y critic and actor is to have both perform model-free RL over the latent space of your unsupervised world model. This is the key point of my post, and it is why humans can have ‘values’ and desires for highly abstract linguistic concepts such as ‘justice’, as opposed to only pure sensory states or primary rewards.
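To make "model-free RL over the latent space" concrete, here is a minimal toy sketch, not a claim about how the brain or any particular system implements it. Everything in it is an assumption for illustration: the five-position chain environment, the nuisance bit in the observation, and especially `encode`, which is a hand-written, hypothetical stand-in for the encoder of an unsupervised world model. The actor and critic are plain tabular TD learners that only ever see the latent state, and there are no rollouts through a model anywhere, so this is model-free in the stricter sense.

```python
import math
import random

N_POS = 5           # positions on a chain; reaching the last one is rewarded
ACTIONS = (-1, +1)  # step left or step right


def observe(pos, rng):
    """Raw 'sensory' observation: the position plus an irrelevant nuisance bit."""
    return 2 * pos + rng.randint(0, 1)


def encode(obs):
    """Hypothetical stand-in for an unsupervised world model's encoder.

    Maps a raw observation to a latent state, discarding the nuisance bit.
    In a real system this mapping would be learned, not hand-written.
    """
    return obs // 2


def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=2000, alpha_v=0.1, alpha_pi=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    value = [0.0] * N_POS                       # critic: V(z) over latent states
    prefs = [[0.0, 0.0] for _ in range(N_POS)]  # actor: action preferences per latent
    for _ in range(episodes):
        pos = 0
        for _ in range(100):  # cap on episode length
            z = encode(observe(pos, rng))
            probs = softmax(prefs[z])
            a = 0 if rng.random() < probs[0] else 1
            next_pos = min(max(pos + ACTIONS[a], 0), N_POS - 1)
            done = next_pos == N_POS - 1
            reward = 1.0 if done else 0.0
            z_next = encode(observe(next_pos, rng))

            # One-step TD error computed purely from latents and reward;
            # the world model is never used to simulate future states.
            target = reward + (0.0 if done else gamma * value[z_next])
            td_error = target - value[z]
            value[z] += alpha_v * td_error

            # Policy-gradient step on log pi(a | z), scaled by the TD error.
            for b in range(2):
                grad = (1.0 if b == a else 0.0) - probs[b]
                prefs[z][b] += alpha_pi * td_error * grad

            pos = next_pos
            if done:
                break
    return value, prefs


if __name__ == "__main__":
    value, prefs = train()
    print("critic values over latents:", [round(v, 2) for v in value])
    print("P(step right) per latent:", [round(softmax(p)[1], 2) for p in prefs])
```

The point of the toy is only that the values and the policy are defined over whatever the encoder outputs. Swap the hand-written `encode` for a learned representation that carves the world into abstract categories, and the same model-free machinery ends up attaching value to those abstractions rather than to raw sensory states.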