Nice! I think the general lesson here might be that when an agent has predictive representations (from a model, a value function, or successor representations), the updates driven by those predictions can “outpace” the updates from the base credit assignment algorithm, by changing things upstream of the contexts that the credit assignment acts on.
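A toy illustration of the "outpacing" part, which may or may not match the setting you had in mind: on a small deterministic chain, an agent holding a successor representation revalues every state as soon as the reward at the end changes, while plain TD(0) needs roughly one pass per state to propagate the change backward. Everything here (the chain, `chain_transition`, `successor_representation`, `td0_sweeps`, the reward flip) is a hypothetical setup chosen for the sketch, and it assumes the successor representation is already known exactly.

```python
import numpy as np

def chain_transition(n):
    """Deterministic left-to-right chain; the last state is terminal (all-zero row)."""
    P = np.zeros((n, n))
    for s in range(n - 1):
        P[s, s + 1] = 1.0
    return P

def successor_representation(P, gamma):
    """Discounted expected future state occupancies: M = (I - gamma * P)^-1."""
    return np.linalg.inv(np.eye(P.shape[0]) - gamma * P)

def td0_sweeps(V, P, r, gamma, alpha, sweeps):
    """Run TD(0) for a number of start-to-end passes over the chain (expected updates)."""
    V = V.copy()
    for _ in range(sweeps):
        for s in range(len(V)):
            target = r[s] + gamma * (P[s] @ V)
            V[s] += alpha * (target - V[s])
    return V

n, gamma = 6, 0.9
P = chain_transition(n)
M = successor_representation(P, gamma)

r_old = np.zeros(n); r_old[-1] = 1.0    # reward at the end of the chain
r_new = np.zeros(n); r_new[-1] = -1.0   # assumption: the reward flips sign

V_old = M @ r_old    # both agents start with correct values for r_old
V_true = M @ r_new   # correct values after the flip

# Predictive-representation agent: re-reading values through M updates every state at once.
V_sr = M @ r_new

# TD(0) agent: one pass with alpha = 1 (the most favorable step size) only fixes the last state.
V_td_one_pass = td0_sweeps(V_old, P, r_new, gamma, alpha=1.0, sweeps=1)
V_td_n_passes = td0_sweeps(V_old, P, r_new, gamma, alpha=1.0, sweeps=n)

print("SR, immediately:     ", np.round(V_sr, 3))
print("TD(0), after 1 pass: ", np.round(V_td_one_pass, 3))
print("TD(0), after n passes:", np.round(V_td_n_passes, 3))
```

After one pass the TD(0) agent still assigns a positive value to the start state, so anything upstream that conditions on that value keeps seeing the stale estimate, while the SR-based agent has already switched sign everywhere.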