Very late to the party here. I don’t know how much of the thinking in this post you still endorse or are still interested in. But this was a nice read. I wanted to add a few things:
- since you wrote this piece back in 2021, I have learned there is a whole mini-field of computer science dealing with multi-objective reward learning, maybe centered around . Maybe a good place to start there is https://link.springer.com/article/10.1007/s10458-022-09552-y
- The shard theory folks have done a fairly good job sketching out broad principles but it seems to me the homeostatic regulation does a great job of modulating which values happen to be relevant at any one time—Xavier Roberts-Gaal recently recommended “Where do values come from?” to me and that paper sketches out a fairly specific theory for how this happens (I think it might be that more homeostatic recalculation happens physiologically rather than neurologically, but otherwise buy what they are saying)
- Continue to think the vmPFC is relevant because different parts are known to calculate value of different aspects of stimuli; this can be modulated by state from time to time. a recent paper in this by Luke Chang & colleagues is a neural signature of reward
Very late to the party here. I don’t know how much of the thinking in this post you still endorse or are still interested in. But this was a nice read. I wanted to add a few things:
- since you wrote this piece back in 2021, I have learned there is a whole mini-field of computer science dealing with multi-objective reward learning, maybe centered around . Maybe a good place to start there is https://link.springer.com/article/10.1007/s10458-022-09552-y
- The shard theory folks have done a fairly good job sketching out broad principles but it seems to me the homeostatic regulation does a great job of modulating which values happen to be relevant at any one time—Xavier Roberts-Gaal recently recommended “Where do values come from?” to me and that paper sketches out a fairly specific theory for how this happens (I think it might be that more homeostatic recalculation happens physiologically rather than neurologically, but otherwise buy what they are saying)
- Continue to think the vmPFC is relevant because different parts are known to calculate value of different aspects of stimuli; this can be modulated by state from time to time. a recent paper in this by Luke Chang & colleagues is a neural signature of reward