Does this indicate a correct understanding of your comment, and does it address your point?
I think the core thing going on with my comment is that I think for humans most mentally accessible preferences are instrumental, and the right analogy for them is something like ‘value functions’ instead of ‘reward’ (as in RL).
Under this view, preference alteration is part of normal operation, and so should probably be cast as a special case of ordinary value learning, rather than something that exists only in this context. When someone who initially dislikes the smell of coffee grows to like it, I don’t think that’s directly because it’s cognitively costly to keep two books; rather, it’s because they have some anticipation-generating machinery that goes from anticipating bad things about coffee to anticipating good things about coffee.
[It is indirectly about cognitive costs, in that if it were free you might store all your judgments ever, but from a functional perspective downweighting obsolete beliefs isn’t that different from forgetting them.]
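Here is a minimal sketch of that picture (my own toy, with made-up numbers, not anything from the comment itself): an estimate that starts out negative drifts positive simply because experience keeps coming back better than anticipated, with no separate “alter my preference” step ever being taken.

```python
# Preference change as ordinary value learning (illustrative assumptions only).

ALPHA = 0.2                      # learning rate (an assumption)
value_of_coffee_smell = -1.0     # initial dislike

# Twenty experiences where coffee turns out mildly pleasant / useful.
for observed in [0.5] * 20:
    # Standard incremental update: V <- V + alpha * (observed - V).
    value_of_coffee_smell += ALPHA * (observed - value_of_coffee_smell)

print(round(value_of_coffee_smell, 3))   # ends up near +0.5
```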
And so it seems like there are three cases worth considering: given a norm that people should root for the sports team where they grew up, I can 1) privately prefer Other team while publicly rooting for Local team, 2) come to actually prefer Local team in order to not have to lie to myself, or 3) actually prefer Local team for some other reason. (Maybe I trust the thing that generated the norm is wiser than I am, or whatever.)
Maybe another way to think about this is how the agent relates to the social reward gradient; if it’s just a fact of the environment, then it makes sense to learn about it the way you would learn about coffee, whereas if it’s another agent influencing you as you influence it, then it makes sense to keep separate books, and only not do so when the expected costs outweigh the expected rewards.
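As a toy illustration of that cost-benefit condition (all quantities here are invented for the sketch), “keeping two books” just means tracking a private value and a publicly expressed value, and collapsing them only when the bookkeeping is no longer worth what the private preference buys:

```python
# Toy "two books" sketch; the numbers are illustrative assumptions.

private_value = -0.3            # genuinely prefers Other team
public_value = 0.8              # what the social norm rewards expressing

cost_of_two_books = 0.2         # ongoing bookkeeping / risk of slipping up
gain_from_private_pref = 0.1    # what acting on the private preference is worth

if cost_of_two_books > gain_from_private_pref:
    # Collapse the books: genuinely adopt the public preference, the way one
    # comes to actually like coffee rather than merely pretending to.
    private_value = public_value

print(private_value)
```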
I think for humans most mentally accessible preferences are instrumental, and the right analogy for them is something like ‘value functions’ instead of ‘reward’ (as in RL).
I agree. As far as I can tell, people seem to be predicting their on-policy Q function when considering different choices. See also attainable utility theory and the gears of impact.
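A minimal sketch of what “predicting their on-policy Q function” could look like computationally (my own toy example; the states, actions, rewards, and parameters are all assumptions): the agent evaluates choices by reading off Q estimates learned under the policy it actually follows, rather than by consulting raw reward directly.

```python
import random

# SARSA learns Q(s, a) under the epsilon-greedy policy actually being followed,
# and "considering different choices" means comparing those learned estimates.

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

STATES = {"tired": ["make_coffee", "skip_coffee"], "alert": ["work"]}
Q = {(s, a): 0.0 for s, acts in STATES.items() for a in acts}

def step(state, action):
    # Made-up dynamics: coffee reliably leads to "alert", which pays off later.
    if state == "tired":
        if action == "make_coffee":
            return -0.1, "alert"      # small immediate cost (time, bitterness)
        return 0.0, None              # episode ends without the later payoff
    return 1.0, None                  # "alert" -> productive work, episode ends

def act(state):
    # Epsilon-greedy over the current Q: the policy the Q function is "on".
    if random.random() < EPSILON:
        return random.choice(STATES[state])
    return max(STATES[state], key=lambda a: Q[(state, a)])

for _ in range(2000):
    s, a = "tired", act("tired")
    while True:
        r, s_next = step(s, a)
        if s_next is None:
            Q[(s, a)] += ALPHA * (r - Q[(s, a)])
            break
        a_next = act(s_next)  # on-policy: bootstrap from the action actually taken
        Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])
        s, a = s_next, a_next

# After learning, "make_coffee" looks good even though its immediate reward is
# negative: the mentally accessible preference is the value estimate, not the reward.
print(sorted(Q.items(), key=lambda kv: -kv[1]))
```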