I’m trying to parse what you’re saying here, to figure out whether I agree that human behavior doesn’t seem to be almost perfectly explained as the result of an RL agent (with an interesting internal architecture) maximizing an inner learned reward.
What do you mean by “inner learned reward”? This post points out that even if humans were “pure RL agents”, we shouldn’t expect them to maximize their own reward. Maybe you mean “inner mesa objectives”?