I’m trying to parse what you’re saying here, to figure out whether I agree that human behavior doesn’t seem to be almost perfectly explained as the result of an RL agent (with an interesting internal architecture) maximizing an inner learned reward.
What do you mean by “inner learned reward”? This post points out that even if humans were “pure RL agents”, we shouldn’t expect them to maximize their own reward. Maybe you mean “inner mesa objectives”?