The reward function being a single, unstructured scalar in RL practice seems weird to me, and doesn't align with my intuition of learning from roughly continuous interaction. Something more like a Kahneman-ish multi-channel reward, with per-channel weights kept distinguishable and flexible over time, seems like it would yield a more realistic, full-blown model.
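To make that concrete, here is a minimal sketch of the idea, assuming made-up channel names, outcome values, and weights (this is not any existing RL library's API): a vector of per-channel rewards gets collapsed into the single scalar that standard RL expects, and the same outcome is ranked differently as the weights drift.

```python
import numpy as np

# Hypothetical multi-channel reward: each channel scores one aspect of an
# outcome. Channel names, values, and weights are invented for illustration.
CHANNELS = ["comfort", "novelty", "social"]

def channel_rewards(outcome):
    # One reward per channel -- the vector signal the comment gestures at.
    return np.array([outcome.get(c, 0.0) for c in CHANNELS])

def scalar_reward(outcome, weights):
    # Collapse the vector into the single scalar that standard RL expects.
    return float(weights @ channel_rewards(outcome))

outcome = {"comfort": 1.0, "novelty": 0.2, "social": -0.5}
print(scalar_reward(outcome, np.array([0.6, 0.3, 0.1])))   # ~0.61
print(scalar_reward(outcome, np.array([0.1, 0.3, 0.6])))   # ~-0.14
```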
While I agree with you, I also acknowledge that changing the weights of a multidimensional model is an inconsistency that violates the VNM utility axioms, and it means the agent can be money-pumped (making repeated locally-preferable decisions that each lose some long-term value for the agent).
Any actual decision is a selection of the top choice along a single dimension (“what I choose”). If that partial ranking is inconsistent, the agent is not rational.
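A toy sketch of the money pump, with all numbers invented: a two-channel reward is collapsed to a scalar with weights that flip between rounds, and a broker offers a swap each round while skimming a small fee. Every swap looks locally preferable under that round's weights, yet after each full cycle the agent's bundle is strictly worse on both channels, which any fixed non-negative weighting would call a loss.

```python
import numpy as np

FEE = 0.1
ARCHETYPES = {"A": np.array([3.0, 0.0]), "B": np.array([0.0, 3.0])}

def value(bundle, weights):
    # Scalar "what I choose" ranking induced by this round's weights.
    return float(weights @ bundle)

holding_name, fees_paid = "A", 0.0
# Weights drift back and forth between the two channels (the inconsistency).
weight_schedule = [np.array([0.1, 0.9]), np.array([0.9, 0.1])] * 3

for w in weight_schedule:
    holding = ARCHETYPES[holding_name] - fees_paid        # current (degraded) bundle
    other = "B" if holding_name == "A" else "A"
    offered = ARCHETYPES[other] - (fees_paid + FEE)       # broker skims FEE per swap
    if value(offered, w) > value(holding, w):             # locally preferable...
        holding_name, fees_paid = other, fees_paid + FEE
        print(f"swap to {holding_name}, now holding {ARCHETYPES[holding_name] - fees_paid}")

# After every two swaps the agent is back to an "A"-type bundle, but both
# channels have shrunk by 2 * FEE -- worse under any fixed non-negative weights.
```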
The resolution, of course, is to recognize that humans are not rational. https://en.wikipedia.org/wiki/Dynamic_inconsistency gives some pointers to how well we know that’s true. I don’t have any references, but I would enjoy seeing papers or writeups on what it even means for a rational agent to be “aligned” with irrational ones.