You can substitute “utility” for “reward”, if you prefer. Reinforcement learning is a fairly general framework, except for its insistence on a scalar reward signal. If you talk to RL folk about the need for multiple reward signals, they say that sticking that information in the sensory channels is mathematically equivalent—which is kinda true.
I don’t endorse Legg’s formalization because it is limited to reinforcement learning agents.
That’s a good reason, and you should make that explicit.
Good point.
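(A minimal sketch of the "stick it in the sensory channels" move gestured at above, assuming a toy environment interface invented here for illustration — `MultiRewardEnv`, its `step` method, and the weights `w` are not from any real library. The reward vector is appended to the observation so the agent can see every component, while the learner itself only ever receives the scalar aggregate.)

```python
import numpy as np

class MultiRewardEnv:
    """Hypothetical toy environment that emits a VECTOR of reward components."""
    def reset(self):
        return np.zeros(4)                         # initial observation
    def step(self, action):
        obs = np.random.randn(4)
        reward_vec = np.array([1.0, -0.2, 0.5])    # e.g. task, energy, safety terms
        done = False
        return obs, reward_vec, done

class ScalarizedWrapper:
    """Folds the reward vector into the observation; exposes only a scalar reward.

    The agent still observes every reward component (via its sensory channel),
    but the RL machinery only ever optimizes the single scalar w . r.
    """
    def __init__(self, env, weights):
        self.env = env
        self.w = np.asarray(weights, dtype=float)

    def reset(self):
        obs = self.env.reset()
        # Pad with zeros where the reward components will appear after step().
        return np.concatenate([obs, np.zeros_like(self.w)])

    def step(self, action):
        obs, reward_vec, done = self.env.step(action)
        scalar_reward = float(self.w @ reward_vec)   # what the learner optimizes
        obs = np.concatenate([obs, reward_vec])      # what the learner observes
        return obs, scalar_reward, done

env = ScalarizedWrapper(MultiRewardEnv(), weights=[1.0, 1.0, 1.0])
obs = env.reset()
obs, r, done = env.step(action=0)
```

One way to read the "kinda": the information survives the transformation, but the optimization target is still whatever the fixed scalarization `w` says it is, so the agent's preferences over the components are baked in rather than learned or represented separately.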