You can substitute “utility” for “reward”, if you prefer. Reinforcement learning is a fairly general framework, except for its insistence on a scalar reward signal. If you talk to RL folk about the need for multiple reward signals, they say that sticking that information in the sensory channels is mathematically equivalent—which is kinda true.
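To make the "stick it in the sensory channels" move concrete, here is a minimal sketch, using hypothetical toy classes rather than any particular RL library's API: an environment that emits a vector of reward signals gets wrapped so the agent sees a single scalar reward, while the individual components are appended to the observation and so remain visible to the policy.

```python
# Minimal sketch of folding multiple reward signals into the observation.
# All class and function names here are illustrative, not a real library's API.

from dataclasses import dataclass
from typing import List, Tuple
import random


@dataclass
class Step:
    observation: List[float]
    reward_vector: List[float]  # several reward signals, e.g. [task, energy, safety]
    done: bool


class MultiRewardEnv:
    """Toy environment that emits a *vector* of reward signals each step."""

    def step(self, action: int) -> Step:
        obs = [random.random(), random.random()]
        rewards = [float(action), -0.1 * random.random(), 0.0]
        return Step(obs, rewards, done=False)


class ScalarizedEnv:
    """Standard scalar-reward view of the multi-reward environment.

    The reward vector is collapsed to a weighted sum (the scalar RL insists on),
    and the raw components are appended to the observation, so the information
    itself is not lost -- the sense in which the two setups are equivalent.
    """

    def __init__(self, env: MultiRewardEnv, weights: List[float]):
        self.env = env
        self.weights = weights

    def step(self, action: int) -> Tuple[List[float], float, bool]:
        s = self.env.step(action)
        scalar = sum(w * r for w, r in zip(self.weights, s.reward_vector))
        obs = s.observation + s.reward_vector  # components survive as "sensory" input
        return obs, scalar, s.done


if __name__ == "__main__":
    env = ScalarizedEnv(MultiRewardEnv(), weights=[1.0, 1.0, 1.0])
    obs, reward, done = env.step(action=1)
    print(obs, reward, done)
```

Note, though, that the weights collapsing the vector into one scalar still have to be fixed up front, which is roughly where the "kinda" in "kinda true" does its work.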