Reinforcement learning uses a single scalar reward, by definition. Animals don’t really work like that: they have many pain sensors, and those signals are never all combined into one, since some of them never get further than spinal cord reflexes.
However, the source of the reward does not seem to me to be a piece of RL dogma, though sure, some models put the reward in the agent’s “environment”.
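
To make the distinction concrete, here is a minimal Python sketch, not taken from any particular RL library. The sensor weights, the function names, and the weighted-sum scalarization are all assumptions made for illustration. It shows that the reward is one scalar whether the environment computes it or the agent does.

```python
from typing import List, Tuple

# Hypothetical weights for three pain sensors (illustrative values only).
PAIN_WEIGHTS = [1.0, 0.5, 2.0]


def scalar_reward(pain_signals: List[float]) -> float:
    """Collapse several pain signals into the single scalar that RL requires.

    A weighted sum is just one assumed way of doing the collapsing.
    """
    return -sum(w * p for w, p in zip(PAIN_WEIGHTS, pain_signals))


def env_step_reward_outside(pain_signals: List[float]) -> Tuple[List[float], float]:
    """Convention A: the environment computes the reward and hands it over."""
    return pain_signals, scalar_reward(pain_signals)


def env_step_reward_inside(pain_signals: List[float]) -> List[float]:
    """Convention B: the environment returns only the raw sensations."""
    return pain_signals


signals = [0.0, 2.0, 1.0]
_, reward_a = env_step_reward_outside(signals)
# Under convention B the agent turns sensations into a reward itself.
reward_b = scalar_reward(env_step_reward_inside(signals))
print(reward_a, reward_b)
```

Either way the learning algorithm sees one number per step. What the sketch cannot capture is the animal case, where some signals are acted on locally and never reach any summing point at all.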