timtyler comments on Muehlhauser-Hibbard Dialogue on AGI

timtyler 10 Jul 2012 0:31 UTC
−2 points
Reinforcement learning uses a single scalar reward, by definition. Animals don't really work like that: they have many pain sensors, and their signals are never all combined into one value, since some of them never get further than spinal cord reflexes.
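
To make the contrast concrete, here is a rough sketch. It is a toy illustration, not a model of any real nervous system or RL library: the function names, the equal weighting, and the reflex threshold are all made up for the example.

```python
from typing import List, Sequence, Tuple


def scalar_reward(pain_signals: Sequence[float]) -> float:
    """Textbook RL view: every pain signal is collapsed into one number.

    Equal weighting is an assumption made purely for illustration.
    """
    return -sum(pain_signals)


def route_with_reflexes(
    pain_signals: Sequence[float], reflex_threshold: float
) -> Tuple[List[int], float]:
    """Hypothetical 'animal-like' routing.

    Signals at or above the threshold are handled by a local reflex and
    never reach the central learner; only the rest are summed into a reward.
    """
    reflex_sensors = [
        i for i, s in enumerate(pain_signals) if s >= reflex_threshold
    ]
    central = [s for s in pain_signals if s < reflex_threshold]
    return reflex_sensors, scalar_reward(central)


signals = [0.25, 1.0, 0.5]
print(scalar_reward(signals))              # all three sensors feed one scalar
print(route_with_reflexes(signals, 0.75))  # sensor 1 is handled by reflex only
```

In the second function the central learner never sees the strongest signal at all, which is the sense in which the sensors are not all combined into a single reward.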

However, the source of the reward does not seem to me to be a piece of RL dogma, though sure, some models put the reward in the agent's "environment".