We-the-devs choose when to dole out these reinforcement events, either handwriting a simple reinforcement algorithm to do it for us or having human overseers give out reinforcement.
I would unify those by saying: We get to write a reward function however we want. The reward function can depend on external inputs, if that’s what we want. One of those external inputs can be a “reward” button, if that’s what we want. And the reward function can be a single trivial line of code, return (100 if reward_button_is_being_pressed else 0), if that’s what we want.
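To make that concrete, here is a minimal Python sketch (mine, not from the post) of such a handwritten reward function; the external_inputs dict and the "reward_button" key are hypothetical names:

    def reward_function(external_inputs: dict) -> float:
        """Handwritten reward function; the devs can make it anything."""
        # One of the external inputs can be a human overseer's "reward" button...
        reward_button_is_being_pressed = external_inputs.get("reward_button", False)
        # ...and the whole function can be this one trivial line:
        return 100 if reward_button_is_being_pressed else 0

    reward_function({"reward_button": True})   # -> 100
    reward_function({"reward_button": False})  # -> 0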
Therefore, “having human overseers give out [reward]” is subsumed by “handwriting a simple [reward function]”.
(I don’t think you would wind up with an AGI at all with that exact reward function, and if you did, I think the AGI would probably kill everyone. But that’s a different issue.)
(Further discussion in §8.4 here.)
(By the way, I’m assuming “reinforcement” is synonymous with reward, let me know if it isn’t.)
I am using “reinforcement” synonymously with “reward,” yes!
Important to note: the brain uses reinforcement and reward differently than ML does. The brain is primarily based on associative learning (Hebbian learning: neurons that fire together wire together), and reward/surprise signals act as optional modifiers of the learning rate. Generally speaking, events that are surprisingly rewarding or punishing cause a temporary increase in the rate at which associations form. Since we’re talking about trying to translate brain learning functions into similar-in-effect-but-different-in-mechanism machine learning methods, we should be careful to distinguish human-brain terms from ML terms; some of these terms were borrowed from neuroscience but applied to ML concepts that don’t quite match. Having a unified jargon seems important for translating the functions accurately.
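As a toy illustration of that picture (my own sketch, not from the comment; all names and constants are made up), here is plain Hebbian association formation with a reward/surprise signal temporarily scaling the learning rate:

    import numpy as np

    def hebbian_update(w, pre, post, reward, expected_reward,
                       base_lr=0.01, surprise_gain=0.1):
        """One associative update; surprising reward OR punishment speeds learning."""
        surprise = reward - expected_reward                    # reward-prediction error
        lr = base_lr * (1.0 + surprise_gain * abs(surprise))   # temporarily boosted rate
        # "Neurons that fire together wire together": strengthen co-active pairs.
        return w + lr * np.outer(post, pre)

Note that the learning rate scales with the magnitude of the surprise, so surprisingly rewarding and surprisingly punishing events both speed up association formation, while fully expected reward leaves the baseline rate unchanged.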