I would add that there are other implementations of reinforcement learning where the dog metaphor is closer to correct. The example you give is for the most basic kind of RL, called policy gradients.
When you say “the dog metaphor”, do you mean the original one with the biscuit, or the later one with the killing and breeding?
Biscuit