I agree that the brain uses temporal difference learning. My understanding of temporal difference learning is that the reward signal propagates back to the earliest reliable predictor of the reward, driven by the difference between expected and observed reward, and then reinforces that stimulus. How is that different from the quoted text, other than that the quoted text is simpler and doesn’t use that language?
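To make concrete what I mean by temporal difference learning, here is a minimal sketch of tabular TD(0) on a toy cue-then-reward trial. Everything in it is my own illustrative assumption rather than a model of any real experiment: the function name `run_td`, the trial length, the cue and reward timings, the learning rate, and the choice to hold pre-cue values at zero (standing in for a cue that arrives unpredictably).

```python
def run_td(n_trials=500, n_steps=10, cue_t=2, reward_t=7, alpha=0.1, gamma=1.0):
    """Tabular TD(0) on a toy trial: cue at step cue_t, reward at step reward_t.

    Hypothetical setup for illustration only. Returns the per-step
    prediction errors from the first and the last trial.
    """
    # One value estimate per time step; V[n_steps] is a terminal state fixed at 0.
    V = [0.0] * (n_steps + 1)
    first_deltas, last_deltas = None, None

    for trial in range(n_trials):
        deltas = []
        for t in range(n_steps):
            r = 1.0 if t == reward_t else 0.0
            # Prediction error: observed (reward + next estimate) minus expected.
            delta = r + gamma * V[t + 1] - V[t]
            deltas.append(delta)
            # Assumption: steps before the cue carry no predictive information,
            # so their values are never learned and stay at 0.
            if t >= cue_t:
                V[t] += alpha * delta
        if trial == 0:
            first_deltas = deltas
        last_deltas = deltas

    return first_deltas, last_deltas


if __name__ == "__main__":
    first, last = run_td()
    print("first trial, error at reward:", round(first[7], 3))
    print("last trial,  error at reward:", round(last[7], 3))
    print("last trial,  error at cue onset:", round(last[1], 3))
```

In this sketch the prediction error starts out at the reward and, after training, shows up at the transition into the cue instead, which is the "propagates back to the earliest reliable stimulus" behavior I was describing.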