I think my question is deeper—why do machines ‘want’ or ‘have a goal to’ follow the algorithm to maximize reward? How can machines ‘find stuff rewarding’?
For current systems, the answer (as far as anyone knows) is that they don’t find things rewarding or want anything. But they can still run a search that optimizes a training signal, and that is enough to give you an agent.
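To make that concrete, here is a minimal sketch (the names and the toy reward function are made up for illustration, not any real training stack): a "policy" is just a list of numbers, and plain random search pushes it toward higher reward. Nothing in the loop wants anything; it simply keeps whichever candidate scores higher.

```python
import random

def reward(policy):
    # Hypothetical training signal: higher when the policy's parameters
    # are close to some arbitrary target values.
    target = [1.0, -2.0, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(policy, target))

def random_search(steps=1000, noise=0.1):
    # Start from a random policy and hill-climb on the reward signal.
    best = [random.gauss(0, 1) for _ in range(3)]
    best_score = reward(best)
    for _ in range(steps):
        candidate = [p + random.gauss(0, noise) for p in best]
        score = reward(candidate)
        if score > best_score:  # keep whichever policy scores higher
            best, best_score = candidate, score
    return best, best_score

policy, score = random_search()
print(policy, score)
```

The output policy behaves "as if" it is trying to hit the target, but the only thing that happened is that a comparison kept higher-scoring candidates.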
Thanks!