I agree that if you have a model of the system (as you do when you know the rules of the game), you can simulate potential actions and consequences, and that seems like search.
Usually, you don’t have a good model of the system, and then you need something else.
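To make the first point concrete, here's a minimal sketch of what "simulate potential actions and consequences" means when you do have a model. Everything here (`model`, `value`, the toy numbers) is a hypothetical stand-in, not anyone's actual system:

```python
# Search with a model: if we can simulate each action's consequence,
# we can just try every action and keep the best one. The model and
# value function below are illustrative stand-ins.

def search(state, actions, model, value):
    """Pick the action whose simulated successor state looks best."""
    return max(actions, key=lambda a: value(model(state, a)))

# Toy example: state is a number, actions add to it, and the value
# function prefers ending up close to 10.
best = search(
    state=0,
    actions=[1, 3, 7],
    model=lambda s, a: s + a,
    value=lambda s: -abs(s - 10),
)
# best is 7, since 0 + 7 is closest to 10
```

The point being: the whole thing hinges on `model` being available and cheap to query, which is exactly what you lose when you don't know the rules of the game.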
Maybe with some forms of supervised learning you can either calculate the solution directly or just follow a gradient (it's arguable whether that counts as search), but with RL, surely the “explore” steps have to count as “search”?
I was thinking of following a gradient in supervised learning.
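A minimal sketch of what "following a gradient" looks like, and why it arguably isn't search: each step moves the parameter straight downhill, with no branching over alternatives and no simulation of consequences. The toy loss below is hypothetical:

```python
# Gradient following: deterministic local updates, no branching.
# Toy loss L(w) = (w - 3)^2, so its gradient is 2 * (w - 3).

def gradient_step(w, grad, lr=0.1):
    """One step of gradient descent: move against the gradient."""
    return w - lr * grad(w)

grad = lambda w: 2 * (w - 3)

w = 0.0
for _ in range(100):
    w = gradient_step(w, grad)
# w has converged very close to the minimum at 3
```

There's never a point where multiple candidate futures are compared, which is the intuition behind not calling it search.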
I agree that pure reinforcement learning with a sparse reward looks like search. I doubt that pure RL with sparse reward is going to get you very far.
Reinforcement learning with demonstrations or a very dense reward doesn’t really look like search, it looks more like someone telling you what to do and you following the instructions faithfully.
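The sparse-reward case above can be sketched with epsilon-greedy action selection, the standard way the "explore" step shows up in RL. Until the agent stumbles on a reward, the explore branch is effectively randomized search over actions; the numbers and Q-values here are illustrative:

```python
import random

# Epsilon-greedy: with probability epsilon, explore (random action);
# otherwise exploit the current best estimate. With a sparse reward,
# the explore branch is what does the searching.

def epsilon_greedy(q_values, epsilon, rng):
    """Return an action index: random with prob. epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

rng = random.Random(0)
# Toy Q-values: action 1 currently looks best.
picks = [epsilon_greedy([0.0, 1.0, 0.5], epsilon=0.1, rng=rng)
         for _ in range(1000)]
# The vast majority of picks exploit action 1; the rest are exploration.
```

With demonstrations or a very dense reward, epsilon can be near zero and the greedy branch dominates, which is the "following instructions faithfully" regime rather than search.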