As I understand it, you’re dividing the agent from the world; once you introduce a reward signal, you’ll be able to call it reinforcement learning. However, until you introduce a reward signal, you’re not doing specifically reinforcement learning—everything applies just as well to any other kind of agent, such as a classical planner.
That’s an excellent point. Of course one cannot introduce RL without talking about the reward signal, and I never intended to do so.
To me, however, the defining feature of RL is the structure of the solution space, described in this post. To you, it’s the existence of a reward signal. I’m not sure that debating this difference of opinion is the best use of our time at this point. I do hope to share my reasons in future posts, if only because they should be interesting in themselves.
As for your last point: RL is indeed a very general setting, and classical planning can easily be formulated in RL terms.
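To make that concrete, here is a minimal sketch of my own (not from the post; the grid, goal, and function names are all illustrative): a deterministic shortest-path problem of the kind a classical planner solves, written as an MDP where every step costs -1 and the goal is absorbing, so that maximizing cumulative reward is exactly finding the shortest plan.

```python
# A classical shortest-path planning problem recast as an MDP.
# States are cells of a small grid; actions are the four moves.
GRID_W, GRID_H = 4, 3
GOAL = (3, 2)
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def step(state, action):
    """Deterministic transition: move if in bounds, otherwise stay put."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < GRID_W and 0 <= ny < GRID_H:
        return (nx, ny)
    return state

def reward(state, action, next_state):
    """-1 per step, 0 once at the goal: maximizing return = shortest plan."""
    return 0.0 if state == GOAL else -1.0

# Value iteration over the finite state space (gamma = 1 is fine here
# because the goal is absorbing and every other step costs -1).
states = [(x, y) for x in range(GRID_W) for y in range(GRID_H)]
V = {s: 0.0 for s in states}
for _ in range(100):
    for s in states:
        if s == GOAL:
            continue
        V[s] = max(reward(s, a, step(s, a)) + V[step(s, a)] for a in ACTIONS)

def plan(state):
    """Acting greedily w.r.t. V reproduces the classical planner's solution."""
    actions = []
    while state != GOAL:
        a = max(ACTIONS,
                key=lambda a: reward(state, a, step(state, a)) + V[step(state, a)])
        actions.append(a)
        state = step(state, a)
    return actions

print(plan((0, 0)))  # a shortest 5-step plan from (0, 0) to the goal
```

Nothing about the planning problem changed; only the objective was rephrased as a reward signal, which is the sense in which classical planning fits the RL setting.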