It might be valuable to point out that nothing about this is reinforcement learning yet.
I’m not sure why you say this.
Please remember that this introduction is non-standard, so you may need to be an expert on standard RL to see the connection. And while some parts are not in place yet, this post does introduce what I consider to be the most important part of the setting of RL.
So I hope we’re not arguing over definitions here. If you expand on your meaning of the term, I may be able to help you see the connection. Or we may possibly find that we use the same term for different things altogether.
I should also explain why I’m giving a non-standard introduction, where a standard one would be more helpful in communicating with others who may know it. The main reason is that this will hopefully allow me to describe some non-standard and very interesting conclusions.
But since we are not experts on standard RL, we cannot see the connection.
Well, there you are. The setting. Not actual RL. So that’s two purely preliminary posts so far. When does the main act come on—the R and the L?
As I understand it, you’re dividing the agent from the world; once you introduce a reward signal, you’ll be able to call it reinforcement learning. However, until you introduce a reward signal, you’re not doing specifically reinforcement learning—everything applies just as well to any other kind of agent, such as a classical planner.
That’s an excellent point. Of course one cannot introduce RL without talking about the reward signal, and I’ve never intended to.
To me, however, the defining feature of RL is the structure of the solution space, described in this post. To you, it’s the existence of a reward signal. I’m not sure that debating this difference of opinion is the best use of our time at this point. I do hope to share my reasons in future posts, if only because they should be interesting in themselves.
As for your last point: RL is indeed a very general setting, and classical planning can easily be formulated in RL terms.
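To make that last claim concrete, here is a minimal sketch of my own (the class and names are invented for illustration, not taken from the post) showing one standard way to read a deterministic planning problem as an RL-style environment: the goal test becomes a reward of 1, every other step gives 0, and a plan that reaches the goal is simply a return-maximizing policy.

```python
# Minimal sketch (my illustration): a classical planning problem cast in RL terms.
# States and actions are discrete, transitions are deterministic, and the
# "reward signal" is 1 exactly when the goal state is reached.

class PlanningAsRL:
    def __init__(self, transitions, start, goal):
        # transitions: dict mapping (state, action) -> next state
        self.transitions = transitions
        self.start = start
        self.goal = goal
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        # Deterministic transition; an undefined action leaves the state unchanged.
        self.state = self.transitions.get((self.state, action), self.state)
        reward = 1.0 if self.state == self.goal else 0.0
        done = self.state == self.goal
        return self.state, reward, done

# A plan that reaches the goal is exactly a policy that collects the reward.
env = PlanningAsRL({("A", "go"): "B", ("B", "go"): "goal"}, start="A", goal="goal")
state = env.reset()
for action in ["go", "go"]:          # the "plan"
    state, reward, done = env.step(action)
print(state, reward, done)           # goal 1.0 True
```

With only a terminal reward, any goal-reaching plan maximizes return; to make the shortest plan the unique optimum one would add a discount factor or a small per-step cost, but even this bare version shows the planner and the RL agent facing the same agent-world interface.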