In this post, I want to highlight a fact which I did not see mentioned in the original post or the comments: in the field of reinforcement learning, there are agents which are pursuing the goal of building a system which maximizes a reward function, subject to some additional constraints. These agents are the capabilities researchers designing and implementing SotA reinforcement learning algorithms and other methods to build and test the most capable, general systems across a variety of domains.
I agree that this fact is worth pointing out. I have the feeling that I’ve mentioned it somewhere, but I couldn’t immediately find or cite where I elaborated on it over the last year. IIRC, I ended up thinking that the human-applied selection pressure was probably insufficient to actually produce policies which care about reward.
I’m not opposed to using standard shorthand when it’s clear to experienced practitioners what the author means, but I think in posts which discuss both policies and agents, it is important to keep these distinctions in mind and sometimes make them explicit.
I agree very much with your point here, and think this is a way in which I have been imprecise. I mean to write a post soon which advocates against using “agents” to refer to “policy networks”.
The observation that the Dreamer authors exerted strong optimization power to design an effective RL method is what led me to make the prediction here.
(As an aside, I mean to come back and read more about Dreamer and possibly take up a bet with you. I’ve been quite busy.)