I’m… not demanding that the field of RL change? Where in the post did you perceive me to demand this? For example, I wrote that “I wouldn’t say ‘reinforcement function’ in e.g. a conference paper.” I also took care to write “This terminology is loaded and inappropriate for my purposes.”
Each individual reader can choose to swap to “policy” without communication difficulties, in my experience:
Don’t wait for everyone to coordinate on saying “policy.” You can switch to “policy” right now and thereby improve your private thoughts about alignment, whether or not anyone else gets on board. I’ve enjoyed these benefits for a month. The switch didn’t cause communication difficulties.
(As an aside, I separately wish RL would change its terminology, but it’s a losing fight, as you point out, and I have better things to do with my time.)