So then one approach is to ask everyone to avoid the word “agent” in cases where those intuitions don’t apply, and the other is to ask everyone to constantly remind each other that the “agents” produced by RL don’t necessarily have thus-and-such properties.
I think there’s a way better third alternative: asking each reader to unilaterally switch to “policy.” No coordination, no constant reminders, no communication difficulties (in my experience). I therefore don’t see a case for using “agent” in the mentioned cases.
I added to the post:
Don’t wait for everyone to coordinate on saying “policy.” You can switch to “policy” right now and thereby improve your private thoughts about alignment, whether or not anyone else gets on board. I’ve enjoyed these benefits for a month. The switch didn’t cause communication difficulties.
I think there’s a way better third alternative: asking each reader to unilaterally switch to “policy.” No coordination, no constant reminders, no communication difficulties (in my experience). I therefore don’t see a case for using “agent” in the mentioned cases.
I added to the post: