Can approval-directed agents be considered a form of imitation learning? If not, are there any safety-relevant differences between imitation learning of (speeded-up) humans and approval-directed agents?
I think that the only reason to be interested in approval-directed agents rather than straightforward imitation learners is that it may be harder to effectively imitate behavior than to solve the same task in a very different way.
I found an old comment from Paul that answers this: