My critique is not of actor/critic training processes, but of actor/grader motivational designs. I worried that the word “critic” would make people think I object to using an evaluative model to provide gradients to the actor. That kind of training setup seems non-doomed to me.
Thank you! I’ve been using the terms “inference algorithm” versus “learning algorithm” to talk about that kind of thing. What you said seems fine too, AFAIK.