TurnTrout comments on Don’t design agents which exploit adversarial inputs

TurnTrout 21 Nov 2022 20:28 UTC
LW: 4 AF: 4
0
AF
My critique is not of actor/critic training processes, but of actor/grader motivational designs. I worried that “critic” would make people think I don’t want to use an evaluative model to provide gradients to the actor. That seems non-doomed to me.
- Steven Byrnes 28 Nov 2022 20:22 UTC
  LW: 2 AF: 2
  0
  AF Parent
  Thank you! I’ve been using the terms “inference algorithm” versus “learning algorithm” to talk about that kind of thing. What you said seems fine too, AFAIK.