adamShimi comments on Jitters No Evidence of Stupidity in RL

adamShimi 17 Sep 2021 6:57 UTC
13 points
I like this post. Clear thesis, concrete example, and an argument that makes sense.
One consequence of your point is that in situations where RL training is metaphorically energy-constrained (for example, a negative per-step reward that pushes the agent to go as fast as possible, or a narrow space where jittering might mean falling to one’s death and a really bad reward), we should not see jitters. Is that consistent with the literature?
- 1a3orn 17 Sep 2021 15:20 UTC
  3 points
  Parent
  Thanks! That’s definitely a consequence of the argument.
  
  It looks to me like that prediction is generally true, from what I remember of RL videos I’ve seen: the Breakout paddle moves much more smoothly when the ball is near, DeepMind’s agents move more smoothly when being chased in tag, and so on. I should definitely make a mental note to stay alert to possible exceptions, though. I’m not aware of anywhere it’s been treated systematically.
  - tailcalled 17 Sep 2021 20:34 UTC
    3 points
    Parent
    I feel like I once saw RL agents trained with and without energy costs, where the agents trained with energy costs acted a lot less jittery. But I can’t remember where I saw it.