But presumably, after thinking about it a few times during training, they will remember their conclusions for a while, and bring them to mind in whichever episodes they’re relevant.
Why do you think SGD will do this? Or are you imagining non-SGD mechanisms?
It seems non-obvious to me that this will occur with SGD, though possible.