Steven Byrnes comments on Open & Welcome Thread—July 2020

Steven Byrnes 22 Jul 2020 23:21 UTC
10 points
Yeah. There’s no gradient descent within a single episode, but if you have a network with input (as always) and with memory (e.g. an RNN) then its behavior in any given episode can be a complicated function of its input over time in that episode, which you can describe as “it figured something out from the input and that’s now determining its further behavior”. Anyway, everything you said is right, I think.