Thomas Larsen comments on Challenge: construct a Gradient Hacker

Thomas Larsen 20 Mar 2023 21:10 UTC
2 points
Following up to clarify this: the point is that this attempt fails 2a because if you perturb the weights along the connection $\nabla_{\theta} L(\theta) \longrightarrow \epsilon \cdot \mathrm{Id} \ \text{output}$, there is now a connection from the internal representation of $y$ to the output, and so training will send this thing to the function $f(D, \theta) \approx y$.
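
As a rough illustration of the mechanism (a toy sketch, not the construction from the attempt being discussed): assume a hypothetical model whose output is `c + a*b*y`, where `y` stands in for the internal representation of the label and the path from `y` to the output runs through two weights `a` and `b`. The names `a`, `b`, `c`, `train`, and `DATA`, and the choice of squared-error loss, are all assumptions made for this example. With `a = b = 0` the gradient with respect to both path weights is exactly zero, so the path stays dead. With `a = b = ε` the gradient is nonzero and plain gradient descent opens the path until the output matches `y`.

```python
# Toy model (assumed for illustration): output = c + a * b * y
# Loss: mean squared error between output and y.

DATA = [-1.0, 0.0, 1.0, 2.0]  # hypothetical label values, mean 0.5


def train(a, b, c, data, lr=0.05, steps=2000):
    """Run full-batch gradient descent on (a, b, c) and return the final weights."""
    n = len(data)
    for _ in range(steps):
        ga = gb = gc = 0.0
        for y in data:
            err = (c + a * b * y) - y      # prediction minus target
            ga += 2.0 * err * b * y / n    # dL/da is proportional to b
            gb += 2.0 * err * a * y / n    # dL/db is proportional to a
            gc += 2.0 * err / n            # dL/dc
        a, b, c = a - lr * ga, b - lr * gb, c - lr * gc
    return a, b, c


# Unperturbed: the y -> output path has zero weights, so its gradient vanishes.
dead = train(0.0, 0.0, 0.5, DATA)

# Perturbed: path weights set to a small epsilon, so gradient can flow.
eps = 0.01
perturbed = train(eps, eps, 0.5, DATA)

print("unperturbed (a, b, c):", dead)
print("perturbed   (a, b, c):", perturbed)
```

In this toy, the unperturbed run keeps the product `a*b` at zero, while the perturbed run drives `a*b` toward 1 and `c` toward 0, i.e. the model ends up computing approximately `y`. That is the sense in which a small perturbation along the connection is enough for training to undo the intended behavior.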