I always thought there ought to be a good way to explain neural networks starting from backprop, since to some extent the criterion for architecture selection always seems to be whether or not the gradients will explode too easily.
As in, I feel like the correct definition of neural networks in the current climate is closer to: “A computation graph structured in such a way that some optimizer/loss function combination will make it adjust its parameters toward a seemingly reasonable minimum.”
Because in practice that’s what seems to define good architecture design.
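To make that framing concrete, here is a minimal sketch of the “computation graph + optimizer/loss” view, using PyTorch purely as an illustration; the layer sizes, data, and learning rate are arbitrary placeholders, not anything prescribed by the argument above.

```python
# Minimal sketch: a computation graph whose parameters get nudged toward a
# seemingly reasonable minimum by an optimizer/loss pair via backprop.
# (Illustrative only; shapes, data, and hyperparameters are arbitrary.)
import torch

# Toy regression data the graph just has to fit.
x = torch.randn(64, 10)
y = torch.randn(64, 1)

# The "architecture" is just a small computation graph: two linear maps with
# a nonlinearity in between, shallow enough that gradients don't blow up.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),
    torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # backprop: gradients of the loss w.r.t. every parameter
    optimizer.step()  # adjust the parameters toward a (hopefully reasonable) minimum
```

The point of the sketch is that nothing about the architecture matters except that, paired with this loss and optimizer, gradients flow through it well enough for the parameters to settle somewhere sensible.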