I think this is quite strong evidence that I was not taught the correct meaning of “vanishing gradients.”
I’m very confused. The way I’m reading the quote you provided, it says ReLU works better because it doesn’t have the vanishing-gradient effect that sigmoid and tanh have.
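To make that concrete, here is a minimal sketch (not from the quote; the pre-activation value 0.5 and the depth of 20 are arbitrary assumptions): sigmoid’s derivative never exceeds 0.25, so backprop multiplies in one such factor per layer and the gradient shrinks geometrically with depth, whereas ReLU’s derivative is exactly 1 on its active side.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # peaks at 0.25 (when x = 0)

def relu_grad(x):
    return (x > 0).astype(float)  # 1 for positive inputs, 0 otherwise

depth = 20                 # hypothetical network depth
x = np.array(0.5)          # hypothetical pre-activation seen at each layer

# Ignore the weights and look only at the activation-derivative factor
# that backprop multiplies in once per layer.
print("sigmoid factor per layer:", sigmoid_grad(x))        # ~0.235
print("after 20 layers:", sigmoid_grad(x) ** depth)        # ~3e-13, effectively vanished
print("ReLU factor per layer:", relu_grad(x))               # 1.0
print("after 20 layers:", relu_grad(x) ** depth)             # still 1.0
```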
Interesting. I just re-read it and you are completely right. Well, I wonder how that interacts with what I said above.