jacob_cannell answers: Why does gradient descent always work on neural networks?