Rohin Shah comments on [AN #98]: Understanding neural net training by seeing which gradients were helpful