And to top that off, they found that even in networks where they artificially increased ICS, performance barely suffered.
All networks, or just ones with batch normalization?
That’s a good point of clarification, and it perhaps weakens the point I was making there. From the paper:

“adding the same amount of noise to the activations of the standard (non-BatchNorm) network prevents it from training entirely.”
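For concreteness, here is a minimal sketch of the kind of noise injection the paper uses to artificially induce ICS: time-varying noise with nonzero mean and non-unit variance, resampled at every training step and added to the activations after each BatchNorm layer. This is written in PyTorch; the architecture, layer sizes, and noise magnitudes are illustrative assumptions, not the paper’s exact setup.

```python
import torch
import torch.nn as nn

class NoisyBatchNorm1d(nn.Module):
    """BatchNorm followed by time-varying random noise, to artificially
    re-introduce distributional shift ("ICS") into the activations.
    The noise magnitudes here are illustrative assumptions."""

    def __init__(self, num_features, scale_std=1.0, shift_std=0.5):
        super().__init__()
        self.bn = nn.BatchNorm1d(num_features)
        self.scale_std = scale_std
        self.shift_std = shift_std

    def forward(self, x):
        x = self.bn(x)
        if self.training:
            # Resample a random per-feature scale and shift on every forward
            # pass, so the activation distribution changes from step to step.
            scale = 1.0 + self.scale_std * torch.randn(1, x.size(1), device=x.device)
            shift = self.shift_std * torch.randn(1, x.size(1), device=x.device)
            x = x * scale + shift
        return x

# A hypothetical small fully-connected net using the noisy layer. Applying
# the same noise without the BatchNorm layer is the "standard network"
# comparison the quote refers to.
model = nn.Sequential(
    nn.Linear(784, 128),
    NoisyBatchNorm1d(128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
```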