activatedgeek comments on Understanding “Deep Double Descent”

activatedgeek 29 Aug 2020 16:48 UTC
LW: 5 AF: 3
I want to point out some recent work by Andrew Gordon Wilson’s group: https://cims.nyu.edu/~andrewgw/#papers.
In particular, https://arxiv.org/abs/2003.02139 looks at double descent from the perspective that parameter count is a poor proxy for model complexity/capacity. They argue that effective dimensionality is what we should be plotting against instead. Relatedly, https://arxiv.org/abs/2002.08791 reports that double descent effectively vanishes when we use Bayesian model averaging instead of point estimates.
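To make the "parameters are a bad proxy" point concrete, here is a minimal sketch of effective dimensionality as I understand the definition in the first paper: N_eff(H, z) = sum_i λ_i / (λ_i + z), where the λ_i are eigenvalues of the loss Hessian and z > 0 is a regularization constant. The function name, the choice z = 1, and both eigenvalue spectra below are my own assumptions for illustration, not numbers from either paper, and the sketch assumes non-negative eigenvalues.

```python
def effective_dimensionality(eigenvalues, z=1.0):
    """Sum of lambda_i / (lambda_i + z) over the given Hessian eigenvalues.

    Assumes the eigenvalues are non-negative; z > 0 is a regularization constant.
    """
    if z <= 0:
        raise ValueError("z must be positive")
    return sum(lam / (lam + z) for lam in eigenvalues)


# Hypothetical spectra, made up for illustration (not taken from the papers).
# The large model has 1000 parameters, but only 3 directions with real curvature.
small_model = [50.0, 20.0, 5.0]
large_model = [50.0, 20.0, 5.0] + [1e-4] * 997

for name, spectrum in [("small", small_model), ("large", large_model)]:
    n_eff = effective_dimensionality(spectrum)
    print(f"{name}: parameters={len(spectrum)}, effective dimensionality={n_eff:.3f}")
```

Each direction contributes close to 1 when λ_i is much larger than z and close to 0 when it is much smaller, so the two models above end up with nearly the same effective dimensionality despite a large gap in parameter count.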