evhub comments on Understanding “Deep Double Descent”

evhub 6 Dec 2019 19:37 UTC
LW: 4 AF: 2

I wonder if this is a neural network thing, an SGD thing, or a both thing?

Neither, actually—it’s more general than that. Belkin et al. show that it happens even for simple models like decision trees. Also see here for an example with polynomial regression.
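For concreteness, here is a minimal sketch of what the polynomial regression case can look like. It is not taken from Belkin et al. or from the linked example: the function name `double_descent_curve`, the sine target, the Legendre basis, the sample sizes, and the use of the minimum-norm least-squares solution past the interpolation threshold are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial.legendre import legvander


def double_descent_curve(n_train=15, max_degree=60, noise=0.1, seed=0):
    """Fit polynomials of increasing degree; return (degrees, train MSE, test MSE).

    Illustrative setup only: the target, noise level and basis are assumptions.
    """
    rng = np.random.default_rng(seed)
    target = lambda x: np.sin(np.pi * x)  # hypothetical ground-truth function

    x_train = rng.uniform(-1.0, 1.0, n_train)
    y_train = target(x_train) + noise * rng.standard_normal(n_train)
    x_test = np.linspace(-1.0, 1.0, 200)
    y_test = target(x_test)

    degrees = np.arange(1, max_degree + 1)
    train_mse, test_mse = [], []
    for d in degrees:
        # Legendre feature matrix with d + 1 columns.
        A = legvander(x_train, d)
        # lstsq returns the minimum-norm solution once d + 1 > n_train,
        # i.e. once the model can interpolate the training data.
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        train_mse.append(np.mean((A @ coef - y_train) ** 2))
        test_mse.append(np.mean((legvander(x_test, d) @ coef - y_test) ** 2))

    return degrees, np.array(train_mse), np.array(test_mse)


if __name__ == "__main__":
    degrees, train_mse, test_mse = double_descent_curve()
    for d, tr, te in zip(degrees, train_mse, test_mse):
        print(f"degree={d:3d}  train_mse={tr:.3e}  test_mse={te:.3e}")
```

The thing to look at is test error as a function of degree around the interpolation threshold (degree + 1 equal to the number of training points), where training error drops to roughly zero. Whether and how sharply test error peaks there and then falls again depends on the seed, the noise level, and the basis, so treat any single run as suggestive rather than conclusive.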

Are you aware of this work and the papers they cite?

Yeah, I am. I definitely think that stuff is good, though ideally I want something more than just “approximately K-complexity.”