What double descent definitely says is that for a fixed dataset, larger models with zero training error are simpler than smaller models with zero training error. I think it also says somewhat more than that: larger models have a real tendency to be better at finding simple models in general. That being said, the dataset on which the concept of a dog in your head was trained is presumably way larger than that of any ML model, so even if your brain is really good at implementing Occam's razor and finding simple models, your model is still probably going to be more complicated.