Lucius Bushnaq comments on My Criticism of Singular Learning Theory

Lucius Bushnaq 20 Nov 2023 6:41 UTC
7 points
IIRC this is probably the case for a broad range of non-NN models. I think the original Double Descent paper showed it for random Fourier features.
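
A minimal sketch of how one might check this on a toy problem, assuming a synthetic regression task rather than the setup from the paper. It fits minimum-norm least squares on random Fourier features and sweeps the number of features past the number of training points. The function names (`rff`, `sweep`), the target function, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np


def rff(X, W, b):
    """Random Fourier feature map: cos(X W + b)."""
    return np.cos(X @ W + b)


def sweep(n_train=40, n_test=500, d=10,
          widths=(5, 10, 20, 40, 80, 400), noise=0.1, seed=0):
    """Return (n_features, train_mse, test_mse) for each width.

    Assumption: a toy target sin(<w_true, x> / sqrt(d)) with Gaussian inputs.
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)

    def target(X):
        return np.sin(X @ w_true / np.sqrt(d))

    X_tr = rng.normal(size=(n_train, d))
    X_te = rng.normal(size=(n_test, d))
    y_tr = target(X_tr) + noise * rng.normal(size=n_train)  # noisy labels
    y_te = target(X_te)                                      # clean test labels

    results = []
    for n_feat in widths:
        # Fresh random frequencies and phases for each model size.
        W = rng.normal(size=(d, n_feat))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_feat)
        Phi_tr, Phi_te = rff(X_tr, W, b), rff(X_te, W, b)

        # lstsq returns the minimum-norm solution when the system is
        # underdetermined (n_feat > n_train), i.e. the interpolating regime.
        coef, *_ = np.linalg.lstsq(Phi_tr, y_tr, rcond=None)

        train_mse = float(np.mean((Phi_tr @ coef - y_tr) ** 2))
        test_mse = float(np.mean((Phi_te @ coef - y_te) ** 2))
        results.append((n_feat, train_mse, test_mse))
    return results


if __name__ == "__main__":
    for n_feat, train_mse, test_mse in sweep():
        print(f"features={n_feat:4d}  train={train_mse:.3e}  test={test_mse:.3e}")
```

The interpolation threshold here sits at `n_feat == n_train`. Whether a clear test-error peak shows up around it depends on the seed, the noise level, and the target, so averaging over several seeds is probably needed before reading much into the curve.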

My current guess is that NN architectures are just especially affected by this, because they have even more degenerate behavioral manifolds, with RLCTs ranging very widely from tiny to large.