On its face, this sounds somewhat crazy—how can a model with more parameters be simpler? But in fact I think this is just a very straightforward consequence of double descent
One man’s modus ponens is another man’s modus tollens: while you seem to go from “double descent is real” to “larger models are simpler”, I go from “larger models are more complex” to “something crazy is happening with double descent”.
Possible crux: I think the empirical evidence justifies “double descent is a real effect that occurs in some situations”, but not “double descent is clearly real and happens in the vast majority of realistic settings”.