IIRC @jake_mendel and @Kaarel have thought about this more, but my rough recollection is: a simple story about the regularization seems sufficient to explain the training dynamics, so a fancier SLT story isn’t obviously necessary. My guess is that there’s probably something interesting you could say using SLT, but nothing that simpler arguments about the regularization wouldn’t also tell you. But I haven’t thought about this enough.