I would be interested in what the current SLT dogma on grokking is. I get asked all the time whether SLT explains grokking, but I always have to reply with an unsatisfying ‘there’s probably something there, but I don’t understand the details’.
@Zach Furman @Jesse Hoogland
IIRC @jake_mendel and @Kaarel have thought about this more, but my rough recollection is: a simple story about the regularization seems sufficient to explain the training dynamics, so a fancier SLT story isn’t obviously necessary. My guess is that there’s probably something interesting you could say using SLT, but nothing that simpler arguments about the regularization wouldn’t also tell you. That said, I haven’t thought about this enough.