This post (which is really dope) points to some grokking examples in large language models in a Big-Bench video at 19313s & 19458s; that whole segment (18430s-19650s) is a nice watch! I shall spend a bit more time collecting and precisely identifying the evidence and then include it in the grokking part of this post. This was a really nice thing to learn about and very surprising.
I’ve commented on that, but I’m not convinced that the phase transitions in learning are grokking, per se. There are many different scaling phenomena, and we shouldn’t go around prematurely conflating them.
There’s a lot of physics-related nonlinearities/phase-transitions/powerlaw material in these workshop slides & videos, looks like: https://sites.google.com/mila.quebec/scaling-laws-workshop/schedule