aysja comments on A Mechanistic Interpretability Analysis of Grokking

aysja 25 Aug 2022 7:11 UTC
10 points
I love this work! It’s really cool to see interpretability on toy models in such a clear way.

The trend from memorization to generalization reminds me of the information bottleneck idea. I don’t know that much about it (I read this Quanta article a while ago), but they appear to be making a similar claim about phase transitions. I believe this is the paper one would want to read to get a deeper understanding of it.