Seems like the natural next step would be to investigate grokking, as this appears analogous: you have a model which has memorized or learned a grab-bag of heuristics & regularities, but as far as you can tell, the algorithmic core eludes the model despite what seems like ample parameterization & data, perhaps because it is a wide shallow model. So one could try training a skinny (narrow, deep) net instead, and perhaps aggressively subsampling the training data down to a maximally diverse subset. If it groks, then one should be able to read off much more interpretable algorithmic sub-models.
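One minimal sketch of the "maximally diverse subset" step: greedy farthest-point sampling over some feature embedding of the training examples. This is just one plausible diversity heuristic, not a prescription — the embedding source, the Euclidean metric, and the function name here are all assumptions:

```python
import numpy as np

def diverse_subset(X, k, seed=0):
    """Greedy farthest-point sampling (hypothetical helper): pick k rows
    of X that are maximally spread out in embedding space, as a simple
    proxy for a 'maximally diverse' training subset."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    chosen = [int(rng.integers(n))]          # arbitrary starting point
    # distance from every point to its nearest already-chosen point
    d = np.linalg.norm(X - X[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))              # farthest from current subset
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)
```

One would then train the skinny net only on `X[diverse_subset(X, k)]`, watching validation accuracy for the delayed phase transition characteristic of grokking.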