One of the talks at ILIAD had a set of PCA plots where PC2 turned around at different points for different training setups. I think the turning point corresponded to when the model started to overfit, but I don’t quite remember. Whatever the meaning of the turning point was, I think they also verified it with some other observation. Given that this was ILIAD, the other observation was probably the LLC.
If you want to look it up, I can try to find the talk among the recordings.
The paper you’re thinking of is probably The Developmental Landscape of In-Context Learning.
It looks related, but these are not the plots I remember from the talk.