Chris_Leong comments on You’re Measuring Model Complexity Wrong

Chris_Leong 12 Oct 2023 10:39 UTC
LW: 3 AF: 2
0
AF
Do phase transitions actually show up? So far, the places where theoretically predicted phase transitions are easiest to confirm are simplified settings like deep linear networks and toy models of superposition. For larger models, we expect phase transitions to be common but “hidden.” Among our immediate priorities are testing just how common these transitions are and whether we can detect hidden transitions.

What do you mean by “hidden”?
- Daniel Murfet 12 Oct 2023 21:06 UTC
  10 points
  2
  Not easily detected. As in, there might be a sudden (in SGD steps) change in the internal structure of the network over training that is not easily visible in the loss or other metrics that you would normally track. If you think of the loss as an average over performance on many thousands of subtasks, a change in internal structure (e.g. a circuit appearing in a phase transition) relevant to one task may not change the loss much.
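
A rough numerical sketch of the averaging argument above, not taken from the post or the comment. All numbers are assumptions chosen for illustration: 10,000 subtasks whose losses decay smoothly at the same rate, and a single subtask (task 0) whose loss suddenly falls by 10x at step 50, standing in for a circuit appearing. The function name `relative_drops` and its parameters are hypothetical.

```python
import math


def relative_drops(num_tasks=10_000, jump_step=50, timescale=200.0, jump_factor=0.1):
    """Compare one-step relative loss drops for a single task and for the mean.

    Returns (task-0 drop at the jump, mean-loss drop at the jump,
    mean-loss drop at the ordinary step just before the jump).
    """

    def task_loss(task, step):
        # Assumed smooth improvement shared by every subtask.
        loss = math.exp(-step / timescale)
        # Assumed sudden change that affects task 0 only.
        if task == 0 and step >= jump_step:
            loss *= jump_factor
        return loss

    def mean_loss(step):
        # The tracked loss: an average over all subtasks.
        return sum(task_loss(t, step) for t in range(num_tasks)) / num_tasks

    def rel_drop(before, after):
        return (before - after) / before

    task_drop = rel_drop(task_loss(0, jump_step - 1), task_loss(0, jump_step))
    mean_drop = rel_drop(mean_loss(jump_step - 1), mean_loss(jump_step))
    ordinary_drop = rel_drop(mean_loss(jump_step - 2), mean_loss(jump_step - 1))
    return task_drop, mean_drop, ordinary_drop


task_drop, mean_drop, ordinary_drop = relative_drops()
print(f"task 0 loss drop at the jump:     {task_drop:.1%}")
print(f"mean loss drop at the jump:       {mean_drop:.3%}")
print(f"mean loss drop one step earlier:  {ordinary_drop:.3%}")
```

Under these assumptions the loss on task 0 falls by roughly 90% in a single step, while the one-step drop in the averaged loss is nearly the same as at an ordinary step, so the change would be hard to spot in the loss curve alone.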