What are your thoughts on KL-div after the unembed softmax as a metric?
On its own, this’d be another metric that doesn’t track the right scale as models become more powerful.
The same KL-div in GPT-2 and GPT-4 probably corresponds to the destruction of far more internal structure in the latter than in the former.
Destroy 95% of GPT-2’s circuits, and the resulting output distribution may look quite different. Destroy 95% of GPT-4’s circuits, and the resulting output distribution may not be all that different, since the remaining 5% of GPT-4’s circuits might still be enough to get a lot of the most common token-prediction cases roughly right.
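(For concreteness, a minimal sketch of the metric in question: the KL divergence between the intact and ablated models’ next-token distributions, taken after the unembed softmax and averaged over positions. The names `logits_full` and `logits_ablated`, and the ablation itself, are illustrative assumptions rather than anything specified in this thread.)

```python
import torch.nn.functional as F

def kl_after_softmax(logits_full, logits_ablated):
    # Per-position KL(p_full || p_ablated) over the vocabulary,
    # computed after the unembed softmax, then averaged over positions.
    # Both inputs are pre-softmax logits of shape (..., vocab_size).
    logp_full = F.log_softmax(logits_full, dim=-1)
    logp_ablated = F.log_softmax(logits_ablated, dim=-1)
    kl = (logp_full.exp() * (logp_full - logp_ablated)).sum(dim=-1)
    return kl.mean()
```

Averaged over a corpus, this yields the single scalar whose scale, per the point above, won’t stay comparable across model sizes.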
I don’t see important differences between that and the CE loss delta in the context Lucius is describing.
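One way to see why the two come apart so little: both metrics are expectations of the same log-probability gap, log p_full(v) − log p_ablated(v). The KL above takes that expectation under the intact model’s own output distribution, while the CE loss delta takes it at the realized next token from the data. A sketch under the same (assumed) names as above:

```python
import torch.nn.functional as F

def ce_loss_delta(logits_full, logits_ablated, targets):
    # Increase in cross-entropy loss on the true next tokens after ablation:
    # E_data[log p_full(y) - log p_ablated(y)].
    # logits: (batch, seq, vocab); targets: (batch, seq) token ids.
    ce_full = F.cross_entropy(logits_full.flatten(0, -2), targets.flatten())
    ce_ablated = F.cross_entropy(logits_ablated.flatten(0, -2), targets.flatten())
    return ce_ablated - ce_full
```

On in-distribution text, where the intact model already puts substantial probability on the realized tokens, the two expectations tend to move together, which is the sense in which they measure the same thing here.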