ryan_greenblatt comments on Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small

ryan_greenblatt Feb 10, 2024, 12:30 AM
3 points
I think the combination of learned positional embeddings and training on only short sequences is likely to be the issue. Changing either one should suffice.
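To make the intuition concrete, here is a toy sketch (not the actual SAE training setup). It assumes a randomly initialised table `W_pos` as a stand-in for learned positional embeddings and a hypothetical training context of 128 tokens; only the GPT2-Small dimensions (768-d residual stream, 1024 positions) are real. It checks how much of each positional vector lies outside the directions spanned by positions that would have been seen during training. Real learned embeddings may be far more structured than random vectors, so the size of the effect here is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# GPT2-Small sizes; the training context length is an assumed value.
d_model, n_ctx_model, n_ctx_train = 768, 1024, 128

# Stand-in for a learned positional embedding table (random, NOT real weights).
W_pos = rng.standard_normal((n_ctx_model, d_model))

# Orthonormal basis for the directions spanned by positions seen in training.
_, _, Vt = np.linalg.svd(W_pos[:n_ctx_train], full_matrices=False)

def unexplained_fraction(rows: np.ndarray) -> float:
    """Fraction of squared norm lying outside the 'seen positions' subspace."""
    proj = rows @ Vt.T @ Vt
    return float(np.sum((rows - proj) ** 2) / np.sum(rows ** 2))

seen = unexplained_fraction(W_pos[:n_ctx_train])
unseen = unexplained_fraction(W_pos[n_ctx_train:])

print(f"seen positions:   {seen:.3f}")    # essentially zero
print(f"unseen positions: {unseen:.3f}")  # most of the norm is unexplained
```

In this toy, anything fit only on the first 128 positions has no information about the positional directions used later in the context, which is the sense in which either training on longer sequences or avoiding a per-position learned table could remove the problem.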
- Joseph Bloom Feb 10, 2024, 1:12 AM
  1 point
  Makes sense. Will set off some runs with longer context sizes and track this in the future.