@Connor Kissane @Neel Nanda Do SAEs work on the MLP blocks of GPT2-small as well? I find the recovery rate to be significantly lower (~40%) for MLP activations of larger models like GPT2-small.
We’ve found slightly worse results for MLPs, but nowhere near 40%; I expect you’re training your SAEs badly. What exact metric equals 40% here?
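For context, one common definition of the "loss recovered" metric (an assumption here; the thread doesn't pin down which metric the 40% refers to, which is exactly what the reply is asking) compares the model's cross-entropy loss with the SAE reconstruction spliced in against clean and zero-ablation baselines. A minimal sketch:

```python
def loss_recovered(clean_loss: float, sae_loss: float, ablated_loss: float) -> float:
    """Fraction of CE loss recovered when the SAE reconstruction replaces
    the original activation, relative to zero-ablating that activation.

    1.0 = reconstruction is as good as the original activation;
    0.0 = no better than zero-ablation.
    """
    return (ablated_loss - sae_loss) / (ablated_loss - clean_loss)

# Hypothetical numbers for illustration only:
# clean loss 3.0, loss with SAE spliced in 3.9, zero-ablated loss 4.5
print(loss_recovered(3.0, 3.9, 4.5))  # → 0.4, i.e. 40% recovered
```

Note that other papers report related but different quantities (e.g. variance explained of the activations themselves), so pinning down which one yields the 40% matters for interpreting it.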