Dmitrii Kharlapenko

Karma: 111

Dmitrii Kharlapenko Aug 6, 2024, 1:56 PM
2 points
0
in reply to: Clément Dumas’s comment on: Self-explaining SAE features
By "input features", do you mean the SAE encoder weights? We did not look into them.

Dmitrii Kharlapenko Aug 6, 2024, 1:55 PM
4 points
0
in reply to: Clément Dumas’s comment on: Self-explaining SAE features
Thanks! We did try to use it in the repeat setting to make the model produce more than a single token, but it did not work well.

And as far as I remember, it also did not improve the meaning prompt much.