Lucius Bushnaq comments on Lucius Bushnaq’s Shortform

Lucius Bushnaq 6 Sep 2024 7:19 UTC
2 points
2
I’ve seen a little bit of this, but nowhere near as much as I think the topic merits. I agree that systematic studies on where and how the reconstruction errors make their effects known might be quite informative.
Basically, whenever people train SAEs, or use some other approximate model decomposition that degrades performance, I think they should ideally spend some time afterwards just playing with the degraded model and talking to it, to figure out in what ways it is worse.
- Sodium 6 Sep 2024 16:26 UTC
  1 point
  0
  Hmmm ok maybe I’ll take a look at this :)