Yes, this is a good consideration. I think KL as a metric makes a good tradeoff here by mostly ignoring changes to tokens the original model treated as low probability (as opposed to measuring something more cursed like log prob L2 distance), and so it captures the more interesting differences.
This motivates having good baselines to determine what this noise floor should be.
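To make the low-probability point concrete, here is a minimal numpy sketch with made-up toy distributions (not from any real model): it perturbs either the low-probability tail or the high-probability head of a synthetic next-token distribution and compares KL against log prob L2 distance. Under these assumptions, the tail-only change tends to produce a large L2 distance but a tiny KL, while the head change moves KL much more.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    p = np.exp(z)
    return p / p.sum()

def kl(p, q):
    # KL(p || q) over a next-token distribution, in nats.
    return float(np.sum(p * (np.log(p) - np.log(q))))

def logprob_l2(p, q):
    # L2 distance between the log-prob vectors: weights every token
    # equally, so it can be dominated by the low-probability tail.
    return float(np.linalg.norm(np.log(p) - np.log(q)))

rng = np.random.default_rng(0)

# Toy "original" next-token distribution over a 1000-token vocab:
# a few high-probability tokens and a long low-probability tail.
logits = rng.normal(size=1000)
logits[:5] += 12.0
p = softmax(logits)

# Perturbation A: shift the log-probs of the low-probability tail only.
logits_tail = logits.copy()
logits_tail[5:] += rng.normal(scale=1.0, size=995)
q_tail = softmax(logits_tail)

# Perturbation B: a modest shift on the tokens the model actually uses.
logits_head = logits.copy()
logits_head[:5] += rng.normal(scale=0.5, size=5)
q_head = softmax(logits_head)

for name, q in [("tail-only change", q_tail), ("head change", q_head)]:
    print(f"{name}:  KL = {kl(p, q):.4f}   log-prob L2 = {logprob_l2(p, q):.1f}")

# Expected qualitative result: the tail-only change has a much larger
# log-prob L2 distance but a far smaller KL than the head change, i.e.
# KL mostly ignores tokens the original model treated as unlikely.
```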