Could this be explained by SAEs only finding a subset of the features, so that the reconstructions entirely omit some features, whereas random noise is unstructured and therefore mostly ignored by the downstream computation?
Yup! I think something like this is probably going on. I blamed this on L1, but it could also be some other learning or architectural failure (e.g., not enough capacity):
Some features are dense (or groupwise dense, i.e., they frequently co-occur). Due to the L1 penalty, some of these dense features are not represented. However, for the KL divergence it ends up being better to noisily represent all the features than to accurately represent only a fraction of them.
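
Here's a minimal toy sketch (mine, not from the thread) of the intuition: assume a linear softmax head `W_U` whose first few columns read off specific feature directions. Deleting a feature direction from an activation then concentrates damage on the logits that read it, while isotropic noise of the same L2 norm spreads across the whole vocabulary and barely moves the KL. All names (`W_U`, `features`, `h`) and scales are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_vocab, n_feat = 64, 1000, 5

# Toy "features": random unit directions in activation space.
features = rng.normal(size=(n_feat, d_model))
features /= np.linalg.norm(features, axis=1, keepdims=True)

# Toy unembedding whose first n_feat columns read off the features,
# so these are the directions the downstream computation cares about.
W_U = rng.normal(size=(d_model, n_vocab)) / np.sqrt(d_model)
W_U[:, :n_feat] = 8.0 * features.T

h = features.sum(axis=0)  # clean activation: all features active

def kl_from_clean(h_hat):
    """KL(p_clean || p_hat) for the toy softmax head."""
    def log_softmax(x):
        x = x - x.max()
        return x - np.log(np.exp(x).sum())
    log_p, log_q = log_softmax(h @ W_U), log_softmax(h_hat @ W_U)
    return float(np.exp(log_p) @ (log_p - log_q))

# "SAE never learned this feature": delete one direction entirely.
h_missing = h - features[0]
err = np.linalg.norm(h - h_missing)

# Random noise with the same L2 reconstruction error.
noise = rng.normal(size=d_model)
h_noisy = h + err * noise / np.linalg.norm(noise)

print(f"KL, feature deleted: {kl_from_clean(h_missing):.3f}")
print(f"KL, matched noise:   {kl_from_clean(h_noisy):.3f}")
```

In this toy setup the deleted feature wipes out the probability mass on the token that reads it, while the matched-norm noise perturbation is mostly orthogonal to every readout direction, so its KL stays much smaller at identical MSE.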