Hi, any idea how this would compare to just replacing the L1 loss with a smoothed L0 loss function? Something like $\sum_i \log(1 + a|x_i|)$ (summed across the sparse representation).
We found that exactly that form of sparsity penalty did reduce shrinkage with standard (ungated) SAEs and provided a decent boost to loss recovered at low L0. (We didn’t evaluate interpretability, though.) But then we hit upon Gated SAEs, which looked even better and for which modifying the sparsity penalty in this way feels less necessary, so we haven’t experimented with combining the two.
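For concreteness, here is a minimal sketch of the smoothed-L0 penalty the question describes, next to the standard L1 penalty, in PyTorch. The function names, the default value of `a`, and the batch/dictionary sizes are illustrative assumptions, not anything from the original exchange.

```python
import torch

def smoothed_l0_penalty(codes: torch.Tensor, a: float = 10.0) -> torch.Tensor:
    """Smoothed L0 penalty: sum_i log(1 + a * |x_i|), averaged over the batch.

    Larger `a` makes the penalty saturate faster for large activations,
    so it behaves more like an L0 count and pushes down on nonzero
    entries less than L1 does (less shrinkage). The default of 10.0 is
    an arbitrary illustrative choice.
    """
    return torch.log1p(a * codes.abs()).sum(dim=-1).mean()

def l1_penalty(codes: torch.Tensor) -> torch.Tensor:
    """Standard L1 sparsity penalty, for comparison."""
    return codes.abs().sum(dim=-1).mean()

# Hypothetical usage on a batch of SAE feature activations
# (batch of 32, dictionary size 1024).
codes = torch.relu(torch.randn(32, 1024))
sparsity_loss = smoothed_l0_penalty(codes, a=10.0)
```

In an SAE training loop this term would simply replace the L1 term in the loss, weighted by the usual sparsity coefficient.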