Glen Taggart comments on Normalizing Sparse Autoencoders

Glen Taggart 11 Apr 2024 23:14 UTC
3 points
0
I want to mention that, in my experience, a factor-of-2 difference in L0 makes a pretty huge difference in reconstruction score/L2 norm. IMO, if you want to compare two architectures, you should ideally either compare Pareto curves for each architecture or get two datapoints that have almost exactly the same L0.
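
To make the two comparison styles concrete, here is a rough sketch, not anything taken from the post's code. It assumes each training run is summarized as a `(mean L0, reconstruction error)` pair where lower is better for both; the helper names `mean_l0`, `pareto_front`, and `matched_l0_pairs` and the 5% tolerance are hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple

# One run summarized as (mean L0, reconstruction error); lower is better for both.
Point = Tuple[float, float]


def mean_l0(activations: Sequence[Sequence[float]]) -> float:
    """Average number of nonzero latents per input."""
    nonzero_counts = [sum(1 for a in row if a != 0) for row in activations]
    return sum(nonzero_counts) / len(activations)


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep the runs that no other run beats on both L0 and error."""
    front: List[Point] = []
    best_err = float("inf")
    # Sorted by L0 ascending, so a run survives only if it lowers the error
    # below everything seen at smaller (or equal) L0.
    for l0, err in sorted(points):
        if err < best_err:
            front.append((l0, err))
            best_err = err
    return front


def matched_l0_pairs(
    runs_a: Sequence[Point], runs_b: Sequence[Point], rel_tol: float = 0.05
) -> List[Tuple[Point, Point]]:
    """Pair runs from two architectures whose L0 differs by at most rel_tol (relative)."""
    return [
        (pa, pb)
        for pa in runs_a
        for pb in runs_b
        if abs(pa[0] - pb[0]) <= rel_tol * max(pa[0], pb[0])
    ]
```

With something like this, you would either plot `pareto_front(...)` for each architecture on the same axes, or only compare errors within the pairs returned by `matched_l0_pairs(...)`, so that a gap in L0 is not mistaken for a gap between architectures.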
- Fengyuan Hu 12 Apr 2024 1:32 UTC
  1 point
  0
  The additional experiment under Experiment-Performance Verification (Figure 11) compares normalized_1 and baseline_1 on layer 5, which have almost identical $L_{0}$. The result showed no observable difference.