Good point. Firstly, the mean L0 of the experiment and the baseline is within a factor of 2, so they're in a reasonably close range. I also added a new set of figures comparing the reconstruction scores for the layer where the experiment and baseline runs have the closest match on L0. Spoiler: the scores are still almost the same at the end of training. You can find it under Experiments-Performance Validation.
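For reference, the layer was picked roughly along these lines (a minimal sketch; the mean-L0 numbers below are placeholders, not the actual values from the runs):

```python
# Sketch of picking the layer with the closest L0 match between runs.
# The mean-L0 values are illustrative placeholders, not real run data.
experiment_l0 = {3: 42.0, 5: 55.0, 9: 61.0}   # layer -> mean L0 (experiment)
baseline_l0   = {3: 70.0, 5: 57.0, 9: 95.0}   # layer -> mean L0 (baseline)

# Choose the layer where the two runs have the most similar mean L0,
# so reconstruction scores are compared at (roughly) matched sparsity.
closest_layer = min(experiment_l0,
                    key=lambda k: abs(experiment_l0[k] - baseline_l0[k]))
print(closest_layer)  # -> 5 with these placeholder numbers
```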
I want to mention that, in my experience, a factor-of-2 difference in L0 makes a pretty huge difference in reconstruction score/L2 norm. IMO, if you want to compare two architectures, you should ideally compare Pareto curves for each architecture, or at least get two datapoints with almost exactly the same L0.
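Something like this is what I mean by comparing Pareto curves (just a sketch with made-up numbers: sweep the sparsity coefficient for each architecture and plot L0 against reconstruction score):

```python
# Rough sketch of a Pareto-curve comparison between two SAE architectures.
# Each point is (mean L0, reconstruction score) from one sparsity-coefficient
# setting; the numbers below are illustrative, not from any actual run.
import matplotlib.pyplot as plt

arch_a = [(20, 0.82), (40, 0.90), (80, 0.95), (160, 0.97)]
arch_b = [(25, 0.85), (50, 0.92), (100, 0.96), (200, 0.98)]

for name, points in [("arch_a", arch_a), ("arch_b", arch_b)]:
    l0s, scores = zip(*points)
    plt.plot(l0s, scores, marker="o", label=name)

plt.xlabel("mean L0")
plt.ylabel("reconstruction score")
plt.legend()
plt.show()
```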
The additional experiment under Experiment-Performance Verification (Figure 11) compares normalized_1 and baseline_1 on layer 5, which have almost identical L0. The result shows no observable difference.