Logan Riggs comments on Normalizing Sparse Autoencoders

Logan Riggs 10 Apr 2024 14:32 UTC
3 points
0
For comparing CE-difference (or the mean reconstruction score), did these have similar L0s? If not, it’s an unfair comparison (higher L0 usually means higher reconstruction accuracy).
- Fengyuan Hu 10 Apr 2024 16:54 UTC
  1 point
  0
  Good point. First, the mean L0 of the experiment and of the baseline are within a factor of 2 of each other, so they are in a reasonably close range. I also added a new set of figures comparing the reconstruction scores for the one layer whose L0 is most closely matched between the experiment and baseline groups. Spoiler: the scores are still almost the same at the end of training. You can find it under Experiments-Performance Validation.
  - Glen Taggart 11 Apr 2024 23:14 UTC
    3 points
    0
    I want to mention that, in my experience, a factor of 2 difference in L0 makes a pretty huge difference in reconstruction score/L2 norm. IMO, if you want to compare two architectures, you should ideally compare Pareto curves for each architecture, or get two datapoints that have almost exactly the same L0.
    - Fengyuan Hu 12 Apr 2024 1:32 UTC
      1 point
      0
      The additional experiment under Experiment-Performance Verification (Figure 11) compares normalized_1 and baseline_1 on layer 5, which have almost identical $L_{0}$. The result shows no observable difference.
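
For readers who want to try the two comparison strategies suggested in this thread (matching runs on L0, or comparing Pareto curves), here is a minimal sketch in plain Python. It is not the code used in the post. The function names, the `Run` tuple layout, and the convention that L0 is the mean count of nonzero feature activations per input are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout: (run name, mean L0, reconstruction score).
Run = Tuple[str, float, float]


def mean_l0(activations: Sequence[Sequence[float]]) -> float:
    """Average number of nonzero feature activations per input (assumed L0 definition)."""
    counts = [sum(1 for a in row if a != 0.0) for row in activations]
    return sum(counts) / len(counts)


def closest_l0_pair(group_a: List[Run], group_b: List[Run]) -> Tuple[Run, Run]:
    """Return one run from each group such that their mean L0 values are as close as possible."""
    return min(
        ((a, b) for a in group_a for b in group_b),
        key=lambda pair: abs(pair[0][1] - pair[1][1]),
    )


def pareto_front(runs: List[Run]) -> List[Run]:
    """Keep runs that no other run beats on both axes (lower L0, higher score)."""
    front = []
    for r in runs:
        dominated = any(
            o[1] <= r[1] and o[2] >= r[2] and (o[1] < r[1] or o[2] > r[2])
            for o in runs
        )
        if not dominated:
            front.append(r)
    # Sort by L0 so the front can be plotted as a curve.
    return sorted(front, key=lambda r: r[1])
```

With something like this, one would either report the reconstruction scores of the pair returned by `closest_l0_pair`, or compute `pareto_front` separately for each architecture and compare the two curves over the same L0 range.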