Good point. Firstly, the mean L0 of the experiment and the baseline is within a factor of 2, so they're in a reasonably close range. I also added a new set of figures comparing the reconstruction scores for the layer where the experiment and baseline runs have the closest match on L0. Spoiler: the scores are still almost the same at the end of training. You can find it under Experiments-Performance Validation.
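For reference, the layer was picked roughly along these lines (a minimal sketch; the mean-L0 numbers below are placeholders, not the actual values from the runs):

```python
# Sketch of picking the layer with the closest L0 match between runs.
# The mean-L0 values are illustrative placeholders, not real run data.
experiment_l0 = {3: 42.0, 5: 55.0, 9: 61.0}   # layer -> mean L0 (experiment)
baseline_l0   = {3: 70.0, 5: 57.0, 9: 95.0}   # layer -> mean L0 (baseline)

# Choose the layer where the two runs have the most similar mean L0,
# so reconstruction scores are compared at (roughly) matched sparsity.
closest_layer = min(experiment_l0,
                    key=lambda k: abs(experiment_l0[k] - baseline_l0[k]))
print(closest_layer)  # -> 5 with these placeholder numbers
```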
I want to mention that, in my experience, a factor-of-2 difference in L0 makes a pretty huge difference in reconstruction score/L2 norm. IMO, if you want to compare two architectures, you should ideally compare Pareto curves for each architecture, or at least get two datapoints with almost exactly the same L0.
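Something like this is what I mean by comparing Pareto curves (just a sketch with made-up numbers: sweep the sparsity coefficient for each architecture and plot L0 against reconstruction score):

```python
# Rough sketch of a Pareto-curve comparison between two SAE architectures.
# Each point is (mean L0, reconstruction score) from one sparsity-coefficient
# setting; the numbers below are illustrative, not from any actual run.
import matplotlib.pyplot as plt

arch_a = [(20, 0.82), (40, 0.90), (80, 0.95), (160, 0.97)]
arch_b = [(25, 0.85), (50, 0.92), (100, 0.96), (200, 0.98)]

for name, points in [("arch_a", arch_a), ("arch_b", arch_b)]:
    l0s, scores = zip(*points)
    plt.plot(l0s, scores, marker="o", label=name)

plt.xlabel("mean L0")
plt.ylabel("reconstruction score")
plt.legend()
plt.show()
```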
The additional experiment under Experiment-Performance Verification (Figure 11) compares normalized_1 and baseline_1 on layer 5, which have almost identical L0. The result shows no observable difference.