I want to mention that, in my experience, a factor-of-2 difference in L0 makes a pretty huge difference in reconstruction score/L2 norm. IMO, if you want to compare two architectures, you should ideally either compare Pareto curves for each architecture or get two datapoints that have almost exactly the same L0.
The additional experiment under Experiment-Performance Verification (Figure 11) compares normalized_1 and baseline_1 on layer 5, which have almost identical L0. The result showed no observable difference.
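For anyone wanting to do the matched-L0 comparison when the two runs don't land on the same L0, here is a minimal sketch, not the method used in the post. It assumes you have a small sweep of (L0, reconstruction loss) points per architecture and interpolates each curve at a shared target L0. The helper name `loss_at_l0`, the choice to interpolate linearly in log(L0), and all the numbers are hypothetical and only there for illustration.

```python
import numpy as np


def loss_at_l0(l0_values, losses, target_l0):
    """Estimate the loss at target_l0 by interpolating linearly in log(L0).

    Refuses to extrapolate: target_l0 must lie inside the measured range.
    """
    l0 = np.asarray(l0_values, dtype=float)
    loss = np.asarray(losses, dtype=float)
    order = np.argsort(l0)  # np.interp needs increasing x values
    l0, loss = l0[order], loss[order]
    if not (l0[0] <= target_l0 <= l0[-1]):
        raise ValueError("target_l0 is outside the measured L0 range")
    return float(np.interp(np.log(target_l0), np.log(l0), loss))


# Hypothetical sweep results, made up for illustration only.
baseline = {"l0": [10, 20, 40, 80], "mse": [0.40, 0.30, 0.20, 0.10]}
normalized = {"l0": [15, 30, 60], "mse": [0.36, 0.25, 0.15]}

# Compare both architectures at the same L0 instead of at whatever
# L0 each individual run happened to reach.
target = 30
gap = (loss_at_l0(normalized["l0"], normalized["mse"], target)
       - loss_at_l0(baseline["l0"], baseline["mse"], target))
print(f"loss gap (normalized - baseline) at L0={target}: {gap:+.4f}")
```

Interpolation is only a stand-in for actually training runs at matched L0, so it is most trustworthy when the sweep points are dense around the target.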