Great post. I wonder how you’d determine a “reasonable” maximum epsilon to use in adversarial training. Does performance on normal examples get worse as epsilon increases?
An interesting question! I looked at “Towards Deep Learning Models Resistant to Adversarial Attacks” to see what the authors had to say about this. If I’m interpreting their Figure 6 correctly, there’s a negligible increase in error rate as epsilon increases, and then at some point the error rate starts swooping up toward 100%. The transition seems to be about where the perturbed images start being able to fool humans (or perhaps slightly before), so you can’t really blame the model for being fooled in that case. If I had to pick an epsilon to train with, I would pick one just below the transition point, where robustness is maximized without getting into the crazy zone. A rough sketch of what that training loop looks like is below.
All this is the result of a cursory inspection of a couple of papers. There’s about a 30% chance I’ve misunderstood.
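To make that concrete, here is a minimal sketch of L-infinity PGD adversarial training with a fixed maximum epsilon, written in a PyTorch style. The `model`, `loader`, and `optimizer` names, the assumption that inputs live in [0, 1], and the step-size heuristic are my own additions for illustration, not anything taken from the paper.

```python
# Sketch of L-infinity PGD adversarial training with a fixed maximum epsilon.
# Assumes `model`, `loader`, and `optimizer` already exist and inputs are in [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps):
    """Generate adversarial examples inside an L-infinity ball of radius eps."""
    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer, eps, steps=10):
    """One epoch of training on PGD-perturbed inputs at a chosen maximum epsilon."""
    alpha = 2.5 * eps / steps  # common step-size heuristic (an assumption, not from the paper)
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y, eps, alpha, steps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```

The question about picking epsilon then amounts to choosing the `eps` argument: sweep it, watch clean accuracy, and stop just below the point where the error rate starts swooping up.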