Nice paper! I found it quite insightful. Here are some key extracts:
Improving adversarial robustness by classifying several down-sampled noisy images at once:
“Drawing inspiration from biology [eye saccades], we use multiple versions of the same image at once, downsampled to lower resolutions and augmented with stochastic jitter and noise. We train a model to classify this channel-wise stack of images simultaneously. We show that this by default yields gains in adversarial robustness without any explicit adversarial training.”
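To make the idea concrete, here is a minimal sketch of what building such a channel-wise stack might look like. The paper only describes the scheme at a high level, so the specific resolutions, noise scale, and jitter range below are illustrative assumptions, not the authors' exact pipeline:

```python
import torch
import torch.nn.functional as F

def multi_resolution_stack(img, resolutions=(32, 16, 8), noise_std=0.05, max_jitter=2):
    """img: [B, 3, H, W] in [0, 1]. Returns [B, 3 * len(resolutions), H, W].

    Hypothetical helper: resolutions, noise_std, and max_jitter are
    illustrative choices, not values from the paper.
    """
    _, _, H, W = img.shape
    copies = []
    for r in resolutions:
        # Stochastic jitter: shift the image by a few pixels.
        dx, dy = (int(torch.randint(-max_jitter, max_jitter + 1, (1,))) for _ in range(2))
        x = torch.roll(img, shifts=(dy, dx), dims=(2, 3))
        # Downsample to a lower resolution, then back up so all copies align spatially.
        x = F.interpolate(x, size=(r, r), mode="bilinear", align_corners=False)
        x = F.interpolate(x, size=(H, W), mode="bilinear", align_corners=False)
        # Stochastic noise on each copy.
        x = (x + noise_std * torch.randn_like(x)).clamp(0, 1)
        copies.append(x)
    # The classifier's first layer then takes 3 * len(resolutions) input channels.
    return torch.cat(copies, dim=1)
```

The key point is that the classifier sees all the noisy, jittered, multi-resolution views simultaneously as extra input channels, so an adversarial perturbation has to survive every view at once.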
Improving adversarial robustness by using an ensemble of intermediate layer predictions:
“Using intermediate layer predictions. We show experimentally that a successful adversarial attack on a classifier does not fully confuse its intermediate layer features (see Figure 5). An image of a dog attacked to look like e.g. a car to the classifier still has predominantly dog-like intermediate layer features. We harness this de-correlation as an active defense by CrossMax ensembling the predictions of intermediate layers. This allows the network to dynamically respond to the attack, forcing it to produce consistent attacks over all layers, leading to robustness and interpretability.”
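For intuition, here is a minimal sketch of a CrossMax-style aggregation over per-layer logits. The extract above does not spell out the algorithm, so the normalization steps below (subtracting the per-layer max, then the per-class max, then taking a per-class median across layers) are one plausible reading rather than a verified reproduction of the paper's definition:

```python
import torch

def crossmax(layer_logits):
    """layer_logits: [L, C] — one logit vector per intermediate-layer prediction head.

    Sketch of a robust consensus across layers; see the paper for the
    authoritative CrossMax definition.
    """
    # Remove each layer's own maximum so no single layer dominates by scale.
    z = layer_logits - layer_logits.max(dim=1, keepdim=True).values
    # Remove each class's maximum across layers so no single class can be
    # pushed up by fooling just one layer.
    z = z - z.max(dim=0, keepdim=True).values
    # Median across layers: the winning class must score well in most layers.
    return z.median(dim=0).values  # shape [C]
```

Under this reading, an attack can no longer succeed by flipping the final layer's prediction alone; it has to produce consistently car-like features across a majority of layers, which matches the robustness argument quoted above.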