Here is a paper that addresses using activation functions that bound the so-called “open space”:

Improved Adversarial Robustness by Reducing Open Space Risk via Tent Activations

According to the paper:

“We hypothesize that adversarial attacks exploit the open space risk of classic monotonic activation functions. This paper introduces the tent activation function with bounded open space risk and shows that tents make deep learning models more robust to adversarial attacks. We demonstrate on the MNIST dataset that a classifier with tents yields an average accuracy of 91.8% against six white-box adversarial attacks, which is more than 15 percentage points above the state of the art.”
Basically, forcing the affine transformation on every unbounded polytope to become zero at extreme values improves adversarial robustness.
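For concreteness, here is a minimal PyTorch sketch of a tent-style activation. It assumes the form max(0, δ - |x|) with a learnable half-width δ, which is my reading of the paper (the exact parameterization there may differ); the key property is that pre-activations far from zero in either direction are mapped back to zero, unlike ReLU.

```python
import torch
import torch.nn as nn

class Tent(nn.Module):
    """Tent-style activation: max(0, delta - |x|) with a learnable half-width delta.

    Unlike ReLU, the output falls back to zero once |x| exceeds delta, so extreme
    pre-activations in either direction contribute nothing to the next layer.
    """
    def __init__(self, delta: float = 1.0):
        super().__init__()
        # Scalar learnable width, shared across units (a simplifying assumption).
        self.delta = nn.Parameter(torch.tensor(delta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.clamp(self.delta - x.abs(), min=0.0)


if __name__ == "__main__":
    act = Tent(delta=1.0)
    x = torch.tensor([-3.0, -0.5, 0.0, 0.5, 3.0])
    # Pre-activations with |x| >= delta map to zero; only the region near 0 is active.
    print(act(x).detach())  # -> 0.0, 0.5, 1.0, 0.5, 0.0
```

Swapping such a module in place of nn.ReLU in a standard MNIST classifier captures the spirit of the idea, though details such as how δ is initialized or constrained would follow the paper.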
By the way, although the tent activation function prevents monotonic growth in the direction perpendicular to the decision hyperplane, I haven’t heard of any activation function that prevents the neuron from being active when the input goes too far out of distribution in a direction parallel to the hyperplane. It might be interesting to explore that angle.
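To make that concrete: a neuron only ever sees the scalar pre-activation w·x + b, so moving the input along any direction v with w·v = 0 (i.e., parallel to the neuron's hyperplane) leaves the pre-activation, and therefore the tent output, unchanged, no matter how far out of distribution the input travels. A standalone numeric check (the `tent` helper below assumes the same max(0, δ - |x|) form as the sketch above):

```python
import torch

def tent(x: torch.Tensor, delta: float = 1.0) -> torch.Tensor:
    # Same assumed tent shape as above: max(0, delta - |x|).
    return torch.clamp(delta - x.abs(), min=0.0)

w = torch.tensor([1.0, 0.0])  # neuron weights; its hyperplane is w.x + b = 0
b = torch.tensor(0.2)         # neuron bias

x = torch.tensor([0.3, 0.0])  # a nominal input
v = torch.tensor([0.0, 1.0])  # direction parallel to the hyperplane (w @ v == 0)

for t in [0.0, 10.0, 1e6]:
    pre = w @ (x + t * v) + b  # pre-activation is independent of t
    print(f"t={t:>9.1f}  tent(pre)={tent(pre).item():.2f}")  # stays 0.50 for every t
```

Any activation that also deactivated along those parallel directions would have to depend on more than the single projection w·x + b, which is presumably why the usual activation-function toolbox does not offer one.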