However, I don’t really see how you’d easily extend the polytope formulation to activation functions that aren’t piecewise linear, like tanh or logits, while the functional analysis perspective can handle that pretty easily. Your functions just become smoother.
Extending the polytope lens to activation functions such as sigmoids, softmax, or GELU is the subject of a paper by Balestriero & Baraniuk (2018): https://arxiv.org/abs/1810.09274
In the case of GELU and some similar activation functions, you’d need to replace the binary spline-code vectors with vectors whose elements take values in (0, 1).
There’s some further explanation in Appendix C!
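To make the soft-code idea a bit more concrete, here is a rough toy sketch (my own illustration, not the paper's construction; the function names are made up): for a ReLU layer the code records which side of the kink each unit is on, while for GELU(x) = x·Φ(x) one natural soft analogue is the gate Φ(x) itself, which lies in (0, 1).

```python
import numpy as np
from scipy.stats import norm

def relu_code(preacts):
    """Binary spline code for a ReLU layer: 1 if a unit is 'on', 0 if 'off'.
    The layer output is preacts * code, and the code identifies which linear
    region (polytope) the input falls into."""
    return (preacts > 0).astype(float)

def gelu_code(preacts):
    """A soft analogue for GELU(x) = x * Phi(x): the gate Phi(x) lies in (0, 1)
    rather than {0, 1}, so region membership becomes graded instead of binary.
    (One natural choice for illustration; see the paper and Appendix C for the
    formal treatment.)"""
    return norm.cdf(preacts)

preacts = np.array([-2.0, -0.1, 0.1, 2.0])
print(relu_code(preacts))            # [0. 0. 1. 1.]
print(gelu_code(preacts))            # values strictly between 0 and 1
print(preacts * relu_code(preacts))  # equals ReLU(preacts)
print(preacts * gelu_code(preacts))  # equals GELU(preacts)
```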
In the functional analysis view, a “feature” is a description of a set of inputs that makes a particular element in a given layer’s function space take activation values close to their maximum value. E.g., some linear combination of neurons in a layer is most activated by pictures of dog heads.
This, indeed, is the assumption we wish to relax.
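For readers less used to that view, here's a toy sketch of what "a direction in a layer that some set of inputs maximally activates" looks like operationally: plain activation maximization by projected gradient ascent on the input. The layer, direction, and hyperparameters are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy layer: activations(x) = ReLU(W @ x + b)
W = rng.normal(size=(32, 64))
b = rng.normal(size=32)

# A candidate "feature": a fixed direction d in the layer's activation space.
d = rng.normal(size=32)
d /= np.linalg.norm(d)

def feature_value(x):
    return d @ np.maximum(W @ x + b, 0.0)

# Activation maximization: gradient ascent on x, constrained to the unit sphere,
# to find an input that makes this direction fire near its maximum.
x = rng.normal(size=64)
x /= np.linalg.norm(x)
for _ in range(200):
    pre = W @ x + b
    grad = W.T @ (d * (pre > 0))  # d/dx of d . ReLU(W x + b)
    x = x + 0.1 * grad
    x /= np.linalg.norm(x)        # project back onto the unit sphere

print(feature_value(x))  # typically far larger than for a random unit-norm input
```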
But there’s a lot more to know about a function $f$ than what $\max\{f(x) \mid x \in X\}$ is.
Agreed!
Scaling up some of the activations in a layer by a constant factor means you’re increasing the norm of the corresponding functions, changing the principal component basis of the layer’s function space. So it shouldn’t be surprising if subsequent layers get messed up by that.
There are many lenses that let us see how unsurprising this experiment was, and this is another one! We only use this experiment to show that it’s surprising if you view features as directions without qualifying that view with a distribution of activation magnitudes over which a direction’s semantics remain valid (called a ‘distribution of validity’ in this post).
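As a toy illustration of why the polytope lens also makes the result unsurprising (random weights, purely a sketch): scaling one layer's activations by a constant moves the activation vector across the next layer's polytope boundaries, because the downstream biases don't scale with it, so the downstream activation pattern (and potentially the predicted class) can change, not just the output magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random net: x -> ReLU(W1 x + b1) -> ReLU(W2 h1 + b2) -> W3 h2 + b3
W1, b1 = rng.normal(size=(16, 8)),  rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 16)), rng.normal(size=16)
W3, b3 = rng.normal(size=(4, 16)),  rng.normal(size=4)

def forward(x, scale=1.0):
    h1 = np.maximum(W1 @ x + b1, 0.0)
    h1 = scale * h1  # scale the first layer's activations by a constant factor
    h2 = np.maximum(W2 @ h1 + b2, 0.0)
    return h2, W3 @ h2 + b3

x = rng.normal(size=8)
for scale in [1.0, 2.0, 10.0]:
    h2, logits = forward(x, scale)
    pattern = (h2 > 0).astype(int)  # which layer-2 units are on: the downstream polytope
    print(scale, pattern, np.argmax(logits))
# Because the biases b2, b3 do not scale along with h1, larger scales typically flip
# some layer-2 units on or off (a different polytope), and the argmax can change too.
```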
Thanks for your comment!