At least how I would put this—I don’t think the important part is that NNs are literally almost linear, when viewed as input-output functions. More like, they have linearly represented features (i.e. directions in activation space, either in the network as a whole or at a fixed layer), or there are other important linear statistics of their weights (linear mode connectivity) or activations (linear probing).
Maybe beren can clarify what they had in mind, though.
Yes. The idea is that the latent space of the neural network’s ‘features’ is ‘almost linear’, which is reflected in the linear-ish properties of both the weights and the activations. Not that the literal I/O mapping of the NN is linear, which is clearly false.
More concretely, as an oversimplified version of what I am saying, it might be possible to think of neural networks as a combined encoder and decoder to a linear vector space. I.e. we have a nonlinear function f which encodes the input x to a latent space z, and a nonlinear function g which decodes it to the output y, i.e. f(x) = z and g(z) = y. We then hypothesise that the latent space z is approximately linear, such that we can perform addition and weighted sums of zs, as well as scaling individual directions in z, and these get decoded to outputs which correspond to sums or scalings of the ‘natural’ semantic features we should expect in the input or output.
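To make this concrete, here’s a toy sketch (not a real network — f and g are just hand-picked nonlinear maps, chosen so the latent space is exactly linear). Addition and scaling in z-space decode to ‘natural’ operations on the inputs, which is the structure the hypothesis claims holds approximately for trained NNs:

```python
import numpy as np

# Toy encoder/decoder picture: f encodes inputs into a latent space z,
# g decodes latents back out. With f = log and g = exp, the latent space
# is exactly linear even though the I/O mapping x -> y is nonlinear.

def f(x):   # nonlinear "encoder": x -> z
    return np.log(x)

def g(z):   # nonlinear "decoder": z -> y
    return np.exp(z)

x1, x2 = np.array([2.0, 8.0]), np.array([4.0, 2.0])
z1, z2 = f(x1), f(x2)

# Linear operations in z-space...
z_sum = z1 + z2       # addition of latents
z_scaled = 2.0 * z1   # scaling a latent direction

# ...decode to "natural" operations on the inputs:
y_sum = g(z_sum)        # elementwise product x1 * x2
y_scaled = g(z_scaled)  # elementwise square x1 ** 2
```

In a real network f and g are learned and the latent linearity is only approximate, which is why things like activation steering and feature addition work imperfectly rather than exactly.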