Hm, this makes me wonder: so if I understand correctly, if you take an eigenvector v of the Hessian with a large eigenvalue, that corresponds to a feature the network has learned is important for its loss. More specifically, it corresponds to parameters of the network (i.e. an axis in networkspace) that measure features which correlate in imagespace.
So the eigenvector v doesn’t give you the features directly in imagespace, it gives you the network parameters which “measure” the feature? I wonder if one could translate this to imagespace. Taking a stab at it: given an image x, v is by definition the parameter axis that measures the feature, so how much the output for x changes as Θ moves along v should be proportional to the feature’s presence in x? Ergo, v⋅∇_v f(x,Θ+v) should measure the extent to which x exhibits the feature?
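Concretely, something like this is what I have in mind (just a rough sketch, not anything from the post; I’m assuming the output f(x,Θ) is a scalar, that Θ and v live in the flattened parameter space, and the PyTorch naming is my own):

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def feature_score(model, x, v, theta):
    """Rough estimate of v⋅∇_v f(x,Θ+v) for a single input x.

    `theta` and `v` are flattened parameter vectors (v a Hessian eigenvector);
    assumes model(x) returns a scalar. Illustrative sketch, my own naming.
    """
    params = list(model.parameters())
    vector_to_parameters(theta + v, params)       # move the weights to Θ + v
    out = model(x)                                # f(x, Θ + v)
    grads = torch.autograd.grad(out, params)      # ∂f/∂Θ at Θ + v
    score = torch.dot(v, parameters_to_vector(grads))
    vector_to_parameters(theta, params)           # restore the original weights
    return score.item()
```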
Not sure if this is useful, or relevant to your post. Maybe it’s something I should experiment with.
So the eigenvector v doesn’t give you the features directly in imagespace, it gives you the network parameters which “measure” the feature?
Nope, you can straightforwardly read off the feature in imagespace, I think. Remember, the eigenvector doesn’t just show you which parameters “form” the feature through linear combination, it also shows you exactly what that linear combination is. If your eigenvector is (2, 0, −3), that means the feature in imagespace looks like taking twice the activations of the node connected to Θ_1, plus −3 times the activations of the node connected to Θ_3.
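As a toy illustration of the single-linear-layer case (my own setup, just to make the arithmetic concrete):

```python
import numpy as np

# Toy single linear layer f(x, Θ) = Θ·x, so parameter Θ_i multiplies input node x_i.
v = np.array([2.0, 0.0, -3.0])   # Hessian eigenvector in parameter space

def feature_value(x):
    # The linear combination the eigenvector specifies: 2 * (activation feeding Θ_1)
    # + 0 * (activation feeding Θ_2) - 3 * (activation feeding Θ_3).
    # With a single linear layer those activations are just the inputs, so the
    # feature in imagespace is the direction v itself.
    return v @ x

x = np.array([0.5, 0.1, -0.2])
print(feature_value(x))   # 2*0.5 + 0*0.1 - 3*(-0.2) = 1.6
```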
Ergo, v⋅∇_v f(x,Θ+v) should measure the extent to which x exhibits the feature?
We’re planning to test the connection between the orthogonal features and the actual training data through something similar to this, actually, yes. See this comment and the math by John it’s replying to.
Hmm, I suppose in the single-linear-layer case, your way of transferring it to imagespace is equivalent to mine, whereas in the multi-nonlinear-layer case, I am not sure which generalization is the most appropriate.
Your way of doing it basically approximates the network to first order in the parameter changes / second order in the loss function. That’s the same as the method I’m proposing above really, except you’re changing the features to account for the chain rule acting on the layers in front of them. You’re effectively transforming the network into an equivalent one that has a single linear layer, with the entries of ∇_v f(x,Θ) as the features.
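In code, the first-order picture I mean looks roughly like this (a sketch under my reading, with the features taken to be the entries of the parameter gradient; naming is mine, not from the post):

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def linearized_output(model, theta, x, delta):
    """First-order ("equivalent single linear layer") view of the network at Θ.

    Returns (exact, approx) with approx = f(x, Θ) + delta·∇f(x, Θ): a linear
    model whose features for input x are the entries of the parameter gradient.
    Assumes model(x) is a scalar; illustrative sketch only.
    """
    params = list(model.parameters())

    vector_to_parameters(theta, params)
    out = model(x)
    g = parameters_to_vector(torch.autograd.grad(out, params))
    approx = out.detach() + torch.dot(delta, g)       # the "single linear layer"

    vector_to_parameters(theta + delta, params)       # exact output at Θ + delta
    with torch.no_grad():
        exact = model(x)
    vector_to_parameters(theta, params)               # restore the original weights
    return exact, approx
```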
That’s fine to do when you’re near a global optimum (the case discussed in the main body of this post), and for tiny changes it’ll hold even in general. But for a broader, layer-by-layer understanding of the dynamics, I think insisting on the transformation to imagespace might not be so productive.
Note that imagespace ≠ “thing that is interpretable”. You can recognise a dog head detector fine just by looking at its activations, no need to transpose it into imagespace somehow.
(Wait, I say “imagespace” due to thinking too much about image classifiers as the canonical example of a neural network, but of course other inputs can be given to the NN too.)
(And further, it seems like one could identify the feature in pixelspace by taking the gradient of v⋅∇_v f(x,Θ+v) with respect to the pixels? Might be useful for interpretability? Not sure.)
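Roughly, I mean something like this (again only a sketch with my own naming, assuming a scalar output and a single input tensor x):

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def feature_saliency(model, x, v, theta):
    """Gradient of v⋅∇_v f(x,Θ+v) with respect to the pixels of x.

    Returns a tensor shaped like x: a rough map of where in pixelspace the
    feature picked out by the Hessian eigenvector v lives. Sketch only.
    """
    params = list(model.parameters())
    vector_to_parameters(theta + v, params)             # evaluate at Θ + v
    x = x.clone().requires_grad_(True)
    out = model(x)                                      # f(x, Θ + v), scalar
    grads = torch.autograd.grad(out, params, create_graph=True)
    score = torch.dot(v, parameters_to_vector(grads))   # v dotted with ∂f/∂Θ
    (pixel_grad,) = torch.autograd.grad(score, x)       # d(score) / d(pixels)
    vector_to_parameters(theta, params)                 # restore the original weights
    return pixel_grad.detach()
```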