What a small world—I've been thinking up a very similar transparency tool for the past two weeks. The function f from inputs to activations-of-every-neuron isn't linear, but it is differentiable, i.e. approximately linear near an input (near in input space, not in pixel coordinates!). The Jacobian J at an input x maps a normal distribution 𝓝(x,Σ) to its linearized image 𝓝(f(x),JΣJᵀ), and the cross-covariance between the image and the original is JΣ, which is exactly J when Σ is the identity, right? Then if you can permute a submatrix of JΣJᵀ into a block-diagonal matrix, you've found two modules that work with different properties of x. If the user gives you two modules, you could find an input where they work with different properties, then vary that input in ways that change activations in one module but not the other, to show the user what each module does. And by something like a smooth count of the near-zero entries in the matrix, you could differentiably measure the network's modularity, then train it to be more modular.
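Something like this minimal JAX sketch is what I have in mind for the Jacobian/modularity part. The tiny two-layer `all_activations` network, the Gaussian soft zero-count, and all the shapes are just stand-ins I picked to make it runnable, not a definitive implementation:

```python
import jax
import jax.numpy as jnp

def all_activations(params, x):
    """Concatenate every neuron's activation into one vector: this is f."""
    (W1, b1), (W2, b2) = params
    h1 = jnp.tanh(W1 @ x + b1)
    h2 = jnp.tanh(W2 @ h1 + b2)
    return jnp.concatenate([h1, h2])

def local_covariance(params, x, sigma):
    """Covariance J Σ Jᵀ of the linearized image of 𝓝(x, σ²·I)."""
    J = jax.jacobian(all_activations, argnums=1)(params, x)   # (n_neurons, n_inputs)
    return (sigma ** 2) * J @ J.T

def modularity_score(params, x, sigma, eps=1e-3):
    """Differentiable count of near-zero entries of J Σ Jᵀ; usable as a training bonus."""
    A = local_covariance(params, x, sigma)
    return jnp.sum(jnp.exp(-(A / eps) ** 2))   # smooth indicator that |A_ij| is small

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (
    (0.5 * jax.random.normal(k1, (8, 4)), jnp.zeros(8)),   # toy layer 1
    (0.5 * jax.random.normal(k2, (6, 8)), jnp.zeros(6)),   # toy layer 2
)
x = jax.random.normal(k3, (4,))
print(modularity_score(params, x, sigma=0.1))
```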
Train a (GAN-)generator on the training inputs and attach it to the front of the network—now you know the input distribution is uniform, the reciprocals of J's singular values tell you the density of the pushforward output distribution along the corresponding singular vectors, and the inputs you show the user are all in-distribution.
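A sketch of that generator-in-front idea, with a toy generator and a toy activation map standing in for a trained GAN and network; the density reading comes straight off the singular values of the composed Jacobian:

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
kg, kf, kz = jax.random.split(key, 3)
Wg = 0.5 * jax.random.normal(kg, (4, 3))    # toy generator weights: latent (3,) -> input (4,)
Wf = 0.5 * jax.random.normal(kf, (10, 4))   # toy "all activations" weights: input (4,) -> (10,)

def generator(z):
    return jnp.tanh(Wg @ z)

def all_activations(x):
    return jnp.tanh(Wf @ x)

def composed(z):
    return all_activations(generator(z))

# With a uniform latent, the density of the pushforward distribution along each
# output singular vector scales like the reciprocal of that singular value.
z = jax.random.uniform(kz, (3,))
J = jax.jacobian(composed)(z)                # (10, 3)
s = jnp.linalg.svd(J, compute_uv=False)
print("density factors along singular vectors:", 1.0 / (s + 1e-8))
```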
And I thought this up shortly before learning terms like "cross-covariance matrix", so please point out existing terms that describe parts of this. Or expand on it. Or run away with it; it would be good to get scooped.
With differential geometry, there's probably a way to translate these properties between points. And a way to analyze the geometry of the training distribution: train the generator to be locally injective and give it an input space uniformly distributed on the unit circle; whether it trains successfully tells you whether the training distribution contains a cycle. Try different input topologies to nail down the distribution's topology. But just as J's rank tells you the dimension of the input distribution once you give the generator enough numbers to work with, a powerful enough generator ought to reveal the entire topology in one training run...
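Here's a sketch of the two ingredients that probe would need, a latent uniform on the unit circle and a local-injectivity penalty on the generator's Jacobian; the actual generator-matching loss (GAN, MMD, whatever) is left out, and the toy generator only exists to make it runnable:

```python
import jax
import jax.numpy as jnp

def circle_latent(key, n):
    """n latent points uniform on the unit circle."""
    theta = jax.random.uniform(key, (n,), minval=0.0, maxval=2 * jnp.pi)
    return jnp.stack([jnp.cos(theta), jnp.sin(theta)], axis=-1)   # (n, 2)

def local_injectivity_penalty(generator, z_batch, floor=1e-2):
    """Penalize latent points where the generator's Jacobian loses rank,
    i.e. where the map stops being locally injective."""
    def smallest_singular_value(z):
        J = jax.jacobian(generator)(z)
        return jnp.linalg.svd(J, compute_uv=False)[-1]
    s_min = jax.vmap(smallest_singular_value)(z_batch)
    return jnp.mean(jax.nn.relu(floor - s_min))

# Toy generator; the real probe would train it against the data distribution
# and watch whether the combined loss can reach zero.
W = 0.5 * jax.random.normal(jax.random.PRNGKey(0), (5, 2))

def toy_generator(z):
    return jnp.tanh(W @ z)

z = circle_latent(jax.random.PRNGKey(1), 64)
print(local_injectivity_penalty(toy_generator, z))
```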
If the generator's input distribution is uniform (on a hypercube, say), Σ is a multiple of the identity, so JΣJᵀ = σ²JJᵀ and the left singular vectors of J are also the left (and, transposed, the right) singular vectors of JΣJᵀ. Is that useful?
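A quick numerical check of that relationship, with a random matrix standing in for a real Jacobian:

```python
import jax
import jax.numpy as jnp

J = jax.random.normal(jax.random.PRNGKey(0), (6, 3))
sigma = 0.3
U, S, _ = jnp.linalg.svd(J, full_matrices=False)

A = (sigma ** 2) * J @ J.T   # J Σ Jᵀ with Σ = σ²·I
# Each left singular vector u_j of J is an eigenvector of J Σ Jᵀ with
# eigenvalue σ²·s_j², so it's also a left (and right) singular vector of A.
print(jnp.allclose(A @ U, U * (sigma ** 2) * S ** 2, atol=1e-4))
```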
I'd be curious to know whether something like this actually works in practice. It certainly shouldn't work all the time, since it's tackling the #P-hard part of the problem pretty directly, but if it works well in practice, that would solve a lot of problems.