Since nobody here has made the connection yet, I feel obliged to write something, late as I am.
To make the problem more tractable, suppose we restrict our set of coordinate changes to ones where the resulting functions can still (approximately) be written as a neural network. (These are usually called “reparameterizations.”) This occurs when multiple neural networks implement (approximately) the same function; they’re redundant. One trivial example is the invariance of ReLU networks to scaling one layer’s weights by a positive constant and the next layer’s by the inverse of that constant.
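For concreteness, here’s a minimal NumPy sketch of that rescaling invariance. The names (W1, b1, W2, b2, the unit index i, and the constant c) are purely illustrative, not taken from any particular codebase:

```python
import numpy as np

# Illustrative two-layer ReLU network; shapes and names are arbitrary.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Rescale one hidden unit: multiply its incoming weights and bias by c > 0
# and its outgoing weights by 1/c. ReLU is positively homogeneous
# (relu(c*z) = c*relu(z) for c > 0), so the composed function is unchanged.
c, i = 3.7, 2
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[i] *= c
b1s[i] *= c
W2s[:, i] /= c

x = rng.normal(size=4)
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1s, b1s, W2s, b2))
```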
Then, in the language of parametric statistics, this phenomenon has a name: non-identifiability! Lucky for us, there’s a decent chunk of literature on identifiability in neural networks out there. At first glance, the result seems somewhat disappointing: ReLU networks are identifiable up to permutation and rescaling symmetries.
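The permutation half of that symmetry group is just relabeling hidden units. A sketch in the same illustrative setup as above (again, nothing here is from the papers themselves):

```python
import numpy as np

# Permutation symmetry: permuting rows of (W1, b1) and the matching
# columns of W2 relabels the hidden units without changing the function.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(6, 3)), rng.normal(size=6)
W2, b2 = rng.normal(size=(2, 6)), rng.normal(size=2)

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

perm = rng.permutation(6)
x = rng.normal(size=3)
assert np.allclose(
    f(x, W1, b1, W2, b2),
    f(x, W1[perm], b1[perm], W2[:, perm], b2),
)
```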
But there’s a catch: this only holds outside a set of measure zero in parameter space. (The other catch is that the results don’t cover approximate symmetries.) This matters because there are reasons to think real neural networks are pushed close to this set during training. The measure-zero set corresponds to “reducible” or “degenerate” neural networks, i.e. those whose function can be expressed with fewer parameters. And hey, funny enough, aren’t neural networks quite easily pruned?
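To illustrate what “reducible” means here (this toy is my own, not taken from the identifiability literature): a network with two hidden units sharing the same incoming weights computes a function that a strictly smaller network also computes, since the clones can be merged, and that merging is essentially what pruning exploits:

```python
import numpy as np

# A reducible network: units 2 and 3 are exact clones, so their ReLU
# outputs coincide and their outgoing weights can be summed into one unit.
rng = np.random.default_rng(2)
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W1[3], b1[3] = W1[2], b1[2]  # make unit 3 a clone of unit 2
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

def f(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Merge: drop unit 3 and fold its outgoing column onto unit 2's.
W1r, b1r = np.delete(W1, 3, axis=0), np.delete(b1, 3)
W2r = np.delete(W2, 3, axis=1)
W2r[:, 2] += W2[:, 3]

x = rng.normal(size=4)
assert np.allclose(f(x, W1, b1, W2, b2), f(x, W1r, b1r, W2r, b2))
```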
In other parts of the literature, this problem has been phrased differently, under the framework of “structure-function symmetries” or “canonicalization.” It’s also often covered when discussing the concepts of “inverse stability” and “stable recovery.” For more on this, including a review of the literature, I highly recommend Matthew Farrugia-Roberts’ excellent master’s thesis on the topic.
(Separately, I’m currently working on the issue of coordinate-free sparsity. I believe I have a solution to this—stay tuned, or reach out if interested.)
That’s a great connection which I had indeed not made, thanks! Strong-upvoted.