Great discussion here!
Leaving a meta-comment about priors: on one hand, almost-linear features seem very plausible (a priori) for almost-linear neural networks; on the other hand, linear algebra is probably the single mathematical tool I’d expect ML researchers to be most well-versed in, and the fact that we haven’t found a “smoking gun” at this point, despite so much potential scrutiny, makes me suspicious.
While this is a very natural hypothesis to test, and I’m excited for people to do so, it seems possible that the field’s familiarity with linear methods is a hammer that makes everything look like a nail. It’s easy to focus on linear interpretability because the alternative seems too hard (a response I often get), but I think this is wrong: there are tractable directions in the nonlinear case too, as long as you’re willing to go slightly further afield.
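To make that concrete, here is a minimal sketch of the kind of thing I mean. Everything in it is made up for illustration (the "activations" are synthetic, the probes are off-the-shelf sklearn models, not anything from the post): a toy feature encoded radially rather than along a direction, where a linear probe does little better than guessing the majority class while a small nonlinear probe recovers it easily.

```python
# Toy illustration: a feature that a linear probe can't see but a
# shallow nonlinear probe can. Synthetic data, hypothetical setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
acts = rng.normal(size=(4000, 16))  # stand-in for hidden activations
# Label depends on the *radius* in the first two dims, so no single
# linear direction separates the classes.
labels = (np.linalg.norm(acts[:, :2], axis=1) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, random_state=0)

linear_probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
mlp_probe = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                          random_state=0).fit(X_tr, y_tr)

print(f"linear probe acc: {linear_probe.score(X_te, y_te):.2f}")
print(f"mlp probe acc:    {mlp_probe.score(X_te, y_te):.2f}")
```

This is obviously a cartoon, but the broader point stands: "nonlinear" does not automatically mean "intractable," and small nonlinear probes are one of several cheap things to try before concluding a feature isn't there.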
I have some object-level skepticism here as well, but it was taking me too long to write up, so that will have to wait. This is definitely a topic worth spending more time on—appreciate the post!