Charlie Steiner comments on The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

Charlie Steiner 15 Jun 2024 12:23 UTC
LW: 2 AF: 2
This was super interesting. I hadn’t really thought about the tension between SLT (singular learning theory) and superposition before, but this work sits right in the middle of it.

Like, there’s nothing logically inconsistent about the best local basis for the weights being undercomplete while the best basis for the activations is overcomplete. But if both are true, it seems like the relationship to the data distribution has to be quite special (and potentially fragile).