Maybe this is really naive (I just randomly thought of it), and you mention doing some obvious checks, like looking at the singular vectors of the activations, which might rule it out. But could the low-frequency cluster be linked to something simple, like the fact that using ReLUs, GeLUs, etc. biases the neuron activations towards the positive orthant of activation space in terms of magnitude (because the negative components of any vector in the activation basis get cut off)? I wonder if the singular vectors would catch this.
Ah, I did compare it to the mean activations and didn’t find much, alas. Good idea though!
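For anyone wanting to try this kind of check themselves, here is a minimal sketch on synthetic data. It is not the comparison actually run above: the Gaussian pre-activations, the array shapes, and the helper name `mean_direction_alignment` are all assumptions made for illustration. It shows that after a ReLU the activations sit in the positive orthant, and that the top singular vector of the uncentered activation matrix then lines up closely with the mean activation, so a cosine similarity against the mean is a reasonable way to test whether a direction is just this bias.

```python
import numpy as np


def mean_direction_alignment(acts: np.ndarray, candidate: np.ndarray) -> float:
    """Cosine similarity between a candidate direction and the mean activation.

    acts: (n_samples, n_neurons) activation matrix.
    candidate: (n_neurons,) direction to compare against the mean.
    """
    mean = acts.mean(axis=0)
    return float(candidate @ mean / (np.linalg.norm(candidate) * np.linalg.norm(mean)))


# Synthetic stand-in for real model activations (assumption: i.i.d. Gaussian
# pre-activations; a real model's activations will be more structured).
rng = np.random.default_rng(0)
pre_acts = rng.standard_normal((2000, 64))
acts = np.maximum(pre_acts, 0.0)  # ReLU cuts off the negative components

# Top right singular vector of the *uncentered* activation matrix.
_, _, vt = np.linalg.svd(acts, full_matrices=False)
top_singular_vector = vt[0]

# The sign of a singular vector is arbitrary, so compare absolute alignment.
alignment = abs(mean_direction_alignment(acts, top_singular_vector))
print(f"|cos(top singular vector, mean activation)| = {alignment:.3f}")
```

To test a specific direction (for example, one associated with the low-frequency cluster), pass it as `candidate` along with the real activation matrix. A cosine near 1 would suggest the direction is mostly the positive-orthant bias; a small value, as reported in the reply above, suggests something else is going on.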