Hi Lee, if I may ask, when you say “geometric analysis” of the router, do you mean analysis of the parameters or the activations? Are there any papers that perform the sort of analysis you’d like to see done? I’m asking as someone who understands NNs thoroughly but is new to mech interp.
Both of these seem like interesting directions (I had parameters in mind, but params and activations are too closely linked to ignore one or the other). I don’t have a super clear idea, but something like representational similarity analysis (RSA) between SwitchSAEs and regular SAEs could be interesting. This is just one possibility of many, though. I haven’t thought about it for long enough to list many more, but it feels like a direction with low-hanging fruit for sure. For papers, here’s a good place to start for RSA: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3730178/
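To make the RSA suggestion a bit more concrete, here is a minimal sketch, assuming you already have latent activations from two SAEs on the same set of inputs (one row per input). The names `rdm`, `rsa_score`, `acts_a`, and `acts_b` are hypothetical, and the choice of correlation distance plus a Pearson comparison of the two dissimilarity matrices is just one common variant, not something specific to SwitchSAEs.

```python
import numpy as np


def rdm(acts: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix for one model.

    acts has shape (n_inputs, n_features). Entry (i, j) is the correlation
    distance 1 - Pearson r between the activation vectors for inputs i and j.
    Note: a row with zero variance (e.g. all zeros) would produce NaNs.
    """
    return 1.0 - np.corrcoef(acts)


def rsa_score(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Pearson correlation between the upper triangles of the two RDMs.

    Both arrays must cover the same inputs in the same order, but the
    number of features may differ between the two models.
    """
    if acts_a.shape[0] != acts_b.shape[0]:
        raise ValueError("both models must be evaluated on the same inputs")
    upper = np.triu_indices(acts_a.shape[0], k=1)  # skip diagonal and duplicates
    return float(np.corrcoef(rdm(acts_a)[upper], rdm(acts_b)[upper])[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data only: real use would pass SAE latents here.
    acts_a = rng.normal(size=(50, 64))
    acts_b = rng.normal(size=(50, 128))
    print(rsa_score(acts_a, acts_b))
```

Because the comparison happens at the level of input-by-input dissimilarities, the two dictionaries do not need matching sizes or feature orderings, which is what makes this usable across different SAE architectures.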
Thank you very much for your reply. I appreciate the commentary and direction.