Who else is actively pursuing sparse feature circuits in addition to Sam Marks? I’m curious because the code breaks in the forward pass of the linear layer on GPT-2, since its dimensions differ from Pythia’s (768).
SAEs are model-specific. You need Pythia SAEs to investigate Pythia. I don’t have a comprehensive list, but you can look at the sparse autoencoder tag on LW for relevant papers.
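To illustrate why the forward pass breaks, here's a minimal sketch (not the actual repo code; the Pythia width of 512 and the dictionary size are assumptions for illustration): an SAE's encoder and decoder are linear layers sized to the model it was trained on, so activations from a model with a different residual-stream width won't fit.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy SAE: encoder maps activations to sparse features, decoder reconstructs."""
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)  # activation -> feature coefficients
        self.decoder = nn.Linear(d_dict, d_model)  # features -> reconstructed activation

    def forward(self, x):
        feats = torch.relu(self.encoder(x))
        return self.decoder(feats), feats

# SAE sized for a Pythia-like residual stream (d_model = 512 assumed here)
sae = SparseAutoencoder(d_model=512, d_dict=4096)

pythia_acts = torch.randn(4, 512)  # matches the SAE's expected width
gpt2_acts = torch.randn(4, 768)    # GPT-2 small residual stream is 768-wide

sae(pythia_acts)  # works
try:
    sae(gpt2_acts)  # shape mismatch in the encoder's matmul
except RuntimeError as e:
    print(e)
```

The point is just that the SAE weights hard-code the source model's hidden size, so plugging GPT-2 activations into Pythia SAEs (or vice versa) fails at the first linear layer.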
Thank you for the feedback, and thanks for this.