SAEs are model-specific: to investigate Pythia, you need SAEs trained on Pythia. I don’t have a comprehensive list, but you can look at the sparse autoencoder tag on LW for relevant papers.