This was an upskilling project I worked on over the past few months. Even though I don’t think it is anything fancy or highly relevant to the research around SAEs, I find it valuable since I learned a lot and refined my understanding of how mech interp fits into the bigger picture of AI Safety.
In the mid-term future I hope to engage in more challenging and impactful projects.
P.S.: brutally honest feedback is completely welcome :p
Good work! I’m sure you learned a lot while doing this, and I’m a big fan of people publishing artifacts produced during upskilling. ARENA just updated its SAE content, so that might also be a good next step for you!