Thanks for the feedback! Yeah, I was also surprised that SAEs seem to work on ViTs pretty much straight out of the box (I didn’t even need to play around with the hyperparameters too much)! As I mentioned in the post, I think it would be really interesting to train on a much larger (more typical) dataset—similar to the dataset the CLIP model was trained on.
I also agree that I probably should have emphasised the “guess the image” game as a result rather than an aside—I’ll bear that in mind for future posts!