samshap comments on Stitching SAEs of different sizes

samshap 16 Jul 2024 21:49 UTC
1 point
Thanks for sharing that study. It looks like your team is already well-versed in this subject!
You wouldn’t want something that’s too hard to extract, but I think restricting yourself to a single encoder layer is too conservative—LLMs don’t have to be able to fully extract the information from a layer in a single step.
I’d be curious to see how much closer a two-layer encoder would get to the ITO results.
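
To make the comparison concrete, here is a minimal sketch of the two encoder shapes being contrasted. It is not the post's implementation: the function names, the dimensions, the extra hidden width `d_hidden`, and the random untrained weights are all assumptions for illustration. The only point is that a two-layer encoder gets one extra nonlinear step before producing feature activations, while the decoder stays linear.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, v):
    # W is a list of rows; computes W @ v + b.
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def init_matrix(rows, cols, rng):
    # Hypothetical init: Gaussian weights scaled by 1/sqrt(fan_in).
    scale = cols ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def encode_one_layer(x, W_enc, b_enc):
    # Standard single-layer SAE encoder: f = ReLU(W_enc x + b_enc).
    return relu(linear(W_enc, b_enc, x))

def encode_two_layer(x, W1, b1, W2, b2):
    # Two-layer variant: one extra hidden ReLU layer before the features.
    hidden = relu(linear(W1, b1, x))
    return relu(linear(W2, b2, hidden))

def decode(f, W_dec, b_dec):
    # The decoder is linear in both cases: x_hat = W_dec f + b_dec.
    return linear(W_dec, b_dec, f)

# Toy sizes, chosen only so the example runs quickly (assumptions).
rng = random.Random(0)
d_model, d_hidden, d_sae = 4, 8, 16

x = [rng.gauss(0.0, 1.0) for _ in range(d_model)]

W_enc = init_matrix(d_sae, d_model, rng)
W1 = init_matrix(d_hidden, d_model, rng)
W2 = init_matrix(d_sae, d_hidden, rng)
W_dec = init_matrix(d_model, d_sae, rng)

f_one = encode_one_layer(x, W_enc, [0.0] * d_sae)
f_two = encode_two_layer(x, W1, [0.0] * d_hidden, W2, [0.0] * d_sae)

x_hat_one = decode(f_one, W_dec, [0.0] * d_model)
x_hat_two = decode(f_two, W_dec, [0.0] * d_model)
```

With trained weights, one would compare the reconstruction error of each encoder against the ITO results on the same decoder dictionary. The sketch above only fixes the shapes and does no training.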