The other baseline would be to compare one L1-trained SAE against another L1-trained SAE. If you see a similar approximate pattern (“1/10 have cossim > 0.9, 1/3 have cossim > 0.8, 1/2 have cossim > 0.7”), that’s not definitive proof that both approaches find “the same kind of features”, but it would strongly suggest it, at least to me.
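To make the comparison concrete, here is a rough sketch of how one might compute those fractions. It assumes each SAE's decoder is available as a NumPy array of shape `(n_features, d_model)` with nonzero rows, and that "cossim" means each feature's best cosine match in the other SAE. The function names `max_cosine_similarities` and `fraction_above` are made up for illustration, not taken from any particular codebase.

```python
import numpy as np


def max_cosine_similarities(dec_a: np.ndarray, dec_b: np.ndarray) -> np.ndarray:
    """For each feature (row) of dec_a, return its highest cosine
    similarity to any feature (row) of dec_b.

    Assumes both arrays have shape (n_features, d_model) and no all-zero rows.
    """
    # Normalize every feature direction to unit length.
    a = dec_a / np.linalg.norm(dec_a, axis=1, keepdims=True)
    b = dec_b / np.linalg.norm(dec_b, axis=1, keepdims=True)
    # Dot products of unit vectors are cosine similarities; keep the best match per row.
    return (a @ b.T).max(axis=1)


def fraction_above(max_sims: np.ndarray, thresholds=(0.9, 0.8, 0.7)) -> dict:
    """Fraction of features whose best match exceeds each threshold."""
    return {t: float((max_sims > t).mean()) for t in thresholds}
```

Running `fraction_above(max_cosine_similarities(l1_a, l1_b))` on two L1-trained SAEs, and then the same call on an L1-trained SAE versus the other method, would give the two sets of numbers to put side by side (`l1_a` and `l1_b` are placeholder names for the decoder matrices).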