
TomasD

Karma: 136

A Bunch of Matryoshka SAEs

Apr 4, 2025, 2:53 PM
21 points
0 comments, 8 min read, LW link

Feature Hedging: Another way correlated features break SAEs

Mar 25, 2025, 2:33 PM
19 points
0 comments, 18 min read, LW link

Toy Models of Feature Absorption in SAEs

Oct 7, 2024, 9:56 AM
49 points
8 comments, 10 min read, LW link

[Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Sep 25, 2024, 9:31 AM
73 points
16 comments, 3 min read, LW link
(arxiv.org)

TomasD’s Shortform

TomasD, Mar 14, 2024, 3:03 PM
1 point
0 comments, LW link