lsgos

Karma: 227

Improving Dictionary Learning with Gated Sparse Autoencoders

Senthooran Rajamanoharan, Arthur Conmy, lsgos, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah and Neel Nanda

25 Apr 2024 18:43 UTC

62 points

35 comments, 1 min read, LW link

(arxiv.org)

[Full Post] Progress Update #1 from the GDM Mech Interp Team

Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár and Vikrant Varma

19 Apr 2024 19:06 UTC

71 points

8 comments, 8 min read, LW link

[Summary] Progress Update #1 from the GDM Mech Interp Team

Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár and Vikrant Varma

19 Apr 2024 19:06 UTC

68 points

0 comments, 3 min read, LW link

Dropout can create a privileged basis in the ReLU output model.

lsgos

28 Apr 2023 1:59 UTC

24 points

3 comments, 5 min read, LW link