Neel Nanda

Karma: 9,853

Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)

Mar 26, 2025, 7:07 PM
81 points
11 comments · 29 min read · LW link
(deepmindsafetyresearch.medium.com)

Good Research Takes are Not Sufficient for Good Strategic Takes

Neel Nanda · Mar 22, 2025, 10:13 AM
224 points
11 comments · 4 min read · LW link
(www.neelnanda.io)

Takeaways From Our Recent Work on SAE Probing

Mar 3, 2025, 7:50 PM
30 points
0 comments · 5 min read · LW link

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research

Feb 24, 2025, 2:17 AM
48 points
1 comment · 7 min read · LW link