Neel Nanda

Karma: 11,426

Thought Anchors: Which LLM Reasoning Steps Matter?

Jul 2, 2025, 8:16 PM
33 points
0 comments, 6 min read
(www.thought-anchors.com)

SAE on activation differences

Jun 30, 2025, 5:50 PM
43 points
2 comments, 5 min read

What We Learned Trying to Diff Base and Chat Models (And Why It Matters)

Jun 30, 2025, 5:17 PM
94 points
2 comments, 7 min read

Agentic Interpretability: A Strategy Against Gradual Disempowerment

Jun 17, 2025, 2:52 PM
16 points
6 comments, 2 min read

Convergent Linear Representations of Emergent Misalignment

Jun 16, 2025, 3:47 PM
64 points
0 comments, 8 min read