Senthooran Rajamanoharan
Karma:
578
Interim Research Report: Mechanisms of Awareness
Josh Engels
,
Neel Nanda
and
Senthooran Rajamanoharan
May 2, 2025, 8:29 PM
43
points
6
comments
8
min read
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
lewis smith
,
Senthooran Rajamanoharan
,
Arthur Conmy
,
CallumMcDougall
,
Tom Lieberum
,
János Kramár
,
Rohin Shah
and
Neel Nanda
Mar 26, 2025, 7:07 PM
113
points
15
comments
29
min read
(deepmindsafetyresearch.medium.com)
Takeaways From Our Recent Work on SAE Probing
Josh Engels
,
Subhash Kantamneni
,
Senthooran Rajamanoharan
and
Neel Nanda
Mar 3, 2025, 7:50 PM
30
points
0
comments
5
min read
SAE Probing: What is it good for?
Subhash Kantamneni
,
Josh Engels
,
Senthooran Rajamanoharan
and
Neel Nanda
Nov 1, 2024, 7:23 PM
33
points
0
comments
11
min read
JumpReLU SAEs + Early Access to Gemma 2 SAEs
Senthooran Rajamanoharan
,
Tom Lieberum
,
nps29
,
Arthur Conmy
,
Vikrant Varma
,
János Kramár
and
Neel Nanda
Jul 19, 2024, 4:10 PM
49
points
10
comments
1
min read
(storage.googleapis.com)
Improving Dictionary Learning with Gated Sparse Autoencoders
Senthooran Rajamanoharan
,
Arthur Conmy
,
lewis smith
,
Tom Lieberum
,
Vikrant Varma
,
János Kramár
,
Rohin Shah
and
Neel Nanda
Apr 25, 2024, 6:43 PM
63
points
38
comments
1
min read
(arxiv.org)
[Full Post] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda
,
Arthur Conmy
,
lewis smith
,
Senthooran Rajamanoharan
,
Tom Lieberum
,
János Kramár
and
Vikrant Varma
Apr 19, 2024, 7:06 PM
79
points
10
comments
8
min read
[Summary] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda
,
Arthur Conmy
,
lewis smith
,
Senthooran Rajamanoharan
,
Tom Lieberum
,
János Kramár
and
Vikrant Varma
Apr 19, 2024, 7:06 PM
73
points
0
comments
3
min read
Case Studies in Reverse-Engineering Sparse Autoencoder Features by Using MLP Linearization
Jacob Dunefsky
,
Philippe Chlenski
,
Senthooran Rajamanoharan
and
Neel Nanda
Jan 14, 2024, 2:06 AM
24
points
0
comments
42
min read
Fact Finding: Do Early Layers Specialise in Local Processing? (Post 5)
Neel Nanda
,
Senthooran Rajamanoharan
,
János Kramár
and
Rohin Shah
Dec 23, 2023, 2:46 AM
18
points
0
comments
4
min read
Fact Finding: How to Think About Interpreting Memorisation (Post 4)
Senthooran Rajamanoharan
,
Neel Nanda
,
János Kramár
and
Rohin Shah
Dec 23, 2023, 2:46 AM
22
points
0
comments
9
min read
Fact Finding: Trying to Mechanistically Understanding Early MLPs (Post 3)
Neel Nanda
,
Senthooran Rajamanoharan
,
János Kramár
and
Rohin Shah
Dec 23, 2023, 2:46 AM
10
points
1
comment
16
min read
Fact Finding: Simplifying the Circuit (Post 2)
Senthooran Rajamanoharan
,
Neel Nanda
,
János Kramár
and
Rohin Shah
Dec 23, 2023, 2:45 AM
25
points
3
comments
14
min read
Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)
Neel Nanda
,
Senthooran Rajamanoharan
,
János Kramár
and
Rohin Shah
Dec 23, 2023, 2:44 AM
106
points
10
comments
22
min read
2
reviews