rajashree
Karma: 106
[Replication] Crosscoder-based Stage-Wise Model Diffing
Anna Soligo, Thomas Read, Oliver Clive-Griffin, dmanningcoe, Chun Hei Yip, rajashree and Jason Gross
22 Mar 2025 18:35 UTC
21 points, 0 comments, 7 min read
Measuring Nonlinear Feature Interactions in Sparse Crosscoders [Project Proposal]
Jason Gross and rajashree
6 Jan 2025 4:22 UTC
19 points, 0 comments, 12 min read
Compact Proofs of Model Performance via Mechanistic Interpretability (arxiv.org)
LawrenceC, rajashree, Adrià Garriga-alonso and Jason Gross
24 Jun 2024 19:27 UTC
96 points, 4 comments, 8 min read