jenny
Karma: 313
jenny’s Shortform
by jenny
Jan 24, 2025, 8:09 PM · 4 points · 0 comments
Attributing to interactions with GCPD and GWPD
by jenny
Oct 11, 2023, 3:06 PM · 20 points · 0 comments · 6 min read
Impact stories for model internals: an exercise for interpretability researchers
by jenny
Sep 25, 2023, 11:15 PM · 29 points · 3 comments · 7 min read
Causal scrubbing: results on induction heads
by LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, Tao Lin, jenny, Ansh Radhakrishnan, Buck and Nate Thomas
Dec 3, 2022, 12:59 AM · 34 points · 1 comment · 17 min read
Causal scrubbing: results on a paren balance checker
by LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, Tao Lin, jenny, Ansh Radhakrishnan, Buck and Nate Thomas
Dec 3, 2022, 12:59 AM · 34 points · 2 comments · 30 min read
Causal scrubbing: Appendix
by LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, jenny, Ansh Radhakrishnan, Buck and Nate Thomas
Dec 3, 2022, 12:58 AM · 18 points · 4 comments · 20 min read
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
by LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, jenny, Ansh Radhakrishnan, Buck and Nate Thomas
Dec 3, 2022, 12:58 AM · 206 points · 35 comments · 20 min read · 1 review