neverix
Karma: 146
Evolutionary prompt optimization for SAE feature visualization
neverix, Daniel Tan, Dmitrii Kharlapenko, Neel Nanda and Arthur Conmy
Nov 14, 2024, 1:06 PM | 21 points | 0 comments | 9 min read
SAE features for refusal and sycophancy steering vectors
neverix, Dmitrii Kharlapenko, Arthur Conmy and Neel Nanda
Oct 12, 2024, 2:54 PM | 29 points | 4 comments | 7 min read
Extracting SAE task features for in-context learning
Dmitrii Kharlapenko, neverix, Neel Nanda and Arthur Conmy
Aug 12, 2024, 8:34 PM | 31 points | 1 comment | 9 min read
Self-explaining SAE features
Dmitrii Kharlapenko, neverix, Neel Nanda and Arthur Conmy
Aug 5, 2024, 10:20 PM | 60 points | 13 comments | 10 min read