Robert_AIZI
Karma: 1,388
SAEs you can See: Applying Sparse Autoencoders to Clustering
Robert_AIZI · Oct 28, 2024, 2:48 PM · 27 points · 0 comments · 10 min read

Comments on Anthropic’s Scaling Monosemanticity
Robert_AIZI · Jun 3, 2024, 12:15 PM · 98 points · 8 comments · 7 min read

Explaining a Math Magic Trick
Robert_AIZI · May 5, 2024, 7:41 PM · 99 points · 10 comments · 5 min read

Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT
Robert_AIZI · Mar 5, 2024, 1:55 PM · 61 points · 24 comments · 10 min read · (aizi.substack.com)

Rating my AI Predictions
Robert_AIZI · Dec 21, 2023, 2:07 PM · 22 points · 5 comments · 2 min read · (aizi.substack.com)

Comparing Anthropic’s Dictionary Learning to Ours
Robert_AIZI · Oct 7, 2023, 11:30 PM · 137 points · 8 comments · 4 min read

Sparse Autoencoders Find Highly Interpretable Directions in Language Models
Logan Riggs, Hoagy, Aidan Ewart and Robert_AIZI · Sep 21, 2023, 3:30 PM · 159 points · 8 comments · 5 min read

Unsafe AI as Dynamical Systems
Robert_AIZI · Jul 14, 2023, 3:31 PM · 11 points · 0 comments · 3 min read · (aizi.substack.com)

AIs teams will probably be more superintelligent than individual AIs
Robert_AIZI · Jul 4, 2023, 2:06 PM · 3 points · 1 comment · 2 min read · (aizi.substack.com)

[Research Update] Sparse Autoencoder features are bimodal
Robert_AIZI · Jun 22, 2023, 1:15 PM · 24 points · 1 comment · 5 min read · (aizi.substack.com)

Explaining “Taking features out of superposition with sparse autoencoders”
Robert_AIZI · Jun 16, 2023, 1:59 PM · 10 points · 0 comments · 8 min read · (aizi.substack.com)

[Question] Question for Prediction Market people: where is the money supposed to come from?
Robert_AIZI · Jun 8, 2023, 1:58 PM · 25 points · 26 comments · 1 min read

Is behavioral safety “solved” in non-adversarial conditions?
Robert_AIZI · May 25, 2023, 5:56 PM · 26 points · 8 comments · 2 min read · (aizi.substack.com)

Research Report: Incorrectness Cascades (Corrected)
Robert_AIZI · May 9, 2023, 9:54 PM · 9 points · 0 comments · 9 min read · (aizi.substack.com)

I was Wrong, Simulator Theory is Real
Robert_AIZI · Apr 26, 2023, 5:45 PM · 75 points · 7 comments · 3 min read · (aizi.substack.com)

The Toxoplasma of AGI Doom and Capabilities?
Robert_AIZI · Apr 24, 2023, 6:11 PM · 72 points · 12 comments · 1 min read

Study 1b: This One Weird Trick does NOT cause incorrectness cascades
Robert_AIZI · Apr 20, 2023, 6:10 PM · 5 points · 0 comments · 6 min read · (aizi.substack.com)

Research Report: Incorrectness Cascades
Robert_AIZI · Apr 14, 2023, 12:49 PM · 19 points · 0 comments · 10 min read · (aizi.substack.com)

Pre-registering a study
Robert_AIZI · Apr 7, 2023, 3:46 PM · 10 points · 0 comments · 6 min read · (aizi.substack.com)

Invocations: The Other Capabilities Overhang?
Robert_AIZI · Apr 4, 2023, 1:38 PM · 29 points · 4 comments · 4 min read · (aizi.substack.com)