Robert_AIZI

Karma: 1,388

SAEs you can See: Applying Sparse Autoencoders to Clustering

Robert_AIZI
Oct 28, 2024, 2:48 PM
27 points
0 comments, 10 min read

Comments on Anthropic’s Scaling Monosemanticity

Robert_AIZI
Jun 3, 2024, 12:15 PM
98 points
8 comments, 7 min read

Explaining a Math Magic Trick

Robert_AIZI
May 5, 2024, 7:41 PM
99 points
10 comments, 5 min read

Research Report: Sparse Autoencoders find only 9/180 board state features in OthelloGPT

Robert_AIZI
Mar 5, 2024, 1:55 PM
61 points
24 comments, 10 min read
(aizi.substack.com)

Rating my AI Predictions

Robert_AIZI
Dec 21, 2023, 2:07 PM
22 points
5 comments, 2 min read
(aizi.substack.com)

Comparing Anthropic’s Dictionary Learning to Ours

Robert_AIZI
Oct 7, 2023, 11:30 PM
137 points
8 comments, 4 min read

Sparse Autoencoders Find Highly Interpretable Directions in Language Models

Sep 21, 2023, 3:30 PM
159 points
8 comments, 5 min read

Unsafe AI as Dynamical Systems

Robert_AIZI
Jul 14, 2023, 3:31 PM
11 points
0 comments, 3 min read
(aizi.substack.com)

AIs teams will probably be more superintelligent than individual AIs

Robert_AIZI
Jul 4, 2023, 2:06 PM
3 points
1 comment, 2 min read
(aizi.substack.com)

[Research Update] Sparse Autoencoder features are bimodal

Robert_AIZI
Jun 22, 2023, 1:15 PM
24 points
1 comment, 5 min read
(aizi.substack.com)

Explaining “Taking features out of superposition with sparse autoencoders”

Robert_AIZI
Jun 16, 2023, 1:59 PM
10 points
0 comments, 8 min read
(aizi.substack.com)

[Question] Question for Prediction Market people: where is the money supposed to come from?

Robert_AIZI
Jun 8, 2023, 1:58 PM
25 points
26 comments, 1 min read

Is behavioral safety “solved” in non-adversarial conditions?

Robert_AIZI
May 25, 2023, 5:56 PM
26 points
8 comments, 2 min read
(aizi.substack.com)

Research Report: Incorrectness Cascades (Corrected)

Robert_AIZI
May 9, 2023, 9:54 PM
9 points
0 comments, 9 min read
(aizi.substack.com)

I was Wrong, Simulator Theory is Real

Robert_AIZI
Apr 26, 2023, 5:45 PM
75 points
7 comments, 3 min read
(aizi.substack.com)

The Toxoplasma of AGI Doom and Capabilities?

Robert_AIZI
Apr 24, 2023, 6:11 PM
72 points
12 comments, 1 min read

Study 1b: This One Weird Trick does NOT cause incorrectness cascades

Robert_AIZI
Apr 20, 2023, 6:10 PM
5 points
0 comments, 6 min read
(aizi.substack.com)

Research Report: Incorrectness Cascades

Robert_AIZI
Apr 14, 2023, 12:49 PM
19 points
0 comments, 10 min read
(aizi.substack.com)

Pre-registering a study

Robert_AIZI
Apr 7, 2023, 3:46 PM
10 points
0 comments, 6 min read
(aizi.substack.com)

Invocations: The Other Capabilities Overhang?

Robert_AIZI
Apr 4, 2023, 1:38 PM
29 points
4 comments, 4 min read
(aizi.substack.com)