Joseph Miller

Karma: 1,778

Gradient Routing: Masking Gradients to Localize Computation in Neural Networks

Dec 6, 2024, 10:19 PM
165 points
12 comments · 11 min read · LW link
(arxiv.org)

Transformer Circuit Faithfulness Metrics Are Not Robust

Jul 12, 2024, 3:47 AM
104 points
5 comments · 7 min read · LW link
(arxiv.org)

Joseph Miller’s Shortform

Joseph Miller · May 21, 2024, 8:50 PM
5 points
44 comments · LW link

How To Do Patching Fast

Joseph Miller · May 11, 2024, 8:13 PM
44 points
8 comments · 4 min read · LW link

Why I’m doing PauseAI

Joseph Miller · Apr 30, 2024, 4:21 PM
108 points
16 comments · 4 min read · LW link

Global Pause AI Protest 10/21

Oct 14, 2023, 3:20 AM
5 points
0 comments · 1 min read · LW link

The International PauseAI Protest: Activism under uncertainty

Joseph Miller · Oct 12, 2023, 5:36 PM
32 points
1 comment · LW link

Even Superhuman Go AIs Have Surprising Failure Modes

Jul 20, 2023, 5:31 PM
129 points
22 comments · 10 min read · LW link
(far.ai)

We Found An Neuron in GPT-2

Feb 11, 2023, 6:27 PM
143 points
23 comments · 7 min read · LW link
(clementneo.com)