Joseph Miller

Karma: 1,087

Transformer Circuit Faithfulness Metrics Are Not Robust

12 Jul 2024 3:47 UTC
104 points
5 comments · 7 min read · LW link
(arxiv.org)

Joseph Miller’s Shortform

Joseph Miller · 21 May 2024 20:50 UTC
5 points
2 comments · 1 min read · LW link

How To Do Patching Fast

Joseph Miller · 11 May 2024 20:13 UTC
40 points
6 comments · 4 min read · LW link

Why I’m doing PauseAI

Joseph Miller · 30 Apr 2024 16:21 UTC
106 points
16 comments · 4 min read · LW link

Global Pause AI Protest 10/21

14 Oct 2023 3:20 UTC
5 points
0 comments · 1 min read · LW link

The International PauseAI Protest: Activism under uncertainty

Joseph Miller · 12 Oct 2023 17:36 UTC
32 points
1 comment · 1 min read · LW link

Even Superhuman Go AIs Have Surprising Failure Modes

20 Jul 2023 17:31 UTC
129 points
22 comments · 10 min read · LW link
(far.ai)

We Found An Neuron in GPT-2

11 Feb 2023 18:27 UTC
143 points
23 comments · 7 min read · LW link
(clementneo.com)