David Udell

Karma: 2,549

Why Can’t We Hypothesize After the Fact?

David Udell · Feb 26, 2025, 10:41 PM
40 points
3 comments · 2 min read

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell · Jul 9, 2024, 10:06 PM
53 points
7 comments · 7 min read

Sparse Coding, for Mechanistic Interpretability and Activation Engineering

David Udell · Sep 23, 2023, 7:16 PM
42 points
7 comments · 34 min read

ActAdd: Steering Language Models without Optimization

Sep 6, 2023, 5:21 PM
105 points
3 comments · 2 min read
(arxiv.org)

Steering GPT-2-XL by adding an activation vector

May 13, 2023, 6:42 PM
437 points
98 comments · 50 min read · 1 review

Understanding and controlling a maze-solving policy network

Mar 11, 2023, 6:59 PM
333 points
28 comments · 23 min read

Beneath My Epistemic Dignity

David Udell · Feb 28, 2023, 4:02 AM
6 points
3 comments · 2 min read

Probability Theory: The Logic of Science, Jaynes

David Udell · Feb 16, 2023, 9:57 PM
29 points
0 comments · 18 min read

Rounding Someone Off

David Udell · Jan 24, 2023, 12:03 AM
25 points
0 comments · 5 min read

Consequentialists: One-Way Pattern Traps

David Udell · Jan 16, 2023, 8:48 PM
59 points
3 comments · 14 min read

Linear Algebra Done Right, Axler

David Udell · Jan 2, 2023, 10:54 PM
57 points
6 comments · 9 min read

Naive Set Theory, Halmos

David Udell · Dec 22, 2022, 2:34 AM
11 points
1 comment · 8 min read

Moorean Statements

David Udell · Oct 22, 2022, 12:50 AM
11 points
11 comments · 1 min read

Dath Ilan’s Views on Stopgap Corrigibility

David Udell · Sep 22, 2022, 4:16 PM
78 points
19 comments · 13 min read
(www.glowfic.com)

Guidelines for Mad Entrepreneurs

David Udell · Sep 16, 2022, 6:33 AM
31 points
0 comments · 11 min read

Framing AI Childhoods

David Udell · Sep 6, 2022, 11:40 PM
37 points
8 comments · 4 min read

The Shard Theory Alignment Scheme

David Udell · Aug 25, 2022, 4:52 AM
47 points
32 comments · 2 min read

“What Mistakes Are You Making Right Now?”

David Udell · Aug 15, 2022, 9:19 PM
13 points
2 comments · 1 min read

Shard Theory: An Overview

David Udell · Aug 11, 2022, 5:44 AM
166 points
34 comments · 10 min read

Team Shard Status Report

David Udell · Aug 9, 2022, 5:33 AM
38 points
8 comments · 3 min read