David Scott Krueger (formerly: capybaralet)

Karma: 1,841

I’m more active on Twitter than LW/AF these days: https://twitter.com/DavidSKrueger

Bio from https://www.davidscottkrueger.com/:
I am an Assistant Professor at the University of Cambridge and a member of Cambridge’s Computational and Biological Learning lab (CBL). My research group focuses on Deep Learning, AI Alignment, and AI safety. I’m broadly interested in work (including in areas outside of Machine Learning, e.g. AI governance) that could reduce the risk of human extinction (“x-risk”) resulting from out-of-control AI systems. Particular interests include:

Testing for consequence-blindness in LLMs using the HI-ADS unit test.

David Scott Krueger (formerly: capybaralet) · 24 Nov 2023 23:35 UTC
25 points
2 comments · 2 min read · LW link

“Publish or Perish” (a quick note on why you should try to make your work legible to existing academic communities)

David Scott Krueger (formerly: capybaralet) · 18 Mar 2023 19:01 UTC
98 points
48 comments · 1 min read · LW link

[Question] What organizations other than Conjecture have (esp. public) info-hazard policies?

David Scott Krueger (formerly: capybaralet) · 16 Mar 2023 14:49 UTC
20 points
1 comment · 1 min read · LW link

A (EtA: quick) note on terminology: AI Alignment != AI x-safety

David Scott Krueger (formerly: capybaralet) · 8 Feb 2023 22:33 UTC
46 points
20 comments · 1 min read · LW link

Why I hate the “accident vs. misuse” AI x-risk dichotomy (quick thoughts on “structural risk”)

David Scott Krueger (formerly: capybaralet) · 30 Jan 2023 18:50 UTC
32 points
41 comments · 2 min read · LW link

Quick thoughts on “scalable oversight” / “super-human feedback” research

David Scott Krueger (formerly: capybaralet) · 25 Jan 2023 12:55 UTC
26 points
9 comments · 2 min read · LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”)

David Scott Krueger (formerly: capybaralet) · 3 Nov 2022 23:19 UTC
28 points
3 comments · 1 min read · LW link

“Cars and Elephants”: a handwavy argument/analogy against mechanistic interpretability

David Scott Krueger (formerly: capybaralet) · 31 Oct 2022 21:26 UTC
48 points
25 comments · 2 min read · LW link

[Question] I’m planning to start creating more write-ups summarizing my thoughts on various issues, mostly related to AI existential safety. What do you want to hear my nuanced takes on?

David Scott Krueger (formerly: capybaralet) · 24 Sep 2022 12:38 UTC
9 points
10 comments · 1 min read · LW link

[An email with a bunch of links I sent an experienced ML researcher interested in learning about Alignment / x-safety.]

David Scott Krueger (formerly: capybaralet) · 8 Sep 2022 22:28 UTC
47 points
1 comment · 5 min read · LW link

An Update on Academia vs. Industry (one year into my faculty job)

David Scott Krueger (formerly: capybaralet) · 3 Sep 2022 20:43 UTC
121 points
18 comments · 4 min read · LW link

Causal confusion as an argument against the scaling hypothesis

20 Jun 2022 10:54 UTC
86 points
30 comments · 18 min read · LW link

[Question] Do FDT (or similar) recommend reparations?

David Scott Krueger (formerly: capybaralet) · 29 Apr 2022 17:34 UTC
13 points
3 comments · 1 min read · LW link

[Question] What’s a good probability distribution family (e.g. “log-normal”) to use for AGI timelines?

David Scott Krueger (formerly: capybaralet) · 13 Apr 2022 4:45 UTC
9 points
11 comments · 1 min read · LW link

[Question] Is “gears-level” just a synonym for “mechanistic”?

David Scott Krueger (formerly: capybaralet) · 13 Dec 2021 4:11 UTC
48 points
29 comments · 1 min read · LW link

[Question] Is there a name for the theory that “There will be fast takeoff in real-world capabilities because almost everything is AGI-complete”?

David Scott Krueger (formerly: capybaralet) · 2 Sep 2021 23:00 UTC
31 points
8 comments · 1 min read · LW link

[Question] What do we know about how much protection COVID vaccines provide against transmitting the virus to others?

David Scott Krueger (formerly: capybaralet) · 6 May 2021 7:39 UTC
9 points
0 comments · 1 min read · LW link

[Question] What do we know about how much protection COVID vaccines provide against long COVID?

David Scott Krueger (formerly: capybaralet) · 6 May 2021 7:39 UTC
8 points
0 comments · 1 min read · LW link

[Question] What do the reported levels of protection offered by various vaccines mean?

David Scott Krueger (formerly: capybaralet) · 4 May 2021 22:06 UTC
10 points
4 comments · 1 min read · LW link

[Question] Did they use serological testing for COVID vaccine trials?

David Scott Krueger (formerly: capybaralet) · 4 May 2021 21:48 UTC
4 points
0 comments · 1 min read · LW link