AXRP

Tag · Last edit: 30 Dec 2024 10:23 UTC by Dakara

AXRP (the AI X-risk Research Podcast) is a podcast hosted by Daniel Filan.

See also: Audio, Interviews

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan · 22 Feb 2023 22:42 UTC
24 points
7 comments · 1 min read · LW link
(youtu.be)

AXRP Episode 32 - Understanding Agency with Jan Kulveit

DanielFilan · 30 May 2024 3:50 UTC
20 points
0 comments · 53 min read · LW link

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt

DanielFilan · 11 Apr 2024 21:30 UTC
69 points
10 comments · 107 min read · LW link

AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil

DanielFilan · 17 Apr 2024 21:42 UTC
12 points
0 comments · 65 min read · LW link

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan · 25 Apr 2024 19:10 UTC
20 points
1 comment · 63 min read · LW link

AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead

DanielFilan · 12 Dec 2024 5:40 UTC
20 points
0 comments · 16 min read · LW link

AXRP Episode 30 - AI Security with Jeffrey Ladish

DanielFilan · 1 May 2024 2:50 UTC
25 points
0 comments · 79 min read · LW link

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet

DanielFilan · 7 May 2024 3:50 UTC
72 points
4 comments · 71 min read · LW link

AXRP Episode 33 - RLHF Problems with Scott Emmons

DanielFilan · 12 Jun 2024 3:30 UTC
34 points
0 comments · 56 min read · LW link

AXRP Episode 34 - AI Evaluations with Beth Barnes

DanielFilan · 28 Jul 2024 3:30 UTC
23 points
0 comments · 69 min read · LW link

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics

DanielFilan · 29 Sep 2024 5:50 UTC
25 points
0 comments · 55 min read · LW link

AXRP Episode 37 - Jaime Sevilla on Forecasting AI

DanielFilan · 4 Oct 2024 21:00 UTC
21 points
3 comments · 56 min read · LW link

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems

DanielFilan · 14 Nov 2024 7:00 UTC
14 points
0 comments · 12 min read · LW link

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan · 16 Nov 2024 23:30 UTC
12 points
0 comments · 14 min read · LW link

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory

DanielFilan · 27 Nov 2024 6:30 UTC
34 points
0 comments · 10 min read · LW link

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment

DanielFilan · 1 Dec 2024 6:00 UTC
41 points
0 comments · 67 min read · LW link

AXRP Episode 20 - ‘Reform’ AI Alignment with Scott Aaronson

DanielFilan · 12 Apr 2023 21:30 UTC
22 points
2 comments · 68 min read · LW link

AXRP Episode 26 - AI Governance with Elizabeth Seger

DanielFilan · 26 Nov 2023 23:00 UTC
14 points
0 comments · 66 min read · LW link

AXRP Episode 21 - Interpretability for Engineers with Stephen Casper

DanielFilan · 2 May 2023 0:50 UTC
12 points
1 comment · 66 min read · LW link

AXRP Episode 22 - Shard Theory with Quintin Pope

DanielFilan · 15 Jun 2023 19:00 UTC
52 points
11 comments · 93 min read · LW link

AXRP announcement: Survey, Store Closing, Patreon

DanielFilan · 28 Jun 2023 23:40 UTC
14 points
0 comments · 1 min read · LW link

AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu

DanielFilan · 27 Jul 2023 1:50 UTC
22 points
0 comments · 72 min read · LW link

AXRP Episode 24 - Superalignment with Jan Leike

DanielFilan · 27 Jul 2023 4:00 UTC
55 points
3 comments · 69 min read · LW link

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld

DanielFilan · 3 Oct 2023 21:50 UTC
43 points
0 comments · 92 min read · LW link

AXRP Episode 13 - First Principles of AGI Safety with Richard Ngo

DanielFilan · 31 Mar 2022 5:20 UTC
24 points
1 comment · 48 min read · LW link

AXRP Episode 14 - Infra-Bayesian Physicalism with Vanessa Kosoy

DanielFilan · 5 Apr 2022 23:10 UTC
25 points
10 comments · 52 min read · LW link

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan · 23 May 2022 5:40 UTC
34 points
1 comment · 58 min read · LW link

AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving

DanielFilan · 1 Jul 2022 22:20 UTC
20 points
0 comments · 37 min read · LW link

AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler

DanielFilan · 21 Aug 2022 23:50 UTC
16 points
0 comments · 35 min read · LW link

AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong

DanielFilan · 3 Sep 2022 23:12 UTC
12 points
1 comment · 39 min read · LW link

AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda

DanielFilan · 4 Feb 2023 3:00 UTC
45 points
0 comments · 117 min read · LW link

AXRP: Store, Patreon, Video

DanielFilan · 7 Feb 2023 4:50 UTC
12 points
0 comments · 1 min read · LW link

AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization

DanielFilan · 24 Aug 2024 22:30 UTC
21 points
0 comments · 74 min read · LW link

AXRP Episode 11 - Attainable Utility and Power with Alex Turner

DanielFilan · 25 Sep 2021 21:10 UTC
19 points
5 comments · 53 min read · LW link

AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell

DanielFilan · 8 Jun 2021 23:20 UTC
22 points
1 comment · 72 min read · LW link

AXRP Episode 2 - Learning Human Biases with Rohin Shah

DanielFilan · 29 Dec 2020 20:43 UTC
13 points
0 comments · 35 min read · LW link

AXRP Episode 1 - Adversarial Policies with Adam Gleave

DanielFilan · 29 Dec 2020 20:41 UTC
12 points
5 comments · 34 min read · LW link

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan · 23 Dec 2020 20:00 UTC
54 points
5 comments · 1 min read · LW link
(danielfilan.com)

“Infra-Bayesianism with Vanessa Kosoy” – Watch/Discuss Party

Ben Pace · 22 Mar 2021 23:44 UTC
27 points
45 comments · 1 min read · LW link

AXRP Episode 12 - AI Existential Risk with Paul Christiano

DanielFilan · 2 Dec 2021 2:20 UTC
38 points
0 comments · 126 min read · LW link

AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan · 24 Jun 2021 22:10 UTC
59 points
2 comments · 59 min read · LW link

AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger

DanielFilan · 18 Feb 2021 0:03 UTC
43 points
10 comments · 87 min read · LW link

AXRP Episode 10 - AI’s Future and Impacts with Katja Grace

DanielFilan · 23 Jul 2021 22:10 UTC
34 points
2 comments · 77 min read · LW link

AXRP Episode 7 - Side Effects with Victoria Krakovna

DanielFilan · 14 May 2021 3:50 UTC
34 points
6 comments · 43 min read · LW link

AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch

DanielFilan · 29 Dec 2020 20:45 UTC
27 points
0 comments · 28 min read · LW link

AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy

DanielFilan · 10 Mar 2021 4:30 UTC
35 points
12 comments · 36 min read · LW link

AXRP Episode 7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra

DanielFilan · 28 May 2021 0:20 UTC
24 points
1 comment · 67 min read · LW link

AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes

DanielFilan · 8 Apr 2021 21:20 UTC
26 points
3 comments · 60 min read · LW link