
AXRP

Tag. Last edit: Jan 28, 2025, 2:39 AM by Ruby

AXRP, the AI X-risk Research Podcast, is a podcast hosted by Daniel Filan.

See also: Audio, Interviews

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan · Feb 22, 2023, 10:42 PM
24 points
7 comments · 1 min read
(youtu.be)

AXRP Episode 32 - Understanding Agency with Jan Kulveit

DanielFilan · May 30, 2024, 3:50 AM
20 points
0 comments · 53 min read

AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt

DanielFilan · Apr 11, 2024, 9:30 PM
69 points
10 comments · 107 min read

AXRP Episode 28 - Suing Labs for AI Risk with Gabriel Weil

DanielFilan · Apr 17, 2024, 9:42 PM
12 points
0 comments · 65 min read

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan · Apr 25, 2024, 7:10 PM
20 points
1 comment · 63 min read

AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead

DanielFilan · Dec 12, 2024, 5:40 AM
20 points
0 comments · 16 min read

AXRP Episode 30 - AI Security with Jeffrey Ladish

DanielFilan · May 1, 2024, 2:50 AM
25 points
0 comments · 79 min read

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet

DanielFilan · May 7, 2024, 3:50 AM
72 points
4 comments · 71 min read

AXRP Episode 33 - RLHF Problems with Scott Emmons

DanielFilan · Jun 12, 2024, 3:30 AM
34 points
0 comments · 56 min read

AXRP Episode 34 - AI Evaluations with Beth Barnes

DanielFilan · Jul 28, 2024, 3:30 AM
23 points
0 comments · 69 min read

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics

DanielFilan · Sep 29, 2024, 5:50 AM
25 points
0 comments · 55 min read

AXRP Episode 37 - Jaime Sevilla on Forecasting AI

DanielFilan · Oct 4, 2024, 9:00 PM
21 points
3 comments · 56 min read

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems

DanielFilan · Nov 14, 2024, 7:00 AM
14 points
0 comments · 12 min read

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan · Nov 16, 2024, 11:30 PM
12 points
0 comments · 14 min read

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory

DanielFilan · Nov 27, 2024, 6:30 AM
34 points
0 comments · 10 min read

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment

DanielFilan · Dec 1, 2024, 6:00 AM
41 points
0 comments · 67 min read

AXRP Episode 38.8 - David Duvenaud on Sabotage Evaluations and the Post-AGI Future

DanielFilan · Mar 1, 2025, 1:20 AM
13 points
0 comments · 13 min read

AXRP Episode 38.4 - Shakeel Hashim on AI Journalism

DanielFilan · Jan 5, 2025, 12:20 AM
11 points
0 comments · 12 min read

AXRP Episode 38.5 - Adrià Garriga-Alonso on Detecting AI Scheming

DanielFilan · Jan 20, 2025, 12:40 AM
9 points
0 comments · 16 min read

AXRP Episode 38.6 - Joel Lehman on Positive Visions of AI

DanielFilan · Jan 24, 2025, 11:00 PM
10 points
0 comments · 9 min read

AXRP Episode 38.7 - Anthony Aguirre on the Future of Life Institute

DanielFilan · Feb 9, 2025, 1:10 AM
10 points
0 comments · 12 min read

AXRP Episode 20 - ‘Reform’ AI Alignment with Scott Aaronson

DanielFilan · Apr 12, 2023, 9:30 PM
22 points
2 comments · 68 min read

AXRP Episode 26 - AI Governance with Elizabeth Seger

DanielFilan · Nov 26, 2023, 11:00 PM
14 points
0 comments · 66 min read

AXRP Episode 21 - Interpretability for Engineers with Stephen Casper

DanielFilan · May 2, 2023, 12:50 AM
12 points
1 comment · 66 min read

AXRP Episode 22 - Shard Theory with Quintin Pope

DanielFilan · Jun 15, 2023, 7:00 PM
52 points
11 comments · 93 min read

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

DanielFilan · Mar 28, 2025, 6:40 PM
23 points
0 comments · 89 min read

AXRP announcement: Survey, Store Closing, Patreon

DanielFilan · Jun 28, 2023, 11:40 PM
14 points
0 comments · 1 min read

AXRP Episode 23 - Mechanistic Anomaly Detection with Mark Xu

DanielFilan · Jul 27, 2023, 1:50 AM
22 points
0 comments · 72 min read

AXRP Episode 24 - Superalignment with Jan Leike

DanielFilan · Jul 27, 2023, 4:00 AM
55 points
3 comments · 69 min read

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld

DanielFilan · Oct 3, 2023, 9:50 PM
43 points
0 comments · 92 min read

AXRP Episode 13 - First Principles of AGI Safety with Richard Ngo

DanielFilan · Mar 31, 2022, 5:20 AM
25 points
1 comment · 48 min read

AXRP Episode 14 - Infra-Bayesian Physicalism with Vanessa Kosoy

DanielFilan · Apr 5, 2022, 11:10 PM
25 points
10 comments · 52 min read

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan · May 23, 2022, 5:40 AM
34 points
1 comment · 58 min read

AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving

DanielFilan · Jul 1, 2022, 10:20 PM
20 points
0 comments · 37 min read

AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler

DanielFilan · Aug 21, 2022, 11:50 PM
16 points
0 comments · 35 min read

AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong

DanielFilan · Sep 3, 2022, 11:12 PM
12 points
1 comment · 39 min read

AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda

DanielFilan · Feb 4, 2023, 3:00 AM
45 points
0 comments · 117 min read

AXRP: Store, Patreon, Video

DanielFilan · Feb 7, 2023, 4:50 AM
12 points
0 comments · 1 min read

AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization

DanielFilan · Aug 24, 2024, 10:30 PM
21 points
0 comments · 74 min read

AXRP Episode 11 - Attainable Utility and Power with Alex Turner

DanielFilan · Sep 25, 2021, 9:10 PM
19 points
5 comments · 53 min read

AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell

DanielFilan · Jun 8, 2021, 11:20 PM
22 points
1 comment · 72 min read

AXRP Episode 2 - Learning Human Biases with Rohin Shah

DanielFilan · Dec 29, 2020, 8:43 PM
13 points
0 comments · 35 min read

AXRP Episode 1 - Adversarial Policies with Adam Gleave

DanielFilan · Dec 29, 2020, 8:41 PM
12 points
5 comments · 34 min read

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan · Dec 23, 2020, 8:00 PM
54 points
5 comments · 1 min read
(danielfilan.com)

“Infra-Bayesianism with Vanessa Kosoy” – Watch/Discuss Party

Ben Pace · Mar 22, 2021, 11:44 PM
27 points
45 comments · 1 min read

AXRP Episode 12 - AI Existential Risk with Paul Christiano

DanielFilan · Dec 2, 2021, 2:20 AM
38 points
0 comments · 126 min read

AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan · Jun 24, 2021, 10:10 PM
59 points
2 comments · 59 min read

AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger

DanielFilan · Feb 18, 2021, 12:03 AM
43 points
10 comments · 87 min read

AXRP Episode 10 - AI’s Future and Impacts with Katja Grace

DanielFilan · Jul 23, 2021, 10:10 PM
34 points
2 comments · 77 min read

AXRP Episode 7 - Side Effects with Victoria Krakovna

DanielFilan · May 14, 2021, 3:50 AM
34 points
6 comments · 43 min read

AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch

DanielFilan · Dec 29, 2020, 8:45 PM
27 points
0 comments · 28 min read

AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy

DanielFilan · Mar 10, 2021, 4:30 AM
35 points
12 comments · 36 min read

AXRP Episode 7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra

DanielFilan · May 28, 2021, 12:20 AM
24 points
1 comment · 67 min read

AXRP Episode 6 - Debate and Imitative Generalization with Beth Barnes

DanielFilan · Apr 8, 2021, 9:20 PM
26 points
3 comments · 60 min read