DanielFilan
Karma: 8,576
AXRP Episode 38.6 - Joel Lehman on Positive Visions of AI
DanielFilan, 24 Jan 2025 23:00 UTC, 10 points, 0 comments, 9 min read

AXRP Episode 38.5 - Adrià Garriga-Alonso on Detecting AI Scheming
DanielFilan, 20 Jan 2025 0:40 UTC, 9 points, 0 comments, 16 min read

MATS mentor selection
DanielFilan and Ryan Kidd, 10 Jan 2025 3:12 UTC, 41 points, 11 comments, 6 min read

AXRP Episode 38.4 - Shakeel Hashim on AI Journalism
DanielFilan, 5 Jan 2025 0:20 UTC, 9 points, 0 comments, 12 min read

AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead
DanielFilan, 12 Dec 2024 5:40 UTC, 20 points, 0 comments, 16 min read

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan, 1 Dec 2024 6:00 UTC, 41 points, 0 comments, 67 min read

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory
DanielFilan, 27 Nov 2024 6:30 UTC, 34 points, 0 comments, 10 min read

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure
DanielFilan, 16 Nov 2024 23:30 UTC, 12 points, 0 comments, 14 min read

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
DanielFilan, 14 Nov 2024 7:00 UTC, 14 points, 0 comments, 12 min read

MATS AI Safety Strategy Curriculum v2
DanielFilan and Ryan Kidd, 7 Oct 2024 22:44 UTC, 42 points, 6 comments, 13 min read

AXRP Episode 37 - Jaime Sevilla on Forecasting AI
DanielFilan, 4 Oct 2024 21:00 UTC, 21 points, 3 comments, 56 min read

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics
DanielFilan, 29 Sep 2024 5:50 UTC, 25 points, 0 comments, 55 min read

AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan, 24 Aug 2024 22:30 UTC, 21 points, 0 comments, 74 min read

AXRP Episode 34 - AI Evaluations with Beth Barnes
DanielFilan, 28 Jul 2024 3:30 UTC, 23 points, 0 comments, 69 min read

Why keep a diary, and why wish for large language models
DanielFilan, 14 Jun 2024 16:10 UTC, 9 points, 1 comment, 2 min read, link (danielfilan.com)

AXRP Episode 33 - RLHF Problems with Scott Emmons
DanielFilan, 12 Jun 2024 3:30 UTC, 34 points, 0 comments, 56 min read

AXRP Episode 32 - Understanding Agency with Jan Kulveit
DanielFilan, 30 May 2024 3:50 UTC, 20 points, 0 comments, 53 min read

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet
DanielFilan, 7 May 2024 3:50 UTC, 72 points, 4 comments, 71 min read

AXRP Episode 30 - AI Security with Jeffrey Ladish
DanielFilan, 1 May 2024 2:50 UTC, 25 points, 0 comments, 79 min read

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan, 25 Apr 2024 19:10 UTC, 20 points, 1 comment, 63 min read