David Lindner
Karma: 335
On scalable oversight with weak LLMs judging strong LLMs
zac_kenton, Noah Siegel, janos, Jonah Brown-Cohen, Samuel Albanie, David Lindner and Rohin Shah
8 Jul 2024 8:59 UTC
49 points, 18 comments, 7 min read (arxiv.org)
VLM-RM: Specifying Rewards with Natural Language
ChengCheng, David Lindner and Ethan Perez
23 Oct 2023 14:11 UTC
20 points, 2 comments, 5 min read (far.ai)
Practical Pitfalls of Causal Scrubbing
Jérémy Scheurer, Phil3, tony, jacquesthibs and David Lindner
27 Mar 2023 7:47 UTC
87 points, 17 comments, 13 min read
Threat Model Literature Review
zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar and Elliot Catt
1 Nov 2022 11:03 UTC
77 points, 4 comments, 25 min read
Clarifying AI X-risk
zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar and Elliot Catt
1 Nov 2022 11:03 UTC
127 points, 24 comments, 4 min read, 1 review