HoldenKarnofsky
Karma: 7,075
Sabotage Evaluations for Frontier Models
David Duvenaud, Joe Benton, Sam Bowman, evhub, mishajw, Eric Christiansen, HoldenKarnofsky, Ethan Perez and Buck
18 Oct 2024 22:33 UTC · 93 points · 55 comments · 6 min read · LW link (assets.anthropic.com)
Case studies on social-welfare-based standards in various industries
HoldenKarnofsky
20 Jun 2024 13:33 UTC · 42 points · 0 comments · 1 min read · LW link
Good job opportunities for helping with the most important century
HoldenKarnofsky
18 Jan 2024 17:30 UTC · 36 points · 0 comments · 4 min read · LW link (www.cold-takes.com)
We’re Not Ready: thoughts on “pausing” and responsible scaling policies
HoldenKarnofsky
27 Oct 2023 15:19 UTC · 200 points · 33 comments · 8 min read · LW link
3 levels of threat obfuscation
HoldenKarnofsky
2 Aug 2023 14:58 UTC · 69 points · 14 comments · 7 min read · LW link
A Playbook for AI Risk Reduction (focused on misaligned AI)
HoldenKarnofsky
6 Jun 2023 18:05 UTC · 90 points · 41 comments · 14 min read · LW link
Seeking (Paid) Case Studies on Standards
HoldenKarnofsky
26 May 2023 17:58 UTC · 69 points · 9 comments · 11 min read · LW link
Success without dignity: a nearcasting story of avoiding catastrophe by luck
HoldenKarnofsky
14 Mar 2023 19:23 UTC · 76 points · 17 comments · 15 min read · LW link
Discussion with Nate Soares on a key alignment difficulty
HoldenKarnofsky
13 Mar 2023 21:20 UTC · 256 points · 42 comments · 22 min read · LW link
What does Bing Chat tell us about AI risk?
HoldenKarnofsky
28 Feb 2023 17:40 UTC · 80 points · 21 comments · 2 min read · LW link (www.cold-takes.com)
How major governments can help with the most important century
HoldenKarnofsky
24 Feb 2023 18:20 UTC · 29 points · 0 comments · 4 min read · LW link (www.cold-takes.com)
What AI companies can do today to help with the most important century
HoldenKarnofsky
20 Feb 2023 17:00 UTC · 38 points · 3 comments · 9 min read · LW link (www.cold-takes.com)
Jobs that can help with the most important century
HoldenKarnofsky
10 Feb 2023 18:20 UTC · 24 points · 0 comments · 19 min read · LW link (www.cold-takes.com)
Spreading messages to help with the most important century
HoldenKarnofsky
25 Jan 2023 18:20 UTC · 75 points · 4 comments · 18 min read · LW link (www.cold-takes.com)
How we could stumble into AI catastrophe
HoldenKarnofsky
13 Jan 2023 16:20 UTC · 71 points · 18 comments · 18 min read · LW link (www.cold-takes.com)
Transformative AI issues (not just misalignment): an overview
HoldenKarnofsky
5 Jan 2023 20:20 UTC · 34 points · 6 comments · 18 min read · LW link (www.cold-takes.com)
Racing through a minefield: the AI deployment problem
HoldenKarnofsky
22 Dec 2022 16:10 UTC · 38 points · 2 comments · 13 min read · LW link (www.cold-takes.com)
High-level hopes for AI alignment
HoldenKarnofsky
15 Dec 2022 18:00 UTC · 58 points · 3 comments · 19 min read · LW link (www.cold-takes.com)
AI Safety Seems Hard to Measure
HoldenKarnofsky
8 Dec 2022 19:50 UTC · 71 points · 6 comments · 14 min read · LW link (www.cold-takes.com)
Why Would AI “Aim” To Defeat Humanity?
HoldenKarnofsky
29 Nov 2022 19:30 UTC · 69 points · 10 comments · 33 min read · LW link (www.cold-takes.com)