Language Models Model Us · eggsyntax · 17 May 2024 21:00 UTC · 81 points · 19 comments · 7 min read · LW link
Instruction-following AGI is easier and more likely than value aligned AGI · Seth Herd · 15 May 2024 19:38 UTC · 35 points · 18 comments · 12 min read · LW link
Ilya Sutskever and Jan Leike resign from OpenAI [updated] · Zach Stein-Perlman · 15 May 2024 0:45 UTC · 230 points · 84 comments · 2 min read · LW link
DeepMind’s “Frontier Safety Framework” is weak and unambitious · Zach Stein-Perlman · 18 May 2024 3:00 UTC · 118 points · 10 comments · 4 min read · LW link
Using GPT-3 for preventing conflict during messaging — a pitch for an app · Eli_ · 17 Mar 2022 11:02 UTC · 22 points · 17 comments · 3 min read · LW link
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems · Joar Skalse · 17 May 2024 19:13 UTC · 47 points · 2 comments · 2 min read · LW link
“If we go extinct due to misaligned AI, at least nature will continue, right? … right?” · plex · 18 May 2024 14:09 UTC · 40 points · 9 comments · 2 min read · LW link (aisafety.info)
Scientific Notation Options · jefftk · 18 May 2024 15:10 UTC · 19 points · 7 comments · 1 min read · LW link (www.jefftk.com)
[Question] In the context of AI interp. What is a feature exactly? · f3mi · 14 May 2024 13:46 UTC · 9 points · 1 comment · 1 min read · LW link
[Question] Is acausal extortion possible? · sisyphus · 11 Nov 2022 19:48 UTC · −20 points · 36 comments · 3 min read · LW link
Do you believe in hundred dollar bills lying on the ground? Consider humming · Elizabeth · 16 May 2024 0:00 UTC · 128 points · 11 comments · 6 min read · LW link (acesounderglass.com)
Einstein’s Arrogance · Eliezer Yudkowsky · 25 Sep 2007 1:29 UTC · 155 points · 90 comments · 3 min read · LW link
Why you should learn a musical instrument · cata · 15 May 2024 20:36 UTC · 48 points · 23 comments · 3 min read · LW link
How much AI inference can we do? · Benjamin_Todd · 14 May 2024 15:10 UTC · 16 points · 6 comments · 5 min read · LW link (benjamintodd.substack.com)
A Dozen Ways to Get More Dakka · Davidmanheim · 8 Apr 2024 4:45 UTC · 97 points · 9 comments · 3 min read · LW link
What Are Non-Zero-Sum Games?—A Primer · James Stephen Brown · 18 May 2024 9:19 UTC · 4 points · 1 comment · 3 min read · LW link
[Crosspost] Introducing the Save State Paradox · Suzie. EXE · 18 May 2024 17:00 UTC · 1 point · 0 comments · 7 min read · LW link
Advice for Activists from the History of Environmentalism · Jeffrey Heninger · 16 May 2024 18:40 UTC · 75 points · 5 comments · 6 min read · LW link (blog.aiimpacts.org)
Davidad’s Bold Plan for Alignment: An In-Depth Explanation · Charbel-Raphaël and Gabin · 19 Apr 2023 16:09 UTC · 154 points · 32 comments · 21 min read · LW link
What Do We Mean By “Rationality”? · Eliezer Yudkowsky · 16 Mar 2009 22:33 UTC · 333 points · 18 comments · 6 min read · LW link