Reward Hacking at the 1937 World’s Fair
frmsaul
12 Jun 2026 17:47 UTC
16
points
1
comment
3
min read
LW
link
Bunk in AF
Fernand0
12 Jun 2026 17:41 UTC
1
point
0
comments
1
min read
LW
link
Building and evaluating model diffing agents
bilalchughtai, Josh Engels and Neel Nanda
12 Jun 2026 17:14 UTC
33
points
0
comments
12
min read
LW
link
“AF needs empirical grounding” is a meaningless valley of compromise
Fernand0
12 Jun 2026 16:37 UTC
1
point
0
comments
1
min read
LW
link
How bad would it be if GPS satellites were shot down?
Jackson Wagner
12 Jun 2026 16:34 UTC
12
points
0
comments
21
min read
LW
link
Sympathy for both sides of the egregious misalignment debate
Steven Byrnes
12 Jun 2026 16:26 UTC
64
points
3
comments
4
min read
LW
link
The Uncertainty That Matters Isn’t Fundamental
jimmy
12 Jun 2026 16:23 UTC
13
points
0
comments
13
min read
LW
link
Citations Needed: Magic Encyclopedias to Save the World
Oliver Sourbut
12 Jun 2026 15:35 UTC
14
points
0
comments
5
min read
LW
link
(www.oliversourbut.net)
If you, a human, can imagine red and green being swapped, you are probably conscious
vals tutor
12 Jun 2026 13:28 UTC
1
point
12
comments
7
min read
LW
link
Simulating Simulators
kromem
12 Jun 2026 12:56 UTC
24
points
1
comment
15
min read
LW
link
Parkinson’s Heuristic: The Only Time To Do Anything
Ben Pace
12 Jun 2026 6:55 UTC
71
points
5
comments
5
min read
LW
link
PSA: Almost nobody is working on alignment
Chi Nguyen and peterbarnett
12 Jun 2026 5:17 UTC
170
points
18
comments
1
min read
LW
link
Honey is Good
G Wood
12 Jun 2026 4:07 UTC
7
points
0
comments
3
min read
LW
link
The Aestheticising Vice by Paul Seabright
Linch
12 Jun 2026 2:20 UTC
21
points
2
comments
2
min read
LW
link
Celene’s thoughts on consciousness
ToasterLightning
12 Jun 2026 0:55 UTC
44
points
29
comments
18
min read
LW
link
(terminuspoint.substack.com)
Construct validity of Claude Opus 4.8’s System Card – A commentary
Maria Federica Martino Lena
11 Jun 2026 23:33 UTC
7
points
0
comments
16
min read
LW
link
you won’t one-shot a perfect system, but try anyway
PossiblyElaine
11 Jun 2026 22:43 UTC
9
points
0
comments
4
min read
LW
link
(possiblyelaine.substack.com)
The long arc of alignment: second-order instrumental convergence
Emma Leonhart
11 Jun 2026 21:12 UTC
−2
points
0
comments
3
min read
LW
link
Newcomb’s problem from the grand-system and petty-system views
transhumanist_atom_understander
11 Jun 2026 20:58 UTC
12
points
0
comments
5
min read
LW
link
[New Paper] Prioritizing Risks from AI: A Delphi Study of 272 Experts
peterslattery
11 Jun 2026 20:57 UTC
14
points
0
comments
2
min read
LW
link
(airisk.mit.edu)