Clarifying the role of the behavioral selection model
Alex Mallen
10 May 2026 19:41 UTC · 13 points · 0 comments · 4 min read
Alignment as Equilibrium Design
Elad Hazan
10 May 2026 18:56 UTC · 3 points · 0 comments · 5 min read
Claude Does Not Actually Taste Bananas: Potassium-Based Synthetic Phenomenology In Language Models
Noah Weinberger
10 May 2026 17:13 UTC · 4 points · 0 comments · 10 min read
(huggingface.co)
The Darwinian Honeymoon—Why I am not as impressed by human progress as I used to be
Elias Schmied
10 May 2026 15:55 UTC · 48 points · 5 comments · 4 min read
Reinforcement learning scaling might incentivise hidden reasoning architectures for AI
Oliver Sourbut
10 May 2026 15:30 UTC · 18 points · 0 comments · 6 min read
(www.oliversourbut.net)
Asymmetry Between Defensive and Acquisitive Instrumental Deception
keith_wynroe
10 May 2026 12:33 UTC · 13 points · 1 comment · 5 min read
Context Modification as a Negative Alignment Tax
Florian_Dietz
10 May 2026 11:32 UTC · 7 points · 0 comments · 4 min read
The AI Industrial Explosion — Part 2: Transition Dynamics
djbinder
10 May 2026 1:02 UTC · 22 points · 0 comments · 12 min read
(defensesindepth.bio)
International Law Cannot Prevent Extinction Either
Sausage Vector Machine
9 May 2026 22:34 UTC · 62 points · 8 comments · 5 min read
Do capabilities generalize across propensities?
Emil Ryd
9 May 2026 21:39 UTC · 12 points · 0 comments · 8 min read
Neural Networks learn Bloom Filters
Alex Gibson
9 May 2026 20:32 UTC · 50 points · 1 comment · 12 min read
Explaining Volition Without Resorting to Free Will
joseph_c
9 May 2026 18:57 UTC · 12 points · 10 comments · 1 min read
Second order thoughts on current AI agents
Michael Flood
9 May 2026 18:40 UTC · 12 points · 0 comments · 2 min read
If digital computers are conscious, they are conscious at the hardware level
cube_flipper
9 May 2026 15:08 UTC · 41 points · 34 comments · 19 min read
(smoothbrains.net)
Does Opus 4.7 Generate Deceptive Denials About Its Own Guardrails?
usize
9 May 2026 4:12 UTC · 10 points · 0 comments · 3 min read
(usize.github.io)
Bad Problems Don’t Stop Being Bad Because Somebody’s Wrong About Fault Analysis
Linch
9 May 2026 1:30 UTC · 131 points · 31 comments · 3 min read
We Should Have Mandatory Media/Communications Training For All Communicators
Darren McKee
8 May 2026 20:29 UTC · 4 points · 6 comments · 3 min read
Chess as a prediction model of the artificial intelligence impact on culture
849
8 May 2026 20:19 UTC · −10 points · 1 comment · 5 min read
(lojkine.art)
The Saturation View: some responses
wdmacaskill
8 May 2026 17:32 UTC · 25 points · 4 comments · 8 min read
Is ProgramBench Impossible?
frmsaul
8 May 2026 17:04 UTC · 78 points · 6 comments · 2 min read