Claude’s malicious compliance and normalization of deviance
Steff
5 Jul 2026 17:18 UTC
17
points
0
comments
8
min read
Probing the loss-band sparsity assumption in Scientist AI
Alejandro Tlaie
5 Jul 2026 16:25 UTC
8
points
0
comments
7
min read
Book Review: The God Test
PeterMcCluskey
5 Jul 2026 16:11 UTC
14
points
0
comments
4
min read
We need 3rd party Training-Run Assessments
Alex Meinke
5 Jul 2026 15:55 UTC
26
points
0
comments
10
min read
Harry Potter and the Rules of Quidditch
Tomás B.
5 Jul 2026 14:32 UTC
57
points
3
comments
3
min read
A Normal Argument for AI Risk
Silent Swift
5 Jul 2026 9:32 UTC
15
points
0
comments
8
min read
(silentswift.substack.com)
The case for the fleshman
dr_s
5 Jul 2026 9:28 UTC
14
points
11
comments
4
min read
Reevaluating AI-2027: timelines, takeoff, alignment and China
StanislavKrym
5 Jul 2026 4:00 UTC
14
points
2
comments
5
min read
Success Per Tokens
michaelwaves
5 Jul 2026 2:25 UTC
8
points
0
comments
3
min read
A case for LLMs as Self-predictors
Ashe Vazquez Nuñez
5 Jul 2026 0:25 UTC
30
points
0
comments
10
min read
Defining interpretation, and establishing a framework for it
Yaroven
4 Jul 2026 16:31 UTC
6
points
0
comments
5
min read
The Lace (short story)
Michael Soareverix
4 Jul 2026 4:43 UTC
22
points
0
comments
4
min read
Approximate Natural Latents Have Exact Prices
Haru
4 Jul 2026 1:57 UTC
18
points
0
comments
6
min read
I think alignment work is more promising than control work
Alec Harris
3 Jul 2026 23:40 UTC
85
points
9
comments
8
min read
On “gendertropes” in dath ilan
Eliezer Yudkowsky
3 Jul 2026 22:20 UTC
59
points
1
comment
3
min read
American AI if the boom is a bubble: the Karp-Zitron scenario
Mitchell_Porter
3 Jul 2026 21:46 UTC
6
points
1
comment
2
min read
(Don’t fear) the strangelet
djbinder
3 Jul 2026 17:39 UTC
101
points
2
comments
22
min read
(defensesindepth.bio)
The Reverse AI Box
James_Miller
3 Jul 2026 16:08 UTC
9
points
1
comment
6
min read
Pragmatic FDT, and predictors as game theory
Stuart_Armstrong
3 Jul 2026 13:22 UTC
32
points
10
comments
11
min read
Scheming Evals Mislead in Both Directions
Chijioke Ugwuanyi, eric-z and TerryJCZhang
3 Jul 2026 11:49 UTC
21
points
0
comments
10
min read