On revolutionary love in AI safety
Troy Tian
22 Jun 2026 3:48 UTC
2
points
0
comments
4
min read
LW
link
Do AI Biorisk Thresholds Need Intermediate Warning Levels?
Lukas Frei
22 Jun 2026 1:09 UTC
9
points
0
comments
3
min read
LW
link
NLA explanations can be shortened without harming reconstruction
loops
22 Jun 2026 0:57 UTC
22
points
2
comments
3
min read
LW
link
Introducing MonitoringBench
monika_j
21 Jun 2026 18:43 UTC
33
points
0
comments
6
min read
LW
link
How persona training could fail
Simon Lermen
21 Jun 2026 16:38 UTC
12
points
0
comments
4
min read
LW
link
A high-level model of AI bargaining
Anthony DiGiovanni
21 Jun 2026 15:37 UTC
11
points
1
comment
5
min read
LW
link
Policy changes should be rolled out gradually
Yair Halberstadt
21 Jun 2026 11:07 UTC
24
points
2
comments
3
min read
LW
link
A misalignment taxonomy
Alec Harris
21 Jun 2026 10:20 UTC
12
points
2
comments
3
min read
LW
link
The Cookie Monster Explains AI Safety
michaelwaves
21 Jun 2026 0:52 UTC
12
points
2
comments
2
min read
LW
link
How are there 0 studies (maybe 1) on sex-concordant hormone therapy?
Util
20 Jun 2026 22:36 UTC
14
points
0
comments
3
min read
LW
link
Against Planet-Eating Nanoreplicators
SurvivalBias
20 Jun 2026 20:27 UTC
10
points
7
comments
5
min read
LW
link
How transparent is DiffusionGemma (and why it matters)
Josh Engels, Callum McDougall, bilalchughtai, János Kramár, Senthooran Rajamanoharan, Arthur Conmy, Rohin Shah and Neel Nanda
20 Jun 2026 20:05 UTC
71
points
2
comments
4
min read
LW
link
The Invisible Side of AI Governance
Charbel-Raphaël
20 Jun 2026 18:54 UTC
94
points
4
comments
14
min read
LW
link
Would anybody here be interested in a “mistake postmortem” discussion group?
SK2
20 Jun 2026 12:03 UTC
47
points
7
comments
4
min read
LW
link
The LLM shoggoth meme is weirder than you think
HedonicEscalator
19 Jun 2026 23:35 UTC
126
points
8
comments
7
min read
LW
link
(hedonicescalator.substack.com)
How I think developers of frontier AI systems and regulators ought to act in the face of existential AI risk
WilliamKiely
19 Jun 2026 22:22 UTC
12
points
0
comments
12
min read
LW
link
Hyperstition as the Natural Enemy of Rationality
alseph
19 Jun 2026 21:12 UTC
29
points
8
comments
3
min read
LW
link
World-modeling the US vs. Anthropic Standoff on Claude Fable
dschwarz
19 Jun 2026 20:04 UTC
20
points
4
comments
8
min read
LW
link
Thoughts on Likelihood of Existential Risks by Misaligned AIs
Ishan Khire
19 Jun 2026 19:17 UTC
3
points
0
comments
6
min read
LW
link
(ishankhire.substack.com)
Why should AI be moral?
Zach Thornton
19 Jun 2026 19:13 UTC
12
points
3
comments
9
min read
LW
link