The 1890 Census as a fun cluster
Fernand0 | 14 Jun 2026 15:41 UTC | 4 points | 2 comments | 1 min read

The Hidden Structures of Problems
spencerg | 14 Jun 2026 13:51 UTC | 41 points | 0 comments | 3 min read | (www.spencergreenberg.com)

Agent Identity Standardisation Efforts
tr5tn | 14 Jun 2026 11:30 UTC | 2 points | 0 comments | 2 min read

Wikipedia’s national flavors—French
Fernand0 | 14 Jun 2026 10:29 UTC | 5 points | 1 comment | 2 min read

Low-temperature bunk
Fernand0 | 14 Jun 2026 7:59 UTC | 7 points | 0 comments | 1 min read

I Bet Abliteration’s Cost Was Sloppy Implementation. I Was Wrong
christian-mc | 14 Jun 2026 6:03 UTC | 2 points | 0 comments | 6 min read

Don’t just aim for Frontier Labs
emile delcourt | 14 Jun 2026 4:41 UTC | 3 points | 0 comments | 28 min read

Anthropic Is Taking AI Welfare Seriously. I’m Not Sure It Knows What It’s Measuring.
Failfinder70 | 13 Jun 2026 20:54 UTC | −1 points | 1 comment | 3 min read

A cheap specialist judge gets used by agents but fails to reduce alignment audit costs
burnssa | 13 Jun 2026 20:38 UTC | 8 points | 0 comments | 8 min read

Not telling is lying
Fernand0 | 13 Jun 2026 18:12 UTC | 12 points | 14 comments | 3 min read

A simple argument for trying less hard
Elias Schmied | 13 Jun 2026 18:12 UTC | 13 points | 1 comment | 3 min read

How might continual learning affect safety and alignment?
Rauno Arike, RohanS, Owen Terry, Achu Menon, Zhijing Jin, Francis Rhys Ward and Seth Herd | 13 Jun 2026 17:34 UTC | 51 points | 2 comments | 16 min read

Presentfulness: Lucidity, Osmosis, and Dissociation
Astrid Callender | 13 Jun 2026 17:21 UTC | 4 points | 2 comments | 5 min read

How to Suffer Less
Gordon Seidoh Worley | 13 Jun 2026 17:10 UTC | 18 points | 1 comment | 6 min read | (www.uncertainupdates.com)

Somewhat Contra Ted Chiang on AI Consciousness
ThomasJ | 13 Jun 2026 16:49 UTC | 5 points | 0 comments | 10 min read

The term “AGI” is almost useless at this point [Linkpost]
Noosphere89 | 13 Jun 2026 16:15 UTC | 31 points | 1 comment | 5 min read | (helentoner.substack.com)

SFT Drives Gemini’s Safety Properties
Josh Engels, Arthur Conmy, bilalchughtai and Neel Nanda | 13 Jun 2026 15:31 UTC | 54 points | 1 comment | 1 min read

AML for AI as a verification mechanism
MarkelKori | 13 Jun 2026 11:59 UTC | 9 points | 2 comments | 2 min read

Pulling hedonic utilitarianism out of ethical emotivism
Bill Jackson | 13 Jun 2026 11:50 UTC | 6 points | 1 comment | 6 min read | (billjackson7.substack.com)

Tequila Sunset at the Hog’s Head (A Scene)
Ben Pace | 13 Jun 2026 6:53 UTC | 20 points | 0 comments | 5 min read