LLMs as amplifiers, not assistants

Caleb Biddulph · 19 Jun 2025 17:21 UTC
9 points
0 comments · 7 min read · LW link

How The Singer Sang His Tales

adamShimi · 19 Jun 2025 17:06 UTC
18 points
0 comments · 36 min read · LW link
(formethods.substack.com)

Key paths, plans and strategies to AI safety success

Adam Jones · 19 Jun 2025 16:56 UTC
5 points
0 comments · 6 min read · LW link
(bluedot.org)

AI safety techniques leveraging distillation

ryan_greenblatt · 19 Jun 2025 14:31 UTC
53 points
0 comments · 12 min read · LW link

Political Funding Expertise (Post 6 of 7 on AI Governance)

Mass_Driver · 19 Jun 2025 14:14 UTC
14 points
0 comments · 14 min read · LW link

Documents Are Dead. Long Live the Conversational Proxy.

8harath · 19 Jun 2025 14:01 UTC
−8 points
1 comment · 1 min read · LW link

A deep critique of AI 2027’s bad timeline models

titotal · 19 Jun 2025 13:29 UTC
176 points
2 comments · 39 min read · LW link
(titotal.substack.com)

AI can win a conflict against us

19 Jun 2025 7:20 UTC
4 points
0 comments · 2 min read · LW link

Different goals may bring AI into conflict with us

19 Jun 2025 7:19 UTC
5 points
0 comments · 2 min read · LW link

My Failed AI Safety Research Projects (Q1/Q2 2025)

Adam Newgas · 19 Jun 2025 3:55 UTC
17 points
0 comments · 3 min read · LW link

On May 1, 2033, humanity discovered that AI had been aligned by default.

Yitz · 18 Jun 2025 19:57 UTC
11 points
2 comments · 1 min read · LW link

New Ethics for the AI Age

Matthieu Tehenan · 18 Jun 2025 19:30 UTC
1 point
0 comments · 6 min read · LW link

Factored Cognition Strengthens Monitoring and Thwarts Attacks

18 Jun 2025 18:28 UTC
23 points
0 comments · 25 min read · LW link

Sparsely-connected Cross-layer Transcoders

jacob_drori · 18 Jun 2025 17:13 UTC
41 points
2 comments · 12 min read · LW link

Moral Alignment: An Idea I’m Embarrassed I Didn’t Think of Myself

Gordon Seidoh Worley · 18 Jun 2025 15:42 UTC
14 points
50 comments · 2 min read · LW link

This was meant for you

Logan Kieller · 18 Jun 2025 15:26 UTC
3 points
0 comments · 8 min read · LW link
(agenticconjectures.substack.com)

Children of War: Hidden dangers of an AI arms race

Peter Kuhn · 18 Jun 2025 15:19 UTC
4 points
0 comments · 7 min read · LW link

Fictional Thinking and Real Thinking

johnswentworth · 17 Jun 2025 19:13 UTC
50 points
8 comments · 4 min read · LW link

The Curious Case of the bos_token

larry-dial · 17 Jun 2025 19:00 UTC
11 points
1 comment · 10 min read · LW link

Comparing Sparse Autoencoder Features from Individual and Combined Datasets

Greg B · 17 Jun 2025 18:41 UTC
1 point
0 comments · 9 min read · LW link