How Go Players Disempower Themselves to AI

Ashe Vazquez Nuñez
1 May 2026 23:24 UTC
14 points
2 comments
8 min read
LW link

Early-stage empirical work on “spillway motivations”

1 May 2026 21:29 UTC
16 points
0 comments
8 min read
LW link

Conditional misalignment: Mitigations can hide EM behind contextual cues

1 May 2026 20:09 UTC
50 points
1 comment
11 min read
LW link

Ambitious Mech Interp w/ Tensor-transformers on toy languages [Project Proposal]

Logan Riggs
1 May 2026 19:17 UTC
18 points
0 comments
2 min read
LW link

Risk from fitness-seeking AIs: mechanisms and mitigations

Alex Mallen
1 May 2026 17:42 UTC
66 points
0 comments
32 min read
LW link

Your four-dimensional body

PatrickDFarley
1 May 2026 17:22 UTC
7 points
0 comments
3 min read
LW link

Qualia are internal variables but they are taken from different realm

avturchin
1 May 2026 10:43 UTC
9 points
13 comments
2 min read
LW link

Open strategic questions for digital minds

lucius
1 May 2026 9:56 UTC
17 points
1 comment
13 min read
LW link
(outpaced.substack.com)

11 ways to be less deferential

KatjaGrace
1 May 2026 8:00 UTC
19 points
3 comments
2 min read
LW link
(worldspiritsockpuppet.com)

Sanity-checking “Incompressible Knowledge Probes”

1 May 2026 6:52 UTC
37 points
0 comments
16 min read
LW link

Automating Interpretability with Agents

1 May 2026 2:59 UTC
8 points
0 comments
10 min read
LW link

How much should the ideal person cry wolf?

KatjaGrace
30 Apr 2026 23:10 UTC
31 points
6 comments
2 min read
LW link
(worldspiritsockpuppet.com)

Cambridge: the kettle

KatjaGrace
30 Apr 2026 23:10 UTC
15 points
0 comments
4 min read
LW link
(worldspiritsockpuppet.com)

AI unemployment and AI extinction are often the same

KatjaGrace
30 Apr 2026 23:10 UTC
26 points
4 comments
2 min read
LW link
(worldspiritsockpuppet.com)

SFF’s HSEE grant round; human intelligence amplification projects I’d like to see

TsviBT
30 Apr 2026 21:41 UTC
25 points
0 comments
11 min read
LW link

To what extent is Qwen3-32B predicting its persona?

30 Apr 2026 21:09 UTC
64 points
3 comments
10 min read
LW link

Projects that might help accelerate strong reprogenetics

TsviBT
30 Apr 2026 20:55 UTC
10 points
0 comments
12 min read
LW link

Exploring the capabilities spike with METR’s time horizon data: no clear signal

Ben_Snodin
30 Apr 2026 20:54 UTC
15 points
0 comments
5 min read
LW link
(www.bensnodin.com)

Alignment Faking in DeepSeek V4

Amina Keldibek
30 Apr 2026 20:23 UTC
9 points
0 comments
5 min read
LW link

Cyborg evals

30 Apr 2026 17:31 UTC
32 points
2 comments
5 min read
LW link