Land with no aunties

thellimist 26 Apr 2025 1:20 UTC
1 point
0 comments, 1 min read
(kanyilmaz.me)

AI 2027 Thoughts

PeterMcCluskey 26 Apr 2025 0:00 UTC
22 points
0 comments, 6 min read
(bayesianinvestor.com)

Who’s Working On It? AI-Controlled Experiments

sarahconstantin 25 Apr 2025 21:40 UTC
18 points
0 comments, 1 min read
(sarahconstantin.substack.com)

[Linkpost] AI War seems unlikely to prevent AI Doom

thenoviceoof 25 Apr 2025 20:44 UTC
2 points
2 comments, 2 min read
(thenoviceoof.com)

Worries About AI Are Usually Complements Not Substitutes

Zvi 25 Apr 2025 20:00 UTC
29 points
1 comment, 4 min read
(thezvi.wordpress.com)

Why would AI companies use human-level AI to do alignment research?

MichaelDickens 25 Apr 2025 19:12 UTC
22 points
3 comments

Will Programmer Compensation Decouple from Productivity?

Gordon Seidoh Worley 25 Apr 2025 15:32 UTC
14 points
2 comments, 2 min read
(uncertainupdates.substack.com)

A review of “Why Did Environmentalism Become Partisan?”

David Scott Krueger (formerly: capybaralet) 25 Apr 2025 5:12 UTC
15 points
0 comments, 4 min read

LLM Pareto Frontier But Live

winstonBosan 24 Apr 2025 21:22 UTC
7 points
0 comments, 1 min read

Modifying LLM Beliefs with Synthetic Document Finetuning

24 Apr 2025 21:15 UTC
65 points
11 comments, 2 min read
(alignment.anthropic.com)

Severe control over AI agents as a tool for mass-surveillance

Andrey Seryakov 24 Apr 2025 20:27 UTC
1 point
0 comments, 3 min read

Token and Taboo

Guive 24 Apr 2025 20:17 UTC
30 points
6 comments, 4 min read
(guive.substack.com)

Trouble at Miningtown: Prologue

Quinn 24 Apr 2025 19:09 UTC
9 points
0 comments, 4 min read

Training-time schemers vs behavioral schemers

Alex Mallen 24 Apr 2025 19:07 UTC
23 points
0 comments, 6 min read

Reward hacking is becoming more sophisticated and deliberate in frontier LLMs

Kei 24 Apr 2025 16:03 UTC
36 points
4 comments, 1 min read

Finding an Error-Detection Feature in DeepSeek-R1

keith_wynroe 24 Apr 2025 16:03 UTC
13 points
0 comments, 7 min read

Anticipating AI: Keeping Up With What We Build

Alvin Ånestrand 24 Apr 2025 15:23 UTC
2 points
0 comments, 11 min read
(forecastingaifutures.substack.com)

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Matrice Jacobine 24 Apr 2025 14:11 UTC
7 points
3 comments, 1 min read
(limit-of-rlvr.github.io)

Academia as a happy place?

24 Apr 2025 14:03 UTC
9 points
0 comments, 19 min read

“The Era of Experience” has an unsolved technical alignment problem

Steven Byrnes 24 Apr 2025 13:57 UTC
90 points
14 comments, 23 min read