AI 2027: What Superintelligence Looks Like

Apr 3, 2025, 4:23 PM
602 points
136 comments · 41 min read · LW link
(ai-2027.com)

How to Make Superbabies

Feb 19, 2025, 8:39 PM
578 points
331 comments · 31 min read · LW link

How AI Takeover Might Happen in 2 Years

joshc · Feb 7, 2025, 5:10 PM
404 points
137 comments · 29 min read · LW link
(x.com)

A Bear Case: My Predictions Regarding AI Progress

Thane Ruthenis · Mar 5, 2025, 4:41 PM
352 points
155 comments · 9 min read · LW link

What’s the short timeline plan?

Marius Hobbhahn · Jan 2, 2025, 2:59 PM
351 points
49 comments · 23 min read · LW link

LessWrong has been acquired by EA

habryka · Apr 1, 2025, 1:09 PM
344 points
45 comments · 1 min read · LW link

The Case Against AI Control Research

johnswentworth · Jan 21, 2025, 4:03 PM
341 points
80 comments · 6 min read · LW link

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Feb 25, 2025, 5:39 PM
328 points
91 comments · 4 min read · LW link

Will Jesus Christ return in an election year?

Eric Neyman · Mar 24, 2025, 4:50 PM
326 points
45 comments · 4 min read · LW link
(ericneyman.wordpress.com)

Policy for LLM Writing on LessWrong

jimrandomh · Mar 24, 2025, 9:41 PM
322 points
65 comments · 2 min read · LW link

VDT: a solution to decision theory

L Rudolf L · Apr 1, 2025, 9:04 PM
314 points
18 comments · 4 min read · LW link

Recent AI model progress feels mostly like bullshit

lc · Mar 24, 2025, 7:28 PM
311 points
79 comments · 8 min read · LW link
(zeropath.com)

Murder plots are infohazards

Chris Monteiro · Feb 13, 2025, 7:15 PM
299 points
44 comments · 2 min read · LW link

Playing in the Creek

Hastings · Apr 10, 2025, 5:39 PM
291 points
6 comments · 2 min read · LW link
(hgreer.com)

Good Research Takes are Not Sufficient for Good Strategic Takes

Neel Nanda · Mar 22, 2025, 10:13 AM
291 points
28 comments · 4 min read · LW link
(www.neelnanda.io)

So You Want To Make Marginal Progress...

johnswentworth · Feb 7, 2025, 11:22 PM
284 points
42 comments · 4 min read · LW link

Arbital has been imported to LessWrong

Feb 20, 2025, 12:47 AM
279 points
31 comments · 5 min read · LW link

METR: Measuring AI Ability to Complete Long Tasks

Zach Stein-Perlman · Mar 19, 2025, 4:00 PM
241 points
104 comments · 5 min read · LW link
(metr.org)

The Gentle Romance

Richard_Ngo · Jan 19, 2025, 6:29 PM
231 points
46 comments · 15 min read · LW link
(www.asimov.press)

Tracing the Thoughts of a Large Language Model

Adam Jermyn · Mar 27, 2025, 5:20 PM
230 points
22 comments · 10 min read · LW link
(www.anthropic.com)

A History of the Future, 2025-2040

L Rudolf L · Feb 17, 2025, 12:03 PM
228 points
41 comments · 75 min read · LW link
(nosetgauge.substack.com)

Trojan Sky

Richard_Ngo · Mar 11, 2025, 3:14 AM
219 points
39 comments · 12 min read · LW link
(www.narrativeark.xyz)

Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?

garrison · Feb 11, 2025, 12:20 AM
208 points
8 comments · LW link
(garrisonlovely.substack.com)

Eliezer’s Lost Alignment Articles / The Arbital Sequence

Feb 20, 2025, 12:48 AM
207 points
9 comments · 5 min read · LW link

“Sharp Left Turn” discourse: An opinionated review

Steven Byrnes · Jan 28, 2025, 6:47 PM
205 points
26 comments · 31 min read · LW link

Thoughts on AI 2027

Max Harms · Apr 9, 2025, 9:26 PM
202 points
47 comments · 21 min read · LW link
(intelligence.org)

Mechanisms too simple for humans to design

Malmesbury · Jan 22, 2025, 4:54 PM
201 points
45 comments · 15 min read · LW link

Why White-Box Redteaming Makes Me Feel Weird

Zygi Straznickas · Mar 16, 2025, 6:54 PM
198 points
34 comments · 3 min read · LW link

Will alignment-faking Claude accept a deal to reveal its misalignment?

Jan 31, 2025, 4:49 PM
197 points
28 comments · 12 min read · LW link

Why Have Sentence Lengths Decreased?

Arjun Panickssery · Apr 3, 2025, 5:50 PM
197 points
51 comments · 4 min read · LW link

Power Lies Trembling: a three-book review

Richard_Ngo · Feb 22, 2025, 10:57 PM
190 points
15 comments · 15 min read · LW link
(www.mindthefuture.info)

Intention to Treat

Alicorn · Mar 20, 2025, 8:01 PM
189 points
4 comments · 2 min read · LW link

OpenAI: Detecting misbehavior in frontier reasoning models

Daniel Kokotajlo · Mar 11, 2025, 2:17 AM
183 points
25 comments · 4 min read · LW link
(openai.com)

Catastrophe through Chaos

Marius Hobbhahn · Jan 31, 2025, 2:19 PM
182 points
17 comments · 12 min read · LW link

What Is The Alignment Problem?

johnswentworth · Jan 16, 2025, 1:20 AM
178 points
50 comments · 25 min read · LW link

Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals

Jan 24, 2025, 8:20 PM
178 points
61 comments · 5 min read · LW link

Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations

Mar 17, 2025, 7:11 PM
177 points
7 comments · 6 min read · LW link

How will we update about scheming?

ryan_greenblatt · Jan 6, 2025, 8:21 PM
169 points
20 comments · 36 min read · LW link

So how well is Claude playing Pokémon?

Julian Bradshaw · Mar 7, 2025, 5:54 AM
169 points
74 comments · 5 min read · LW link

On the Rationality of Deterring ASI

Dan H · Mar 5, 2025, 4:11 PM
166 points
34 comments · 4 min read · LW link
(nationalsecurity.ai)

Short Timelines Don’t Devalue Long Horizon Research

Vladimir_Nesov · Apr 9, 2025, 12:42 AM
165 points
23 comments · 1 min read · LW link

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Jan 30, 2025, 5:03 PM
160 points
52 comments · 2 min read · LW link
(gradual-disempowerment.ai)

Maximizing Communication, not Traffic

jefftk · Jan 5, 2025, 1:00 PM
158 points
10 comments · 1 min read · LW link
(www.jefftk.com)

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?

shrimpy · Mar 16, 2025, 4:52 PM
157 points
25 comments · 1 min read · LW link

Reducing LLM deception at scale with self-other overlap fine-tuning

Mar 13, 2025, 7:09 PM
155 points
40 comments · 6 min read · LW link

[Question] Have LLMs Generated Novel Insights?

Feb 23, 2025, 6:22 PM
155 points
36 comments · 2 min read · LW link

It’s been ten years. I propose HPMOR Anniversary Parties.

Screwtape · Feb 16, 2025, 1:43 AM
153 points
3 comments · 1 min read · LW link

Statistical Challenges with Making Super IQ babies

Jan Christian Refsgaard · Mar 2, 2025, 8:26 PM
152 points
26 comments · 9 min read · LW link

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Kaj_Sotala · Apr 15, 2025, 3:56 PM
150 points
43 comments · 18 min read · LW link

Self-fulfilling misalignment data might be poisoning our AI models

TurnTrout · Mar 2, 2025, 7:51 PM
150 points
27 comments · 1 min read · LW link
(turntrout.com)