AI 2027: What Superintelligence Looks Like

Apr 3, 2025, 4:23 PM
635 points
211 comments 41 min read LW link
(ai-2027.com)

Accountability Sinks

Martin Sustrik Apr 22, 2025, 5:00 AM
368 points
53 comments 15 min read LW link
(250bpm.substack.com)

LessWrong has been acquired by EA

habryka Apr 1, 2025, 1:09 PM
343 points
47 comments 1 min read LW link

VDT: a solution to decision theory

L Rudolf L Apr 1, 2025, 9:04 PM
337 points
26 comments 4 min read LW link

Playing in the Creek

Hastings Apr 10, 2025, 5:39 PM
307 points
6 comments 2 min read LW link
(hgreer.com)

Why Have Sentence Lengths Decreased?

Arjun Panickssery Apr 3, 2025, 5:50 PM
266 points
89 comments 4 min read LW link
(arjunpanickssery.substack.com)

Why Should I Assume CCP AGI is Worse Than USG AGI?

Tomás B. Apr 19, 2025, 2:47 PM
242 points
83 comments 1 min read LW link

To Understand History, Keep Former Population Distributions In Mind

Arjun Panickssery Apr 23, 2025, 4:51 AM
224 points
13 comments 2 min read LW link
(arjunpanickssery.substack.com)

Jaan Tallinn’s 2024 Philanthropy Overview

jaan Apr 23, 2025, 11:06 AM
221 points
8 comments 1 min read LW link
(jaan.info)

Thoughts on AI 2027

Max Harms Apr 9, 2025, 9:26 PM
215 points
49 comments 21 min read LW link
(intelligence.org)

Impact, agency, and taste

benkuhn Apr 19, 2025, 9:10 PM
201 points
10 comments 8 min read LW link
(www.benkuhn.net)

Short Timelines Don’t Devalue Long Horizon Research

Vladimir_Nesov Apr 9, 2025, 12:42 AM
165 points
23 comments 1 min read LW link

Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI

Kaj_Sotala Apr 15, 2025, 3:56 PM
163 points
48 comments 18 min read LW link

Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study

Adam Karvonen Apr 14, 2025, 5:38 PM
147 points
42 comments 7 min read LW link
(adamkarvonen.github.io)

Alignment Faking Revisited: Improved Classifiers and Open Source Extensions

Apr 8, 2025, 5:32 PM
145 points
20 comments 12 min read LW link

Training AGI in Secret would be Unsafe and Unethical

Daniel Kokotajlo Apr 18, 2025, 12:27 PM
137 points
15 comments 6 min read LW link

AI-enabled coups: a small group could use AI to seize power

Apr 16, 2025, 4:51 PM
128 points
18 comments 7 min read LW link

AI 2027 is a Bet Against Amdahl’s Law

snewman Apr 21, 2025, 3:09 AM
123 points
54 comments 9 min read LW link

Ctrl-Z: Controlling AI Agents via Resampling

Apr 16, 2025, 4:21 PM
122 points
0 comments 20 min read LW link

Learned pain as a leading cause of chronic pain

SoerenMind Apr 9, 2025, 11:57 AM
122 points
13 comments 9 min read LW link

Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red

Julian Bradshaw Apr 21, 2025, 3:52 AM
118 points
19 comments 14 min read LW link

“The Era of Experience” has an unsolved technical alignment problem

Steven Byrnes Apr 24, 2025, 1:57 PM
114 points
42 comments 23 min read LW link

Three Months In, Evaluating Three Rationalist Cases for Trump

Arjun Panickssery Apr 18, 2025, 8:27 AM
114 points
32 comments 4 min read LW link

Among Us: A Sandbox for Agentic Deception

Apr 5, 2025, 6:24 AM
110 points
7 comments 7 min read LW link

New Cause Area Proposal

CallumMcDougall Apr 1, 2025, 7:12 AM
108 points
4 comments 1 min read LW link

We should try to automate AI safety work asap

Marius Hobbhahn Apr 26, 2025, 4:35 PM
106 points
10 comments 15 min read LW link

AI 2027: Responses

Zvi Apr 8, 2025, 12:50 PM
106 points
3 comments 30 min read LW link
(thezvi.wordpress.com)

How training-gamers might function (and win)

Vivek Hebbar Apr 11, 2025, 9:26 PM
105 points
5 comments 13 min read LW link

Show, not tell: GPT-4o is more opinionated in images than in text

Apr 2, 2025, 8:51 AM
103 points
41 comments 3 min read LW link

The Lizardman and the Black Hat Bobcat

Screwtape Apr 6, 2025, 7:02 PM
96 points
13 comments 9 min read LW link

How to Build a Third Place on Focusmate

Parker Conley Apr 28, 2025, 11:46 PM
92 points
3 comments 5 min read LW link
(parconley.com)

ASI existential risk: Reconsidering Alignment as a Goal

habryka Apr 15, 2025, 7:57 PM
91 points
14 comments 19 min read LW link
(michaelnotebook.com)

How To Believe False Things

Eneasz Apr 2, 2025, 4:28 PM
89 points
10 comments 3 min read LW link

One-shot steering vectors cause emergent misalignment, too

Jacob Dunefsky Apr 14, 2025, 6:40 AM
88 points
6 comments 11 min read LW link

Is Gemini now better than Claude at Pokémon?

Julian Bradshaw Apr 19, 2025, 11:34 PM
88 points
12 comments 5 min read LW link

The Uses of Complacency

sarahconstantin Apr 21, 2025, 6:50 PM
86 points
5 comments 8 min read LW link
(sarahconstantin.substack.com)

o3 Is a Lying Liar

Zvi Apr 23, 2025, 8:00 PM
84 points
19 comments 9 min read LW link
(thezvi.wordpress.com)

Misrepresentation as a Barrier for Interp (Part I)

Apr 29, 2025, 5:07 PM
84 points
9 comments 7 min read LW link

A Slow Guide to Confronting Doom

Ruby Apr 6, 2025, 2:10 AM
83 points
20 comments 14 min read LW link

$500 Bounty Problem: Are (Approximately) Deterministic Natural Latents All You Need?

Apr 21, 2025, 8:19 PM
83 points
12 comments 3 min read LW link

7+ tractable directions in AI control

Apr 28, 2025, 5:12 PM
82 points
1 comment 13 min read LW link

Keltham’s Lectures in Project Lawful

Morpheus Apr 1, 2025, 10:39 AM
81 points
5 comments 2 min read LW link

You will crash your car in front of my house within the next week

Richard Korzekwa Apr 1, 2025, 9:43 PM
80 points
6 comments 1 min read LW link

What Makes an AI Startup “Net Positive” for Safety?

jacquesthibs Apr 18, 2025, 8:33 PM
80 points
23 comments 2 min read LW link

Announcing ILIAD2: ODYSSEY

Apr 3, 2025, 5:01 PM
80 points
1 comment 1 min read LW link

Bandwidth Rules Everything Around Me: Oliver Habryka on OpenPhil and GoodVentures

Elizabeth Apr 29, 2025, 8:40 PM
78 points
15 comments 1 min read LW link
(acesounderglass.com)

Why does LW not put much more focus on AI governance and outreach?

Apr 12, 2025, 2:24 PM
78 points
31 comments 2 min read LW link

New Paper: Infra-Bayesian Decision-Estimation Theory

Apr 10, 2025, 9:17 AM
77 points
4 comments 1 min read LW link
(arxiv.org)

PauseAI and E/Acc Should Switch Sides

WillPetillo Apr 1, 2025, 11:25 PM
76 points
6 comments 2 min read LW link

Reward hacking is becoming more sophisticated and deliberate in frontier LLMs

Kei Apr 24, 2025, 4:03 PM
76 points
6 comments 1 min read LW link