The Waluigi Effect (mega-post)

Cleo Nardo · Mar 3, 2023, 3:22 AM
628 points · 188 comments · 16 min read

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · Mar 21, 2023, 12:06 AM
359 points · 233 comments · 39 min read · 1 review

Shutting Down the Lightcone Offices

Mar 14, 2023, 10:47 PM
338 points · 103 comments · 17 min read · 2 reviews

Understanding and controlling a maze-solving policy network

Mar 11, 2023, 6:59 PM
333 points · 28 comments · 23 min read

The Parable of the King and the Random Process

moridinamael · Mar 1, 2023, 10:18 PM
312 points · 26 comments · 6 min read · 3 reviews

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs · Mar 29, 2023, 11:16 PM
291 points · 297 comments · 3 min read
(time.com)

Discussion with Nate Soares on a key alignment difficulty

HoldenKarnofsky · Mar 13, 2023, 9:20 PM
265 points · 43 comments · 22 min read · 1 review

“Carefully Bootstrapped Alignment” is organizationally hard

Raemon · Mar 17, 2023, 6:00 PM
262 points · 23 comments · 11 min read · 1 review

Deep Deceptiveness

So8res · Mar 21, 2023, 2:51 AM
251 points · 60 comments · 14 min read · 1 review

Natural Abstractions: Key claims, Theorems, and Critiques

Mar 16, 2023, 4:37 PM
241 points · 26 comments · 45 min read · 3 reviews

More information about the dangerous capability evaluations we did with GPT-4 and Claude.

Beth Barnes · Mar 19, 2023, 12:25 AM
233 points · 54 comments · 8 min read
(evals.alignment.org)

An AI risk argument that resonates with NYTimes readers

Julian Bradshaw · Mar 12, 2023, 11:09 PM
212 points · 14 comments · 1 min read

Actually, Othello-GPT Has A Linear Emergent World Representation

Neel Nanda · Mar 29, 2023, 10:13 PM
211 points · 26 comments · 19 min read
(neelnanda.io)

The salt in pasta water fallacy

Thomas Sepulchre · Mar 27, 2023, 2:53 PM
204 points · 46 comments · 3 min read · 2 reviews

GPT-4 Plugs In

Zvi · Mar 27, 2023, 12:10 PM
198 points · 47 comments · 6 min read
(thezvi.wordpress.com)

Acausal normalcy

Andrew_Critch · Mar 3, 2023, 11:34 PM
195 points · 36 comments · 8 min read · 1 review

Why Not Just… Build Weak AI Tools For AI Alignment Research?

johnswentworth · Mar 5, 2023, 12:12 AM
183 points · 18 comments · 6 min read

ChatGPT (and now GPT4) is very easily distracted from its rules

dmcs · Mar 15, 2023, 5:55 PM
180 points · 42 comments · 1 min read

A rough and incomplete review of some of John Wentworth’s research

So8res · Mar 28, 2023, 6:52 PM
175 points · 18 comments · 18 min read

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds · Mar 9, 2023, 4:55 PM
172 points · 39 comments · 2 min read
(www.anthropic.com)

A stylized dialogue on John Wentworth’s claims about markets and optimization

So8res · Mar 25, 2023, 10:32 PM
169 points · 22 comments · 8 min read

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger · Mar 13, 2023, 7:29 PM
166 points · 17 comments · 11 min read

Towards understanding-based safety evaluations

evhub · Mar 15, 2023, 6:18 PM
164 points · 16 comments · 5 min read

What would a compute monitoring plan look like? [Linkpost]

Orpheus16 · Mar 26, 2023, 7:33 PM
158 points · 10 comments · 4 min read
(arxiv.org)

Inside the mind of a superhuman Go model: How does Leela Zero read ladders?

Haoxing Du · Mar 1, 2023, 1:47 AM
157 points · 8 comments · 30 min read

AI: Practical Advice for the Worried

Zvi · Mar 1, 2023, 12:30 PM
155 points · 49 comments · 16 min read · 2 reviews
(thezvi.wordpress.com)

POC || GTFO culture as partial antidote to alignment wordcelism

lc · Mar 15, 2023, 10:21 AM
155 points · 13 comments · 7 min read · 2 reviews

Why Not Just Outsource Alignment Research To An AI?

johnswentworth · Mar 9, 2023, 9:49 PM
151 points · 50 comments · 9 min read · 1 review

GPT-4

nz · Mar 14, 2023, 5:02 PM
151 points · 150 comments · 1 min read
(openai.com)

Why I’m not into the Free Energy Principle

Steven Byrnes · Mar 2, 2023, 7:27 PM
149 points · 50 comments · 9 min read · 1 review

Comments on OpenAI’s “Planning for AGI and beyond”

So8res · Mar 3, 2023, 11:01 PM
148 points · 2 comments · 14 min read

Dan Luu on “You can only communicate one top priority”

Raemon · Mar 18, 2023, 6:55 PM
148 points · 18 comments · 3 min read
(twitter.com)

Remarks 1–18 on GPT (compressed)

Cleo Nardo · Mar 20, 2023, 10:27 PM
145 points · 35 comments · 31 min read

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger · Mar 9, 2023, 4:30 PM
142 points · 7 comments · 19 min read

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

ArthurB · Mar 9, 2023, 9:26 AM
140 points · 33 comments · 2 min read

Against LLM Reductionism

Erich_Grunewald · Mar 8, 2023, 3:52 PM
140 points · 17 comments · 18 min read
(www.erichgrunewald.com)

Conceding a short timelines bet early

Matthew Barnett · Mar 16, 2023, 9:49 PM
133 points · 17 comments · 1 min read

Good News, Everyone!

jbash · Mar 25, 2023, 1:48 PM
132 points · 23 comments · 2 min read

We have to Upgrade

Jed McCaleb · Mar 23, 2023, 5:53 PM
129 points · 35 comments · 2 min read

[Linkpost] Some high-level thoughts on the DeepMind alignment team’s strategy

Mar 7, 2023, 11:55 AM
128 points · 13 comments · 5 min read
(drive.google.com)

High Status Eschews Quantification of Performance

niplav · Mar 19, 2023, 22:14 UTC
128 points · 36 comments · 5 min read

FLI open letter: Pause giant AI experiments

Zach Stein-Perlman · Mar 29, 2023, 4:04 UTC
126 points · 123 comments · 2 min read
(futureoflife.org)

How bad a future do ML researchers expect?

KatjaGrace · Mar 9, 2023, 4:50 UTC
122 points · 8 comments · 2 min read
(aiimpacts.org)

Manifold: If okay AGI, why?

Eliezer Yudkowsky · Mar 25, 2023, 22:43 UTC
120 points · 37 comments · 1 min read
(manifold.markets)

ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so

Christopher King · Mar 15, 2023, 0:29 UTC
116 points · 22 comments · 2 min read

Parasitic Language Games: maintaining ambiguity to hide conflict while burning the commons

Hazard · Mar 12, 2023, 5:25 UTC
115 points · 17 comments · 13 min read

“Publish or Perish” (a quick note on why you should try to make your work legible to existing academic communities)

David Scott Krueger (formerly: capybaralet) · Mar 18, 2023, 19:01 UTC
112 points · 49 comments · 1 min read · 1 review

GPT can write Quines now (GPT-4)

Andrew_Critch · Mar 14, 2023, 19:18 UTC
112 points · 30 comments · 1 min read

Here, have a calmness video

Kaj_Sotala · Mar 16, 2023, 10:00 UTC
111 points · 15 comments · 2 min read
(www.youtube.com)

“Liquidity” vs “solvency” in bank runs (and some notes on Silicon Valley Bank)

rossry · Mar 12, 2023, 9:16 UTC
108 points · 27 comments · 12 min read