How much do you believe your results?

Eric Neyman · 6 May 2023 20:31 UTC
476 points
17 comments · 15 min read · LW link · 3 reviews
(ericneyman.wordpress.com)

Steering GPT-2-XL by adding an activation vector

13 May 2023 18:42 UTC
437 points
97 comments · 50 min read · LW link

Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Dan H · 30 May 2023 9:05 UTC
372 points
77 comments · 1 min read · LW link
(www.safe.ai)

How to have Polygenically Screened Children

GeneSmith · 7 May 2023 16:01 UTC
354 points
127 comments · 27 min read · LW link

Book Review: How Minds Change

bc4026bd4aaa5b7fe · 25 May 2023 17:55 UTC
310 points
52 comments · 15 min read · LW link

Predictable updating about AI risk

Joe Carlsmith · 8 May 2023 21:53 UTC
289 points
25 comments · 36 min read · LW link · 1 review

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · 24 May 2023 0:02 UTC
268 points
39 comments · 8 min read · LW link

Mental Health and the Alignment Problem: A Compilation of Resources (updated April 2023)

10 May 2023 19:04 UTC
255 points
54 comments · 21 min read · LW link

Announcing Apollo Research

30 May 2023 16:17 UTC
217 points
11 comments · 8 min read · LW link

Twiblings, four-parent babies and other reproductive technology

GeneSmith · 20 May 2023 17:11 UTC
189 points
33 comments · 6 min read · LW link

When is Goodhart catastrophic?

9 May 2023 3:59 UTC
179 points
28 comments · 8 min read · LW link

Decision Theory with the Magic Parts Highlighted

moridinamael · 16 May 2023 17:39 UTC
175 points
24 comments · 5 min read · LW link

Prizes for matrix completion problems

paulfchristiano · 3 May 2023 23:30 UTC
164 points
52 comments · 1 min read · LW link
(www.alignment.org)

Conjecture internal survey: AGI timelines and probability of human extinction from advanced AI

Maris Sala · 22 May 2023 14:31 UTC
155 points
5 comments · 3 min read · LW link
(www.conjecture.dev)

Request: stop advancing AI capabilities

So8res · 26 May 2023 17:42 UTC
153 points
24 comments · 1 min read · LW link

Advice for newly busy people

Severin T. Seehrich · 11 May 2023 16:46 UTC
149 points
3 comments · 5 min read · LW link

Sentience matters

So8res · 29 May 2023 21:25 UTC
143 points
96 comments · 2 min read · LW link

A brief collection of Hinton’s recent comments on AGI risk

Kaj_Sotala · 4 May 2023 23:31 UTC
143 points
9 comments · 11 min read · LW link

Clarifying and predicting AGI

Richard_Ngo · 4 May 2023 15:55 UTC
141 points
44 comments · 4 min read · LW link

Dark Forest Theories

Raemon · 12 May 2023 20:21 UTC
139 points
51 comments · 2 min read · LW link · 1 review

LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem

Steven Byrnes · 8 May 2023 19:35 UTC
137 points
37 comments · 15 min read · LW link

AGI safety career advice

Richard_Ngo · 2 May 2023 7:36 UTC
132 points
24 comments · 13 min read · LW link

Trust develops gradually via making bids and setting boundaries

Richard_Ngo · 19 May 2023 22:16 UTC
131 points
12 comments · 4 min read · LW link

Some background for reasoning about dual-use alignment research

Charlie Steiner · 18 May 2023 14:50 UTC
126 points
21 comments · 9 min read · LW link

Who regulates the regulators? We need to go beyond the review-and-approval paradigm

jasoncrawford · 4 May 2023 22:11 UTC
122 points
29 comments · 13 min read · LW link
(rootsofprogress.org)

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1

9 May 2023 19:41 UTC
119 points
1 comment · 10 min read · LW link

From fear to excitement

Richard_Ngo · 15 May 2023 6:23 UTC
116 points
9 comments · 3 min read · LW link

Investigating Fabrication

LoganStrohl · 18 May 2023 17:46 UTC
112 points
14 comments · 16 min read · LW link

Retrospective: Lessons from the Failed Alignment Startup AISafety.com

Søren Elverlin · 12 May 2023 18:07 UTC
104 points
9 comments · 3 min read · LW link

Open Thread With Experimental Feature: Reactions

jimrandomh · 24 May 2023 16:46 UTC
101 points
189 comments · 3 min read · LW link

A Case for the Least Forgiving Take On Alignment

Thane Ruthenis · 2 May 2023 21:34 UTC
100 points
84 comments · 22 min read · LW link

Geoff Hinton Quits Google

Adam Shai · 1 May 2023 21:03 UTC
98 points
14 comments · 1 min read · LW link

Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes

1 May 2023 16:47 UTC
96 points
10 comments · 30 min read · LW link

Most people should probably feel safe most of the time

Kaj_Sotala · 9 May 2023 9:35 UTC
95 points
28 comments · 10 min read · LW link

Bayesian Networks Aren’t Necessarily Causal

Zack_M_Davis · 14 May 2023 1:42 UTC
95 points
37 comments · 8 min read · LW link

AI Safety in China: Part 2

Lao Mein · 22 May 2023 14:50 UTC
95 points
28 comments · 2 min read · LW link

DeepMind: Model evaluation for extreme risks

Zach Stein-Perlman · 25 May 2023 3:00 UTC
94 points
12 comments · 1 min read · LW link · 1 review
(arxiv.org)

What if they gave an Industrial Revolution and nobody came?

jasoncrawford · 17 May 2023 19:41 UTC
93 points
10 comments · 19 min read · LW link
(rootsofprogress.org)

Yoshua Bengio: How Rogue AIs may Arise

harfe · 23 May 2023 18:28 UTC
92 points
12 comments · 18 min read · LW link
(yoshuabengio.org)

Input Swap Graphs: Discovering the role of neural network components at scale

Alexandre Variengien · 12 May 2023 9:41 UTC
92 points
0 comments · 33 min read · LW link

Judgments often smuggle in implicit standards

Richard_Ngo · 15 May 2023 18:50 UTC
91 points
4 comments · 3 min read · LW link

An artificially structured argument for expecting AGI ruin

Rob Bensinger · 7 May 2023 21:52 UTC
91 points
26 comments · 19 min read · LW link

Coercion is an adaptation to scarcity; trust is an adaptation to abundance

Richard_Ngo · 23 May 2023 18:14 UTC
90 points
11 comments · 4 min read · LW link

An Analogy for Understanding Transformers

CallumMcDougall · 13 May 2023 12:20 UTC
89 points
6 comments · 9 min read · LW link

LessWrong Community Weekend 2023 [Applications now closed]

Henry Prowbell · 1 May 2023 9:08 UTC
89 points
0 comments · 6 min read · LW link

The bullseye framework: My case against AI doom

titotal · 30 May 2023 11:52 UTC
89 points
35 comments · 1 min read · LW link

Conditional Prediction with Zero-Sum Training Solves Self-Fulfilling Prophecies

26 May 2023 17:44 UTC
88 points
13 comments · 24 min read · LW link

Reacts now enabled on 100% of posts, though still just experimenting

Ruby · 28 May 2023 5:36 UTC
88 points
73 comments · 2 min read · LW link

New User’s Guide to LessWrong

Ruby · 17 May 2023 0:55 UTC
88 points
52 comments · 11 min read · LW link

Lessons learned from offering in-office nutritional testing

Elizabeth · 15 May 2023 23:20 UTC
86 points
11 comments · 14 min read · LW link
(acesounderglass.com)