The hostile telepaths problem

Valentine · 27 Oct 2024 15:26 UTC
339 points
74 comments · 15 min read · LW link

I got dysentery so you don’t have to

eukaryote · 22 Oct 2024 4:55 UTC
305 points
4 comments · 17 min read · LW link
(eukaryotewritesblog.com)

Overview of strong human intelligence amplification methods

TsviBT · 8 Oct 2024 8:37 UTC
264 points
141 comments · 10 min read · LW link

The Hopium Wars: the AGI Entente Delusion

Max Tegmark · 13 Oct 2024 17:00 UTC
199 points
55 comments · 9 min read · LW link

Why I’m not a Bayesian

Richard_Ngo · 6 Oct 2024 15:22 UTC
184 points
92 comments · 10 min read · LW link
(www.mindthefuture.info)

Three Subtle Examples of Data Leakage

abstractapplic · 1 Oct 2024 20:45 UTC
171 points
16 comments · 4 min read · LW link

The Summoned Heroine’s Prediction Markets Keep Providing Financial Services To The Demon King!

abstractapplic · 26 Oct 2024 12:34 UTC
157 points
16 comments · 7 min read · LW link

A Rocket–Interpretability Analogy

plex · 21 Oct 2024 13:55 UTC
149 points
31 comments · 1 min read · LW link

Arithmetic is an underrated world-modeling technology

dynomight · 17 Oct 2024 14:00 UTC
147 points
32 comments · 6 min read · LW link
(dynomight.net)

Overcoming Bias Anthology

Arjun Panickssery · 20 Oct 2024 2:01 UTC
146 points
13 comments · 2 min read · LW link
(overcoming-bias-anthology.com)

My motivation and theory of change for working in AI healthtech

Andrew_Critch · 12 Oct 2024 0:36 UTC
145 points
36 comments · 13 min read · LW link

Momentum of Light in Glass

Ben · 9 Oct 2024 20:19 UTC
141 points
44 comments · 11 min read · LW link

Circuits in Superposition: Compressing many small neural networks into one

14 Oct 2024 13:06 UTC
126 points
8 comments · 13 min read · LW link

BIG-Bench Canary Contamination in GPT-4

Jozdien · 22 Oct 2024 15:40 UTC
123 points
13 comments · 4 min read · LW link

Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

garrison · 23 Oct 2024 23:40 UTC
118 points
1 comment · 7 min read · LW link
(garrisonlovely.substack.com)

A bird’s eye view of ARC’s research

Jacob_Hilton · 23 Oct 2024 15:50 UTC
116 points
12 comments · 7 min read · LW link
(www.alignment.org)

I turned decision theory problems into memes about trolleys

Tapatakt · 30 Oct 2024 20:13 UTC
103 points
20 comments · 1 min read · LW link

LLMs can learn about themselves by introspection

18 Oct 2024 16:12 UTC
102 points
38 comments · 9 min read · LW link

Advice for journalists

Nathan Young · 7 Oct 2024 16:46 UTC
100 points
53 comments · 9 min read · LW link
(nathanpmyoung.substack.com)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming

Buck · 10 Oct 2024 13:36 UTC
100 points
4 comments · 13 min read · LW link

The case for unlearning that removes information from LLM weights

Fabien Roger · 14 Oct 2024 14:08 UTC
96 points
14 comments · 6 min read · LW link

Sabotage Evaluations for Frontier Models

18 Oct 2024 22:33 UTC
93 points
55 comments · 6 min read · LW link
(assets.anthropic.com)

Finishing The SB-1047 Documentary In 6 Weeks

Michaël Trazzi · 28 Oct 2024 20:17 UTC
93 points
5 comments · 4 min read · LW link
(manifund.org)

Information vs Assurance

johnswentworth · 20 Oct 2024 23:16 UTC
93 points
7 comments · 2 min read · LW link

Catastrophic sabotage as a major threat model for human-level AI systems

evhub · 22 Oct 2024 20:57 UTC
90 points
8 comments · 15 min read · LW link

Three Notions of “Power”

johnswentworth · 30 Oct 2024 6:10 UTC
88 points
43 comments · 4 min read · LW link

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators

Eric Neyman · 7 Oct 2024 19:29 UTC
87 points
2 comments · 22 min read · LW link

Self-Help Corner: Loop Detection

adamShimi · 2 Oct 2024 8:33 UTC
87 points
6 comments · 2 min read · LW link
(formethods.substack.com)

There is a globe in your LLM

jacob_drori · 8 Oct 2024 0:43 UTC
86 points
4 comments · 1 min read · LW link

5 homegrown EA projects, seeking small donors

Austin Chen · 28 Oct 2024 23:24 UTC
85 points
4 comments · 1 min read · LW link

Self-prediction acts as an emergent regularizer

23 Oct 2024 22:27 UTC
84 points
4 comments · 4 min read · LW link

Newsom Vetoes SB 1047

Zvi · 1 Oct 2024 12:20 UTC
84 points
6 comments · 32 min read · LW link
(thezvi.wordpress.com)

Values Are Real Like Harry Potter

9 Oct 2024 23:42 UTC
81 points
17 comments · 5 min read · LW link

Scaffolding for “Noticing Metacognition”

Raemon · 9 Oct 2024 17:54 UTC
78 points
4 comments · 17 min read · LW link

Bitter lessons about lucid dreaming

avturchin · 16 Oct 2024 21:27 UTC
76 points
62 comments · 2 min read · LW link

What is malevolence? On the nature, measurement, and distribution of dark traits

23 Oct 2024 8:41 UTC
76 points
15 comments · 1 min read · LW link

Gwern: Why So Few Matt Levines?

kave · 29 Oct 2024 1:07 UTC
76 points
10 comments · 1 min read · LW link
(gwern.net)

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Elizabeth · 22 Oct 2024 18:20 UTC
75 points
78 comments · 1 min read · LW link
(acesounderglass.com)

Rationality Quotes—Fall 2024

Screwtape · 10 Oct 2024 18:37 UTC
75 points
22 comments · 1 min read · LW link

Could randomly choosing people to serve as representatives lead to better government?

John Huang · 21 Oct 2024 17:10 UTC
75 points
13 comments · 10 min read · LW link

Video lectures on the learning-theoretic agenda

Vanessa Kosoy · 27 Oct 2024 12:01 UTC
75 points
0 comments · 1 min read · LW link
(www.youtube.com)

Introducing Transluce — A Letter from the Founders

jsteinhardt · 23 Oct 2024 18:10 UTC
74 points
2 comments · 3 min read · LW link
(bounded-regret.ghost.io)

[Question] Interest in Leetcode, but for Rationality?

Gregory · 16 Oct 2024 17:54 UTC
73 points
20 comments · 2 min read · LW link

A Narrow Path: a plan to deal with AI extinction risk

7 Oct 2024 13:02 UTC
73 points
11 comments · 2 min read · LW link
(www.narrowpath.co)

Joshua Achiam Public Statement Analysis

Zvi · 10 Oct 2024 12:50 UTC
73 points
14 comments · 21 min read · LW link
(thezvi.wordpress.com)

The Mask Comes Off: At What Price?

Zvi · 21 Oct 2024 23:50 UTC
71 points
16 comments · 8 min read · LW link
(thezvi.wordpress.com)

Automation collapse

21 Oct 2024 14:50 UTC
70 points
10 comments · 7 min read · LW link

If far-UV is so great, why isn’t it everywhere?

Austin Chen · 19 Oct 2024 18:56 UTC
70 points
23 comments · 1 min read · LW link
(strainhardening.substack.com)

EIS XIV: Is mechanistic interpretability about to be practically useful?

scasper · 11 Oct 2024 22:13 UTC
67 points
4 comments · 7 min read · LW link

On Shifgrethor

JustisMills · 27 Oct 2024 15:30 UTC
66 points
18 comments · 2 min read · LW link
(justismills.substack.com)