My hour of memoryless lucidity

Eric Neyman

May 4, 2024, 1:40 AM

372 points

35 comments, 5 min read, LW link

(ericneyman.wordpress.com)

Notifications Received in 30 Minutes of Class

tanagrabeast

May 26, 2024, 5:02 PM

356 points

16 comments, 8 min read, LW link

MIRI 2024 Communications Strategy

Gretta Duleba

May 29, 2024, 7:33 PM

325 points

216 comments, 7 min read, LW link

Non-Disparagement Canaries for OpenAI

aysja and Adam Scholl

May 30, 2024, 7:20 PM

288 points

51 comments, 2 min read, LW link

Truthseeking is the ground in which other principles grow

Elizabeth

May 27, 2024, 1:09 AM

248 points

16 comments, 16 min read, LW link

Ilya Sutskever and Jan Leike resign from OpenAI [updated]

Zach Stein-Perlman

May 15, 2024, 12:45 AM

246 points

95 comments, 2 min read, LW link

AI companies aren’t really using external evaluators

Zach Stein-Perlman

May 24, 2024, 4:01 PM

242 points

15 comments, 4 min read, LW link

OpenAI: Fallout

Zvi

May 28, 2024, 1:20 PM

204 points

25 comments, 36 min read, LW link

(thezvi.wordpress.com)

Jaan Tallinn’s 2023 Philanthropy Overview

jaan

May 20, 2024, 12:11 PM

203 points

5 comments, 1 min read, LW link

(jaan.info)

Maybe Anthropic’s Long-Term Benefit Trust is powerless

Zach Stein-Perlman

May 27, 2024, 1:00 PM

202 points

21 comments, 2 min read, LW link

What’s Going on With OpenAI’s Messaging?

ozziegooen

May 21, 2024, 2:22 AM

191 points

13 comments, LW link

DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman

May 18, 2024, 3:00 AM

159 points

14 comments, 4 min read, LW link

Deep Honesty

Aletheophile

May 7, 2024, 8:31 PM

159 points

25 comments, 9 min read, LW link

Language Models Model Us

eggsyntax

May 17, 2024, 9:00 PM

158 points

55 comments, 7 min read, LW link

EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024

scasper

May 21, 2024, 8:15 PM

157 points

16 comments, 3 min read, LW link

Dyslucksia

Shoshannah Tekofsky

May 9, 2024, 7:21 PM

154 points

45 comments, 6 min read, LW link

OpenAI: Exodus

Zvi

May 20, 2024, 1:10 PM

153 points

26 comments, 44 min read, LW link

(thezvi.wordpress.com)

Value Claims (In Particular) Are Usually Bullshit

johnswentworth

May 30, 2024, 6:26 AM

144 points

18 comments, 2 min read, LW link

The Pearly Gates

lsusr

May 30, 2024, 4:01 AM

127 points

6 comments, 3 min read, LW link

Awakening

lsusr

May 30, 2024, 7:03 AM

124 points

79 comments, 9 min read, LW link

Do you believe in hundred dollar bills lying on the ground? Consider humming

Elizabeth

May 16, 2024, 12:00 AM

122 points

22 comments, 6 min read, LW link

(acesounderglass.com)

[Question] Which skincare products are evidence-based?

Vanessa Kosoy

May 2, 2024, 3:22 PM

120 points

48 comments, 1 min read, LW link

Talent Needs of Technical AI Safety Teams

yams, Carson Jones, McKennaFitzgerald and Ryan Kidd

May 24, 2024, 12:36 AM

117 points

65 comments, 14 min read, LW link

introduction to cancer vaccines

bhauth

May 5, 2024, 1:06 AM

113 points

19 comments, 5 min read, LW link

(www.bhauth.com)

Key takeaways from our EA and alignment research surveys

Cameron Berg, Judd Rosenblatt, florin_pop and AE Studio

May 3, 2024, 6:10 PM

112 points

10 comments, 21 min read, LW link

Clarifying METR’s Auditing Role

Beth Barnes

May 30, 2024, 6:41 PM

108 points

1 comment, 2 min read, LW link

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

Lucius Bushnaq, jake_mendel, Dan Braun, StefanHex, Nicholas Goldowsky-Dill, Kaarel, Avery, Joern Stoehler, debrevitatevitae, Magdalena Wache and Marius Hobbhahn

May 20, 2024, 5:53 PM

107 points

4 comments, 3 min read, LW link

Response to nostalgebraist: proudly waving my moral-antirealist battle flag

Steven Byrnes

May 29, 2024, 4:48 PM

103 points

29 comments, 11 min read, LW link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger

May 16, 2024, 6:40 PM

100 points

8 comments, 6 min read, LW link

(blog.aiimpacts.org)

Explaining a Math Magic Trick

Robert_AIZI

May 5, 2024, 7:41 PM

99 points

10 comments, 5 min read, LW link

We might be missing some key feature of AI takeoff; it’ll probably seem like “we could’ve seen this coming”

Lukas_Gloor

May 9, 2024, 3:43 PM

98 points

36 comments, 5 min read, LW link

Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant

Olli Järviniemi and evhub

May 6, 2024, 7:07 AM

95 points

13 comments, 1 min read, LW link

(arxiv.org)

[Question] How to get nerds fascinated about mysterious chronic illness research?

riceissa

May 27, 2024, 10:58 PM

95 points

50 comments, 2 min read, LW link

I am the Golden Gate Bridge

Zvi

May 27, 2024, 2:40 PM

95 points

6 comments, 27 min read, LW link

(thezvi.wordpress.com)

Apollo Research 1-year update

Marius Hobbhahn, Lee Sharkey, Lucius Bushnaq, Dan Braun, Mikita Balesni, Jérémy Scheurer, Nicholas Goldowsky-Dill, StefanHex, jake_mendel, AlexMeinke and rusheb

May 29, 2024, 5:44 PM

93 points

0 comments, 7 min read, LW link

“AI Safety for Fleshy Humans” an AI Safety explainer by Nicky Case

habryka

May 3, 2024, 6:10 PM

90 points

11 comments, 4 min read, LW link

(aisafety.dance)

Teaching CS During Take-Off

andrew carle

May 14, 2024, 10:45 PM

90 points

13 comments, 2 min read, LW link

Hardshipification

Jonathan Moregård

May 28, 2024, 8:02 PM

88 points

17 comments, 2 min read, LW link

(honestliving.substack.com)

Review: Conor Moreton’s “Civilization & Cooperation”

Duncan Sabien (Inactive)

May 26, 2024, 7:32 PM

88 points

8 comments, 38 min read, LW link

MATS Winter 2023-24 Retrospective

utilistrutil, LauraVaughan, McKennaFitzgerald, Christian Smith, Juan Gil, Henry Sleight, Matthew Wearden and Ryan Kidd

May 11, 2024, 12:09 AM

86 points

28 comments, 49 min read, LW link

OpenAI: Helen Toner Speaks

Zvi

May 30, 2024, 9:10 PM

86 points

8 comments, 13 min read, LW link

(thezvi.wordpress.com)

Environmentalism in the United States Is Unusually Partisan

Jeffrey Heninger

May 13, 2024, 9:23 PM

85 points

26 comments, 4 min read, LW link

(blog.aiimpacts.org)

AISafety.com – Resources for AI Safety

Søren Elverlin, plex, Bryce Robertson and Melissa Samworth

May 17, 2024, 3:57 PM

83 points

3 comments, 1 min read, LW link

My thesis (Algorithmic Bayesian Epistemology) explained in more depth

Eric Neyman

May 9, 2024, 7:43 PM

82 points

4 comments, 27 min read, LW link

(ericneyman.wordpress.com)

New voluntary commitments (AI Seoul Summit)

Zach Stein-Perlman

May 21, 2024, 11:00 AM

81 points

17 comments, 7 min read, LW link

(www.gov.uk)

Instruction-following AGI is easier and more likely than value aligned AGI

Seth Herd

May 15, 2024, 7:38 PM

80 points

28 comments, 12 min read, LW link

MIRI’s May 2024 Newsletter

Harlan

May 15, 2024, 12:13 AM

79 points

1 comment, 3 min read, LW link

(intelligence.org)

Reward hacking behavior can generalize across tasks

Kei, Isaac Dunn, Henry Sleight, Miles Turpin, evhub, Carson Denison and Ethan Perez

May 28, 2024, 4:33 PM

79 points

5 comments, 21 min read, LW link

LessWrong Community Weekend 2024, open for applications

UnplannedCauliflower and jt

May 1, 2024, 10:18 AM

79 points

2 comments, 7 min read, LW link

ACX Covid Origins Post convinced readers

ErnestScribbler

May 1, 2024, 1:06 PM

77 points

7 comments, 2 min read, LW link