Scale Was All We Needed, At First

Gabe M

Feb 14, 2024, 1:49 AM

295 points

34 comments 8 min read LW link

(aiacumen.substack.com)

Raising children on the eve of AI

juliawise

Feb 15, 2024, 9:28 PM

275 points

47 comments 5 min read LW link

“No-one in my org puts money in their pension”

Tobes

Feb 16, 2024, 6:33 PM

271 points

16 comments 9 min read LW link

(seekingtobejolly.substack.com)

Believing In

AnnaSalamon

Feb 8, 2024, 7:06 AM

241 points

51 comments 13 min read LW link

CFAR Takeaways: Andrew Critch

Raemon

Feb 14, 2024, 1:37 AM

217 points

64 comments 5 min read LW link

Brute Force Manufactured Consensus is Hiding the Crime of the Century

Roko

Feb 3, 2024, 8:36 PM

209 points

156 comments 9 min read LW link

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

garrison

Feb 10, 2024, 7:52 PM

198 points

52 comments LW link

(garrisonlovely.substack.com)

Contra Ngo et al. “Every ‘Every Bay Area House Party’ Bay Area House Party”

Ricki Heicklen

Feb 22, 2024, 11:56 PM

186 points

5 comments 4 min read LW link

(bayesshammai.substack.com)

Every “Every Bay Area House Party” Bay Area House Party

Richard_Ngo

Feb 16, 2024, 6:53 PM

181 points

6 comments 4 min read LW link

Timaeus’s First Four Months

Jesse Hoogland, Daniel Murfet, Stan van Wingerden and Alexander Gietelink Oldenziel

Feb 28, 2024, 5:01 PM

173 points

6 comments 6 min read LW link

And All the Shoggoths Merely Players

Zack_M_Davis

Feb 10, 2024, 7:56 PM

170 points

57 comments 12 min read LW link

Masterpiece

Richard_Ngo

Feb 13, 2024, 11:10 PM

166 points

21 comments 4 min read LW link

(www.narrativeark.xyz)

2023 Survey Results

Screwtape

Feb 16, 2024, 10:24 PM

150 points

26 comments 44 min read LW link

Updatelessness doesn’t solve most problems

Martín Soto

Feb 8, 2024, 5:30 PM

135 points

45 comments 12 min read LW link

Things I’ve Grieved

Raemon

Feb 18, 2024, 7:32 PM

125 points

6 comments 2 min read LW link

The Pareto Best and the Curse of Doom

Screwtape

Feb 21, 2024, 11:10 PM

120 points

21 comments 9 min read LW link

Rationality Research Report: Towards 10x OODA Looping?

Raemon

Feb 24, 2024, 9:06 PM

117 points

25 comments 15 min read LW link

Attitudes about Applied Rationality

Camille Berger

Feb 3, 2024, 2:42 PM

108 points

18 comments 4 min read LW link

Skills I’d like my collaborators to have

Raemon

Feb 9, 2024, 8:20 AM

106 points

9 comments 8 min read LW link

A Chess-GPT Linear Emergent World Representation

Adam Karvonen

Feb 8, 2024, 4:25 AM

105 points

14 comments 7 min read LW link

(adamkarvonen.github.io)

New LessWrong review winner UI (“The LeastWrong” section and full-art post pages)

kave

Feb 28, 2024, 2:42 AM

105 points

64 comments 1 min read LW link

Dreams of AI alignment: The danger of suggestive names

TurnTrout

Feb 10, 2024, 1:22 AM

103 points

59 comments 4 min read LW link

Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small

Joseph Bloom

Feb 2, 2024, 6:54 AM

103 points

37 comments 15 min read LW link

Lsusr’s Rationality Dojo

lsusr

Feb 13, 2024, 5:52 AM

103 points

17 comments 2 min read LW link

Counting arguments provide no evidence for AI doom

Nora Belrose and Quintin Pope

Feb 27, 2024, 11:03 PM

101 points

188 comments 14 min read LW link

My cover story in Jacobin on AI capitalism and the x-risk debates

garrison

Feb 12, 2024, 11:34 PM

98 points

5 comments LW link

(jacobin.com)

Announcing the London Initiative for Safe AI (LISA)

James Fox, mike_safeAI and Ryan Kidd

Feb 2, 2024, 11:17 PM

98 points

0 comments 9 min read LW link

Things You’re Allowed to Do: University Edition

Saul Munn

Feb 6, 2024, 12:36 AM

97 points

13 comments 5 min read LW link

(www.brasstacks.blog)

OpenAI’s Sora is an agent

Caleb Biddulph

Feb 16, 2024, 7:35 AM

97 points

25 comments 4 min read LW link

Ideological Bayesians

Kevin Dorst

Feb 25, 2024, 2:17 PM

96 points

4 comments 10 min read LW link

(kevindorst.substack.com)

Everything Wrong with Roko’s Claims about an Engineered Pandemic

WitheringWeights

Feb 22, 2024, 3:59 PM

94 points

10 comments 16 min read LW link

How well do truth probes generalise?

mishajw

Feb 24, 2024, 2:12 PM

93 points

11 comments 9 min read LW link

How to train your own “Sleeper Agents”

evhub

Feb 7, 2024, 12:31 AM

92 points

11 comments 2 min read LW link

story-based decision-making

bhauth

Feb 7, 2024, 2:35 AM

90 points

11 comments 4 min read LW link

Debating with More Persuasive LLMs Leads to More Truthful Answers

Akbir Khan, John Hughes, Dan Valentine, Sam Bowman and Ethan Perez

Feb 7, 2024, 9:28 PM

89 points

14 comments 9 min read LW link

(arxiv.org)

More Hyphenation

Arjun Panickssery

Feb 7, 2024, 7:43 PM

88 points

19 comments 1 min read LW link

(arjunpanickssery.substack.com)

Addressing Feature Suppression in SAEs

Benjamin Wright and Lee Sharkey

Feb 16, 2024, 6:32 PM

86 points

4 comments 10 min read LW link

AI #51: Altman’s Ambition

Zvi

Feb 20, 2024, 7:50 PM

83 points

5 comments 38 min read LW link

(thezvi.wordpress.com)

Retirement Accounts and Short Timelines

jefftk

Feb 19, 2024, 6:50 PM

83 points

35 comments 2 min read LW link

(www.jefftk.com)

The Gemini Incident

Zvi

Feb 22, 2024, 9:00 PM

80 points

19 comments 18 min read LW link

(thezvi.wordpress.com)

Wrong answer bias

lemonhope

Feb 1, 2024, 8:05 PM

78 points

23 comments 1 min read LW link

Attention SAEs Scale to GPT-2 Small

Connor Kissane, robertzk, Arthur Conmy and Neel Nanda

Feb 3, 2024, 6:50 AM

78 points

4 comments 8 min read LW link

Analogies between scaling labs and misaligned superintelligent AI

scasper

21 Feb 2024 19:29 UTC

77 points

5 comments 4 min read LW link

My guess at Conjecture’s vision: triggering a narrative bifurcation

Alexandre Variengien

6 Feb 2024 19:10 UTC

75 points

12 comments 16 min read LW link

Implementing activation steering

Annah

5 Feb 2024 17:51 UTC

75 points

8 comments 7 min read LW link

Do sparse autoencoders find “true features”?

Demian Till

22 Feb 2024 18:06 UTC

74 points

33 comments 11 min read LW link

The One and a Half Gemini

Zvi

22 Feb 2024 13:10 UTC

73 points

4 comments 8 min read LW link

(thezvi.wordpress.com)

Preventing model exfiltration with upload limits

ryan_greenblatt

6 Feb 2024 16:29 UTC

71 points

22 comments 14 min read LW link

Survey for alignment researchers!

Cameron Berg, Judd Rosenblatt and AE Studio

2 Feb 2024 20:41 UTC

71 points

11 comments 1 min read LW link

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis

simeon_c

1 Feb 2024 21:30 UTC

69 points

17 comments 1 min read LW link

(www.aria.org.uk)