Launching Lightspeed Grants (Apply by July 6th) · habryka · Jun 7, 2023, 2:53 AM · 211 points · 41 comments · 5 min read · LW link
Actually, Othello-GPT Has A Linear Emergent World Representation · Neel Nanda · Mar 29, 2023, 10:13 PM · 211 points · 26 comments · 19 min read · LW link · (neelnanda.io)
Thoughts on sharing information about language model capabilities · paulfchristiano · Jul 31, 2023, 4:04 PM · 210 points · 44 comments · 11 min read · LW link · 1 review
Labs should be explicit about why they are building AGI · peterbarnett · Oct 17, 2023, 9:09 PM · 210 points · 18 comments · 1 min read · LW link · 1 review
The Lighthaven Campus is open for bookings · habryka · Sep 30, 2023, 1:08 AM · 209 points · 18 comments · 5 min read · LW link · (www.lighthaven.space)
Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds · 1a3orn · Apr 4, 2023, 5:39 PM · 208 points · 38 comments · 5 min read · LW link · 1 review
Evolution provides no evidence for the sharp left turn · Quintin Pope · Apr 11, 2023, 6:43 PM · 206 points · 65 comments · 15 min read · LW link · 1 review
My current LK99 questions · Eliezer Yudkowsky · Aug 1, 2023, 10:48 PM · 206 points · 38 comments · 5 min read · LW link
Feedbackloop-first Rationality · Raemon · Aug 7, 2023, 5:58 PM · 205 points · 69 comments · 8 min read · LW link · 2 reviews
Lightcone Infrastructure/LessWrong is looking for funding · habryka · Jun 14, 2023, 4:45 AM · 205 points · 39 comments · 1 min read · LW link
If interpretability research goes well, it may get dangerous · So8res · Apr 3, 2023, 9:48 PM · 202 points · 11 comments · 2 min read · LW link
We’re Not Ready: thoughts on “pausing” and responsible scaling policies · HoldenKarnofsky · Oct 27, 2023, 3:19 PM · 200 points · 33 comments · 8 min read · LW link
My tentative best guess on how EAs and Rationalists sometimes turn crazy · habryka · Jun 21, 2023, 4:11 AM · 199 points · 110 comments · 8 min read · LW link
GPT-4 Plugs In · Zvi · Mar 27, 2023, 12:10 PM · 198 points · 47 comments · 6 min read · LW link · (thezvi.wordpress.com)
Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense · So8res · Nov 24, 2023, 5:37 PM · 197 points · 84 comments · 5 min read · LW link · 1 review
Thoughts on “AI is easy to control” by Pope & Belrose · Steven Byrnes · Dec 1, 2023, 5:30 PM · 197 points · 63 comments · 14 min read · LW link · 1 review
My “2.9 trauma limit” · Raemon · Jul 1, 2023, 7:32 PM · 196 points · 31 comments · 7 min read · LW link
Comp Sci in 2027 (Short story by Eliezer Yudkowsky) · sudo · Oct 29, 2023, 11:09 PM · 196 points · 24 comments · 10 min read · LW link · 1 review · (nitter.net)
Thinking By The Clock · Screwtape · Nov 8, 2023, 7:40 AM · 196 points · 29 comments · 8 min read · LW link · 1 review
Acausal normalcy · Andrew_Critch · Mar 3, 2023, 11:34 PM · 195 points · 36 comments · 8 min read · LW link · 1 review
Killing Socrates · Duncan Sabien (Deactivated) · Apr 11, 2023, 10:28 AM · 195 points · 146 comments · 8 min read · LW link · 1 review
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model · likenneth · Jun 11, 2023, 5:38 AM · 195 points · 4 comments · 1 min read · LW link · (arxiv.org)
Cognitive Emulation: A Naive AI Safety Proposal · Connor Leahy and Gabriel Alfour · Feb 25, 2023, 7:35 PM · 195 points · 46 comments · 4 min read · LW link
Is being sexy for your homies? · Valentine · Dec 13, 2023, 8:37 PM · 193 points · 100 comments · 14 min read · LW link · 2 reviews
Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk · 1a3orn · Nov 2, 2023, 6:20 PM · 193 points · 79 comments · 23 min read · LW link
AI as a science, and three obstacles to alignment strategies · So8res · Oct 25, 2023, 9:00 PM · 193 points · 80 comments · 11 min read · LW link
AI alignment researchers don’t (seem to) stack · So8res · Feb 21, 2023, 12:48 AM · 193 points · 40 comments · 3 min read · LW link
The ‘ petertodd’ phenomenon · mwatkins · Apr 15, 2023, 12:59 AM · 192 points · 50 comments · 38 min read · LW link · 1 review
Towards Developmental Interpretability · Jesse Hoogland, Alexander Gietelink Oldenziel, Daniel Murfet and Stan van Wingerden · Jul 12, 2023, 7:33 PM · 192 points · 10 comments · 9 min read · LW link · 1 review
Sam Altman fired from OpenAI · LawrenceC · Nov 17, 2023, 8:42 PM · 192 points · 75 comments · 1 min read · LW link · (openai.com)
“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity · Thane Ruthenis · Dec 16, 2023, 8:08 PM · 191 points · 34 comments · 5 min read · LW link
Grant applications and grand narratives · Elizabeth · Jul 2, 2023, 12:16 AM · 191 points · 22 comments · 6 min read · LW link
Twiblings, four-parent babies and other reproductive technology · GeneSmith · May 20, 2023, 5:11 PM · 191 points · 33 comments · 6 min read · LW link
Cryonics and Regret · MvB · Jul 24, 2023, 9:16 AM · 190 points · 35 comments · 2 min read · LW link · 1 review
Evaluating the historical value misspecification argument · Matthew Barnett · Oct 5, 2023, 6:34 PM · 190 points · 162 comments · 7 min read · LW link · 3 reviews
Transcript and Brief Response to Twitter Conversation between Yann LeCunn and Eliezer Yudkowsky · Zvi · Apr 26, 2023, 1:10 PM · 190 points · 51 comments · 10 min read · LW link · (thezvi.wordpress.com)
The King and the Golem · Richard_Ngo · Sep 25, 2023, 7:51 PM · 190 points · 19 comments · 5 min read · LW link · 1 review · (narrativeark.substack.com)
The basic reasons I expect AGI ruin · Rob Bensinger · Apr 18, 2023, 3:37 AM · 189 points · 73 comments · 14 min read · LW link
The other side of the tidal wave · KatjaGrace · Nov 3, 2023, 5:40 AM · 189 points · 86 comments · 1 min read · LW link · (worldspiritsockpuppet.com)
A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX · Bird Concept · Sep 1, 2023, 4:03 AM · 188 points · 26 comments · 24 min read · LW link · 1 review
What a compute-centric framework says about AI takeoff speeds · Tom Davidson · Jan 23, 2023, 4:02 AM · 188 points · 30 comments · 16 min read · LW link · 1 review
Effective Aspersions: How the Nonlinear Investigation Went Wrong · TracingWoodgrains · Dec 19, 2023, 12:00 PM · 188 points · 172 comments · LW link · 2 reviews
Announcing Timaeus · Jesse Hoogland, Daniel Murfet, Alexander Gietelink Oldenziel and Stan van Wingerden · Oct 22, 2023, 11:59 AM · 188 points · 15 comments · 4 min read · LW link
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions · JanB, Owain_Evans and SoerenMind · Sep 28, 2023, 6:53 PM · 187 points · 39 comments · 3 min read · LW link · 1 review
EigenKarma: trust at scale · Henrik Karlsson · Feb 8, 2023, 6:52 PM · 186 points · 52 comments · 5 min read · LW link
Another medical miracle · Dentin · Jun 25, 2023, 8:43 PM · 186 points · 48 comments · 3 min read · LW link
What will GPT-2030 look like? · jsteinhardt · Jun 7, 2023, 11:40 PM · 185 points · 43 comments · 23 min read · LW link · (bounded-regret.ghost.io)
Large Language Models will be Great for Censorship · Ethan Edwards · Aug 21, 2023, 7:03 PM · 185 points · 14 comments · 8 min read · LW link · (ethanedwards.substack.com)
Why Not Just… Build Weak AI Tools For AI Alignment Research? · johnswentworth · Mar 5, 2023, 12:12 AM · 184 points · 18 comments · 6 min read · LW link
OpenAI API base models are not sycophantic, at any size · nostalgebraist · Aug 29, 2023, 12:58 AM · 183 points · 20 comments · 2 min read · LW link · (colab.research.google.com)