LessWrong’s (first) album: I Have Been A Good Bing

1 Apr 2024 7:33 UTC
564 points
174 comments · 11 min read

OpenAI Email Archives (from Musk v. Altman)

habryka · 16 Nov 2024 6:38 UTC
462 points
57 comments · 32 min read

I would have shit in that alley, too

Declan Molony · 18 Jun 2024 4:41 UTC
431 points
134 comments · 4 min read

Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai · 16 Apr 2024 21:16 UTC
411 points
100 comments · 12 min read

Failures in Kindness

silentbob · 26 Mar 2024 21:30 UTC
401 points
60 comments · 9 min read

Reliable Sources: The Story of David Gerard

TracingWoodgrains · 10 Jul 2024 19:50 UTC
381 points
53 comments · 43 min read

The Best Tacit Knowledge Videos on Every Subject

Parker Conley · 31 Mar 2024 17:14 UTC
375 points
143 comments · 16 min read

There is way too much serendipity

Malmesbury · 19 Jan 2024 19:37 UTC
365 points
56 comments · 7 min read

How I got 4.2M YouTube views without making a single video

Closed Limelike Curves · 3 Sep 2024 3:52 UTC
362 points
36 comments · 1 min read

My hour of memoryless lucidity

Eric Neyman · 4 May 2024 1:40 UTC
358 points
35 comments · 5 min read
(ericneyman.wordpress.com)

Notifications Received in 30 Minutes of Class

tanagrabeast · 26 May 2024 17:02 UTC
353 points
16 comments · 8 min read

Thoughts on seed oil

dynomight · 20 Apr 2024 12:29 UTC
347 points
128 comments · 17 min read
(dynomight.net)

The hostile telepaths problem

Valentine · 27 Oct 2024 15:26 UTC
339 points
74 comments · 15 min read

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Andrew_Critch · 14 Jun 2024 0:16 UTC
338 points
38 comments · 4 min read

[April Fools’ Day] Introducing Open Asteroid Impact

Linch · 1 Apr 2024 8:14 UTC
334 points
29 comments · 1 min read
(openasteroidimpact.org)

MIRI 2024 Communications Strategy

Gretta Duleba · 29 May 2024 19:33 UTC
319 points
202 comments · 7 min read

You don’t know how bad most things are nor precisely how they’re bad.

Solenoid_Entity · 4 Aug 2024 14:12 UTC
313 points
48 comments · 5 min read

Survival without dignity

L Rudolf L · 4 Nov 2024 2:29 UTC
310 points
26 comments · 15 min read
(nosetgauge.substack.com)

I got dysentery so you don’t have to

eukaryote · 22 Oct 2024 4:55 UTC
305 points
4 comments · 17 min read
(eukaryotewritesblog.com)

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

12 Jan 2024 19:51 UTC
305 points
95 comments · 3 min read
(arxiv.org)

Gentleness and the artificial Other

Joe Carlsmith · 2 Jan 2024 18:21 UTC
291 points
33 comments · 11 min read

Non-Disparagement Canaries for OpenAI

30 May 2024 19:20 UTC
287 points
51 comments · 2 min read

Would catching your AIs trying to escape convince AI developers to slow down or undeploy?

Buck · 26 Aug 2024 16:46 UTC
285 points
69 comments · 4 min read

Universal Basic Income and Poverty

Eliezer Yudkowsky · 26 Jul 2024 7:23 UTC
281 points
131 comments · 9 min read

Scale Was All We Needed, At First

Gabe M · 14 Feb 2024 1:49 UTC
279 points
32 comments · 8 min read
(aiacumen.substack.com)

My AI Model Delta Compared To Yudkowsky

johnswentworth · 10 Jun 2024 16:12 UTC
276 points
102 comments · 4 min read

80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly)

Raemon · 3 Jul 2024 20:34 UTC
272 points
71 comments · 1 min read

Express interest in an “FHI of the West”

habryka · 18 Apr 2024 3:32 UTC
268 points
41 comments · 3 min read

Leaving MIRI, Seeking Funding

abramdemski · 8 Aug 2024 18:32 UTC
267 points
19 comments · 2 min read

“No-one in my org puts money in their pension”

Tobes · 16 Feb 2024 18:33 UTC
267 points
16 comments · 9 min read
(seekingtobejolly.substack.com)

Overview of strong human intelligence amplification methods

TsviBT · 8 Oct 2024 8:37 UTC
264 points
141 comments · 10 min read

On green

Joe Carlsmith · 21 Mar 2024 17:38 UTC
263 points
35 comments · 31 min read

Getting 50% (SoTA) on ARC-AGI with GPT-4o

ryan_greenblatt · 17 Jun 2024 18:44 UTC
262 points
49 comments · 13 min read

The Great Data Integration Schlep

sarahconstantin · 13 Sep 2024 15:40 UTC
258 points
16 comments · 9 min read
(sarahconstantin.substack.com)

The case for ensuring that powerful AIs are controlled

24 Jan 2024 16:11 UTC
258 points
66 comments · 28 min read

Raising children on the eve of AI

juliawise · 15 Feb 2024 21:28 UTC
258 points
47 comments · 5 min read

My PhD thesis: Algorithmic Bayesian Epistemology

Eric Neyman · 16 Mar 2024 22:56 UTC
257 points
14 comments · 7 min read
(arxiv.org)

Paul Christiano named as US AI Safety Institute Head of AI Safety

Joel Burget · 16 Apr 2024 16:22 UTC
256 points
58 comments · 1 min read
(www.commerce.gov)

My Clients, The Liars

ymeskhout · 5 Mar 2024 21:06 UTC
247 points
85 comments · 7 min read

The Best Lay Argument is not a Simple English Yud Essay

J Bostock · 10 Sep 2024 17:34 UTC
247 points
15 comments · 5 min read

Ilya Sutskever and Jan Leike resign from OpenAI [updated]

Zach Stein-Perlman · 15 May 2024 0:45 UTC
246 points
95 comments · 2 min read

Principles for the AGI Race

William_S · 30 Aug 2024 14:29 UTC
244 points
13 comments · 18 min read

Truthseeking is the ground in which other principles grow

Elizabeth · 27 May 2024 1:09 UTC
242 points
16 comments · 16 min read

Laziness death spirals

PatrickDFarley · 19 Sep 2024 15:58 UTC
242 points
34 comments · 8 min read

the case for CoT unfaithfulness is overstated

nostalgebraist · 29 Sep 2024 22:07 UTC
242 points
40 comments · 11 min read

AI companies aren’t really using external evaluators

Zach Stein-Perlman · 24 May 2024 16:01 UTC
240 points
15 comments · 4 min read

Believing In

AnnaSalamon · 8 Feb 2024 7:06 UTC
230 points
51 comments · 13 min read

Refusal in LLMs is mediated by a single direction

27 Apr 2024 11:13 UTC
228 points
93 comments · 10 min read

Explore More: A Bag of Tricks to Keep Your Life on the Rails

Shoshannah Tekofsky · 28 Sep 2024 21:38 UTC
225 points
13 comments · 11 min read
(shoshanigans.substack.com)

Introducing AI Lab Watch

Zach Stein-Perlman · 30 Apr 2024 17:00 UTC
222 points
30 comments · 1 min read
(ailabwatch.org)