How I got 4.2M YouTube views without making a single video

Closed Limelike Curves · 3 Sep 2024 3:52 UTC
362 points
36 comments · 1 min read

The Great Data Integration Schlep

sarahconstantin · 13 Sep 2024 15:40 UTC
258 points
16 comments · 9 min read
(sarahconstantin.substack.com)

The Best Lay Argument is not a Simple English Yud Essay

J Bostock · 10 Sep 2024 17:34 UTC
247 points
15 comments · 5 min read

Laziness death spirals

PatrickDFarley · 19 Sep 2024 15:58 UTC
242 points
34 comments · 8 min read

the case for CoT unfaithfulness is overstated

nostalgebraist · 29 Sep 2024 22:07 UTC
242 points
40 comments · 11 min read

Explore More: A Bag of Tricks to Keep Your Life on the Rails

Shoshannah Tekofsky · 28 Sep 2024 21:38 UTC
225 points
13 comments · 11 min read
(shoshanigans.substack.com)

“Slow” takeoff is a terrible term for “maybe even faster takeoff, actually”

Raemon · 28 Sep 2024 23:38 UTC
214 points
69 comments · 1 min read

The Sun is big, but superintelligences will not spare Earth a little sunlight

Eliezer Yudkowsky · 23 Sep 2024 3:39 UTC
203 points
141 comments · 13 min read

Pay Risk Evaluators in Cash, Not Equity

Adam Scholl · 7 Sep 2024 2:37 UTC
200 points
19 comments · 1 min read

A basic systems architecture for AI agents that do autonomous research

Buck · 23 Sep 2024 13:58 UTC
187 points
15 comments · 8 min read

Cryonics is free

Mati_Roy · 29 Sep 2024 17:58 UTC
184 points
37 comments · 2 min read

Contra papers claiming superhuman AI forecasting

12 Sep 2024 18:10 UTC
180 points
16 comments · 7 min read

Skills from a year of Purposeful Rationality Practice

Raemon · 18 Sep 2024 2:05 UTC
178 points
18 comments · 7 min read

[Question] Why is o1 so deceptive?

abramdemski · 27 Sep 2024 17:27 UTC
177 points
24 comments · 3 min read

Struggling like a Shadowmoth

Raemon · 24 Sep 2024 0:47 UTC
175 points
38 comments · 7 min read

Did Christopher Hitchens change his mind about waterboarding?

Isaac King · 15 Sep 2024 8:28 UTC
171 points
22 comments · 7 min read

My takes on SB-1047

leogao · 9 Sep 2024 18:38 UTC
151 points
8 comments · 4 min read

OpenAI o1

Zach Stein-Perlman · 12 Sep 2024 17:30 UTC
147 points
41 comments · 1 min read

Stanislav Petrov Quarterly Performance Review

Ricki Heicklen · 26 Sep 2024 21:20 UTC
145 points
3 comments · 5 min read
(bayesshammai.substack.com)

That Alien Message—The Animation

Writer · 7 Sep 2024 14:53 UTC
144 points
9 comments · 8 min read
(youtu.be)

The Checklist: What Succeeding at AI Safety Will Involve

Sam Bowman · 3 Sep 2024 18:18 UTC
142 points
49 comments · 22 min read
(sleepinyourhat.github.io)

Survey: How Do Elite Chinese Students Feel About the Risks of AI?

Nick Corvino · 2 Sep 2024 18:11 UTC
141 points
13 comments · 10 min read

[Completed] The 2024 Petrov Day Scenario

26 Sep 2024 8:08 UTC
136 points
114 comments · 5 min read

My Number 1 Epistemology Book Recommendation: Inventing Temperature

adamShimi · 8 Sep 2024 14:30 UTC
116 points
18 comments · 3 min read
(epistemologicalfascinations.substack.com)

Why I funded PIBBSS

Ryan Kidd · 15 Sep 2024 19:56 UTC
115 points
21 comments · 3 min read

Backdoors as an analogy for deceptive alignment

6 Sep 2024 15:30 UTC
104 points
2 comments · 8 min read
(www.alignment.org)

What happens if you present 500 people with an argument that AI is risky?

4 Sep 2024 16:40 UTC
102 points
7 comments · 3 min read
(blog.aiimpacts.org)

Refactoring cryonics as structural brain preservation

Andy_McKenzie · 11 Sep 2024 18:36 UTC
102 points
14 comments · 3 min read

2024 Petrov Day Retrospective

28 Sep 2024 21:30 UTC
93 points
25 comments · 10 min read

[Question] What are the best arguments for/against AIs being “slightly ‘nice’”?

Raemon · 24 Sep 2024 2:00 UTC
93 points
54 comments · 31 min read

You can, in fact, bamboozle an unaligned AI into sparing your life

David Matolcsi · 29 Sep 2024 16:59 UTC
92 points
171 comments · 27 min read

Executable philosophy as a failed totalizing meta-worldview

jessicata · 4 Sep 2024 22:50 UTC
88 points
40 comments · 4 min read
(unstableontology.com)

[Intuitive self-models] 1. Preliminaries

Steven Byrnes · 19 Sep 2024 13:45 UTC
86 points
20 comments · 15 min read

GPT-o1

Zvi · 16 Sep 2024 13:40 UTC
86 points
34 comments · 46 min read
(thezvi.wordpress.com)

OpenAI o1, Llama 4, and AlphaZero of LLMs

Vladimir_Nesov · 14 Sep 2024 21:27 UTC
83 points
24 comments · 1 min read

AI #83: The Mask Comes Off

Zvi · 26 Sep 2024 12:00 UTC
82 points
19 comments · 36 min read
(thezvi.wordpress.com)

How to prevent collusion when using untrusted models to monitor each other

Buck · 25 Sep 2024 18:58 UTC
81 points
6 comments · 22 min read

Not every accommodation is a Curb Cut Effect: The Handicapped Parking Effect, the Clapper Effect, and more

Michael Cohn · 15 Sep 2024 5:27 UTC
80 points
39 comments · 10 min read
(perplexedguide.net)

The case for a negative alignment tax

18 Sep 2024 18:33 UTC
79 points
20 comments · 7 min read

[Intuitive self-models] 2. Conscious Awareness

Steven Byrnes · 25 Sep 2024 13:29 UTC
79 points
48 comments · 16 min read

Is “superhuman” AI forecasting BS? Some experiments on the “539” bot from the Centre for AI Safety

titotal · 18 Sep 2024 13:07 UTC
78 points
3 comments · 1 min read
(open.substack.com)

My 10-year retrospective on trying SSRIs

Kaj_Sotala · 22 Sep 2024 20:30 UTC
76 points
10 comments · 2 min read
(kajsotala.fi)

The Obliqueness Thesis

jessicata · 19 Sep 2024 0:26 UTC
75 points
16 comments · 17 min read

Excerpts from “A Reader’s Manifesto”

Arjun Panickssery · 6 Sep 2024 22:37 UTC
72 points
1 comment · 13 min read
(arjunpanickssery.substack.com)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream

6 Sep 2024 17:55 UTC
70 points
7 comments · 4 min read

Investigating an insurance-for-AI startup

21 Sep 2024 15:29 UTC
69 points
0 comments · 16 min read
(www.strataoftheworld.com)

[Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

25 Sep 2024 9:31 UTC
69 points
15 comments · 3 min read
(arxiv.org)

Estimating Tail Risk in Neural Networks

Mark Xu · 13 Sep 2024 20:00 UTC
68 points
9 comments · 23 min read
(www.alignment.org)

o1-preview is pretty good at doing ML on an unknown dataset

Håvard Tveit Ihle · 20 Sep 2024 8:39 UTC
67 points
1 comment · 2 min read

Book Review: On the Edge: The Fundamentals

Zvi · 23 Sep 2024 13:40 UTC
64 points
3 comments · 31 min read
(thezvi.wordpress.com)