AGI Ruin: A List of Lethalities

Eliezer Yudkowsky 5 Jun 2022 22:05 UTC
908 points
701 comments · 30 min read · LW link · 3 reviews

Where I agree and disagree with Eliezer

paulfchristiano 19 Jun 2022 19:15 UTC
888 points
220 comments · 18 min read · LW link · 2 reviews

What an actually pessimistic containment strategy looks like

lc 5 Apr 2022 0:19 UTC
675 points
138 comments · 6 min read · LW link · 2 reviews

Simulators

janus 2 Sep 2022 12:45 UTC
609 points
162 comments · 41 min read · LW link · 8 reviews
(generative.ink)

Let’s think about slowing down AI

KatjaGrace 22 Dec 2022 17:40 UTC
549 points
182 comments · 38 min read · LW link · 3 reviews
(aiimpacts.org)

The Redaction Machine

Ben 20 Sep 2022 22:03 UTC
500 points
48 comments · 27 min read · LW link · 1 review

Luck based medicine: my resentful story of becoming a medical miracle

Elizabeth 16 Oct 2022 17:40 UTC
483 points
121 comments · 12 min read · LW link · 3 reviews
(acesounderglass.com)

Losing the root for the tree

Adam Zerner 20 Sep 2022 4:53 UTC
474 points
31 comments · 9 min read · LW link · 1 review

Counter-theses on Sleep

Natália 21 Mar 2022 23:21 UTC
444 points
131 comments · 15 min read · LW link · 1 review

It’s Probably Not Lithium

Natália 28 Jun 2022 21:24 UTC
442 points
187 comments · 28 min read · LW link · 1 review

chinchilla’s wild implications

nostalgebraist 31 Jul 2022 1:18 UTC
420 points
128 comments · 10 min read · LW link · 1 review

(My understanding of) What Everyone in Technical Alignment is Doing and Why

29 Aug 2022 1:23 UTC
413 points
90 comments · 37 min read · LW link · 1 review

It Looks Like You’re Trying To Take Over The World

gwern 9 Mar 2022 16:35 UTC
406 points
120 comments · 1 min read · LW link · 1 review
(www.gwern.net)

DeepMind alignment team opinions on AGI ruin arguments

Vika 12 Aug 2022 21:06 UTC
395 points
37 comments · 14 min read · LW link · 1 review

Reflections on six months of fatherhood

jasoncrawford 31 Jan 2022 5:28 UTC
387 points
24 comments · 4 min read · LW link · 1 review
(jasoncrawford.org)

Reward is not the optimization target

TurnTrout 25 Jul 2022 0:03 UTC
376 points
123 comments · 10 min read · LW link · 3 reviews

Lies Told To Children

Eliezer Yudkowsky 14 Apr 2022 11:25 UTC
375 points
94 comments · 7 min read · LW link · 1 review

You Are Not Measuring What You Think You Are Measuring

johnswentworth 20 Sep 2022 20:04 UTC
374 points
44 comments · 8 min read · LW link · 2 reviews

A Mechanistic Interpretability Analysis of Grokking

15 Aug 2022 2:41 UTC
373 points
47 comments · 36 min read · LW link · 1 review
(colab.research.google.com)

Counterarguments to the basic AI x-risk case

KatjaGrace 14 Oct 2022 13:00 UTC
370 points
124 comments · 34 min read · LW link · 1 review
(aiimpacts.org)

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra 18 Jul 2022 19:06 UTC
365 points
94 comments · 75 min read · LW link · 1 review

Accounting For College Costs

johnswentworth 1 Apr 2022 17:28 UTC
363 points
41 comments · 7 min read · LW link

Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment

elspood 21 Jun 2022 23:55 UTC
361 points
42 comments · 7 min read · LW link · 1 review

What DALL-E 2 can and cannot do

Swimmer963 (Miranda Dixon-Luinenburg) 1 May 2022 23:51 UTC
353 points
303 comments · 9 min read · LW link

Staring into the abyss as a core life skill

benkuhn 22 Dec 2022 15:30 UTC
341 points
21 comments · 12 min read · LW link · 1 review
(www.benkuhn.net)

MIRI announces new “Death With Dignity” strategy

Eliezer Yudkowsky 2 Apr 2022 0:43 UTC
339 points
545 comments · 18 min read · LW link · 1 review

What should you change in response to an “emergency”? And AI risk

AnnaSalamon 18 Jul 2022 1:11 UTC
336 points
60 comments · 6 min read · LW link · 1 review

Why I think strong general AI is coming soon

porby 28 Sep 2022 5:40 UTC
335 points
141 comments · 34 min read · LW link · 1 review

Looking back on my alignment PhD

TurnTrout 1 Jul 2022 3:19 UTC
331 points
64 comments · 11 min read · LW link

Beware boasting about non-existent forecasting track records

Jotto999 20 May 2022 19:20 UTC
331 points
112 comments · 5 min read · LW link · 1 review

Optimality is the tiger, and agents are its teeth

Veedrac 2 Apr 2022 0:46 UTC
318 points
42 comments · 16 min read · LW link · 1 review

Models Don’t “Get Reward”

Sam Ringer 30 Dec 2022 10:37 UTC
312 points
61 comments · 5 min read · LW link · 1 review

Six Dimensions of Operational Adequacy in AGI Projects

Eliezer Yudkowsky 30 May 2022 17:00 UTC
309 points
66 comments · 13 min read · LW link · 1 review

Epistemic Legibility

Elizabeth 9 Feb 2022 18:10 UTC
306 points
30 comments · 20 min read · LW link · 1 review
(acesounderglass.com)

On how various plans miss the hard bits of the alignment challenge

So8res 12 Jul 2022 2:49 UTC
305 points
88 comments · 29 min read · LW link · 3 reviews

Why Agent Foundations? An Overly Abstract Explanation

johnswentworth 25 Mar 2022 23:17 UTC
301 points
56 comments · 8 min read · LW link · 1 review

A challenge for AGI organizations, and a challenge for readers

1 Dec 2022 23:11 UTC
301 points
33 comments · 2 min read · LW link

Two-year update on my personal AI timelines

Ajeya Cotra 2 Aug 2022 23:07 UTC
293 points
60 comments · 16 min read · LW link

Mysteries of mode collapse

janus 8 Nov 2022 10:37 UTC
283 points
57 comments · 14 min read · LW link · 1 review

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res 15 Jun 2022 13:10 UTC
282 points
54 comments · 10 min read · LW link · 1 review

We Choose To Align AI

johnswentworth 1 Jan 2022 20:06 UTC
280 points
16 comments · 3 min read · LW link · 1 review

Don’t die with dignity; instead play to your outs

Jeffrey Ladish 6 Apr 2022 7:53 UTC
279 points
60 comments · 5 min read · LW link

What Are You Tracking In Your Head?

johnswentworth 28 Jun 2022 19:30 UTC
279 points
83 comments · 4 min read · LW link · 1 review

Is AI Progress Impossible To Predict?

alyssavance 15 May 2022 18:30 UTC
277 points
39 comments · 2 min read · LW link

Sazen

Duncan Sabien (Deactivated) 21 Dec 2022 7:54 UTC
276 points
83 comments · 12 min read · LW link · 2 reviews

Toni Kurz and the Insanity of Climbing Mountains

GeneSmith 3 Jul 2022 20:51 UTC
270 points
67 comments · 11 min read · LW link · 2 reviews

Humans are very reliable agents

alyssavance 16 Jun 2022 22:02 UTC
266 points
35 comments · 3 min read · LW link

12 interesting things I learned studying the discovery of nature’s laws

Ben Pace 19 Feb 2022 23:39 UTC
265 points
40 comments · 9 min read · LW link · 1 review

Changing the world through slack & hobbies

Steven Byrnes 21 Jul 2022 18:11 UTC
260 points
13 comments · 10 min read · LW link

Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality”

AnnaSalamon 9 Jun 2022 2:12 UTC
260 points
63 comments · 17 min read · LW link · 1 review