
Existential Risk

Tag. Last edit: 19 Mar 2023 21:27 UTC by Diabloto96

An existential risk (or x-risk) is a risk that poses astronomically large negative consequences for humanity, such as human extinction or permanent global totalitarianism.

Nick Bostrom introduced the term “existential risk” in his 2002 paper “Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards.”[1] In the paper, Bostrom defined an existential risk as:

One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.

The Oxford Future of Humanity Institute (FHI) was founded by Bostrom in 2005 in part to study existential risks. Other institutions with a generalist focus on existential risk include the Centre for the Study of Existential Risk.

FHI’s existential-risk.org FAQ notes regarding the definition of “existential risk”:

An existential risk is one that threatens the entire future of humanity. [...]

“Humanity”, in this context, does not mean “the biological species Homo sapiens”. If we humans were to evolve into another species, or merge or replace ourselves with intelligent machines, this would not necessarily mean that an existential catastrophe had occurred — although it might if the quality of life enjoyed by those new life forms turns out to be far inferior to that enjoyed by humans.

Classification of Existential Risks

Bostrom[2] proposes a series of classifications for existential risks:

The total cost of an existential catastrophe could amount to all potential future lives never being realized. A rough and conservative calculation[3] gives a total of 10^54 potential future human lives, smarter, happier and kinder than we are. Hence, almost no other task would have as much positive impact as existential risk reduction.

Existential risks also present a unique challenge because of their irreversible nature. By definition, we will never experience and survive an extinction event[4], and so cannot learn from our mistakes. They are subject to strong observation selection effects[5]. One cannot estimate their future probability from the past record because, in Bayesian terms, the conditional probability of a past existential catastrophe given our present existence is always 0, no matter how high the true probability of such a catastrophe is. Instead, indirect estimates have to be used, such as evidence of existential catastrophes happening elsewhere. A high probability of extinction could be acting as a Great Filter and explain why there is no evidence of space colonization.
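The selection effect described above can be illustrated with a small simulation. This is only a minimal sketch under toy assumptions: the function name `simulate_observers`, the per-century catastrophe probabilities, and the number of centuries and worlds are all hypothetical and are not taken from the cited papers. It shows that observers who exist at all necessarily look back on a catastrophe-free record, whatever the true risk.

```python
import random


def simulate_observers(p_catastrophe, centuries=10, worlds=10_000, seed=0):
    """Toy model: each century a world ends with probability p_catastrophe.

    Returns the fraction of worlds that still contain observers and the
    catastrophe frequency those observers would infer from their own past.
    """
    rng = random.Random(seed)
    survivors = 0
    catastrophes_on_record = 0
    for _ in range(worlds):
        # One True/False entry per century: did a catastrophe strike?
        history = [rng.random() < p_catastrophe for _ in range(centuries)]
        if not any(history):
            # Only catastrophe-free histories contain anyone to look back.
            survivors += 1
            catastrophes_on_record += sum(history)  # always adds 0
    naive_estimate = (
        catastrophes_on_record / (survivors * centuries) if survivors else None
    )
    return {
        "surviving_fraction": survivors / worlds,
        "naive_estimate": naive_estimate,
    }


if __name__ == "__main__":
    # Assumed toy risks: 1% versus 30% per century.
    for p in (0.01, 0.30):
        print(p, simulate_observers(p))
```

In both runs the surviving observers infer a catastrophe frequency of zero from their own history, even though the assumed true risks differ by a factor of thirty. Only the surviving fraction differs, and that quantity is not visible from inside a single surviving world.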

Another related idea is that of a suffering risk (or s-risk).

History

The focus on existential risks on LessWrong dates back to Bostrom’s 2003 paper “Astronomical Waste: The Opportunity Cost of Delayed Technological Development”, which argues that “the chief goal for utilitarians should be to reduce existential risk”. Bostrom writes:

If what we are concerned with is (something like) maximizing the expected number of worthwhile lives that we will create, then in addition to the opportunity cost of delayed colonization, we have to take into account the risk of failure to colonize at all. We might fall victim to an existential risk, one where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.[8] Because the lifespan of galaxies is measured in billions of years, whereas the time-scale of any delays that we could realistically affect would rather be measured in years or decades, the consideration of risk trumps the consideration of opportunity cost. For example, a single percentage point of reduction of existential risks would be worth (from a utilitarian expected utility point-of-view) a delay of over 10 million years.
Therefore, if our actions have even the slightest effect on the probability of eventual colonization, this will outweigh their effect on when colonization takes place. For standard utilitarians, priority number one, two, three and four should consequently be to reduce existential risk. The utilitarian imperative “Maximize expected aggregate utility!” can be simplified to the maxim “Minimize existential risk!”.
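The trade-off in the quoted passage can be checked with back-of-the-envelope arithmetic. The sketch below is an illustration rather than Bostrom's own calculation: the function name `break_even_delay_years` is hypothetical, value is assumed to accrue uniformly over the available horizon, and the horizon is set to one billion years as an assumed lower end of "billions of years".

```python
def break_even_delay_years(risk_reduction, horizon_years):
    """Delay (in years) whose cost equals the gain from a risk reduction.

    Assumes value accrues uniformly over `horizon_years`, which is a
    simplifying assumption made for this illustration.
    """
    # A delay of d years forfeits d / horizon_years of the total value,
    # so the break-even delay solves d / horizon_years = risk_reduction.
    return risk_reduction * horizon_years


if __name__ == "__main__":
    # One percentage point of risk reduction over an assumed 1e9-year horizon.
    print(break_even_delay_years(0.01, 1e9))
```

Under these assumptions, one percentage point of risk reduction is worth a delay of about 10 million years, and longer horizons push the figure higher, which is consistent with the "over 10 million years" in the quotation.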

The concept is expanded upon in his 2012 paper “Existential Risk Prevention as Global Priority”.

Organizations

References

  1. Nick Bostrom (2002). “Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards”. Journal of Evolution and Technology, Vol. 9, March 2002.

  2. Nick Bostrom (2012). “Existential Risk Reduction as the Most Important Task for Humanity”. Global Policy, forthcoming.

  3. Nick Bostrom, Anders Sandberg & Milan M. Ćirković (2010). “Anthropic Shadow: Observation Selection Effects and Human Extinction Risks”. Risk Analysis, Vol. 30, No. 10: 1495-1506.

  4. Nick Bostrom & Milan M. Ćirković, eds. (2008). Global Catastrophic Risks. Oxford University Press.

  5. Milan M. Ćirković (2008). “Observation Selection Effects and Global Catastrophic Risks”. Global Catastrophic Risks. Oxford University Press.

  6. Eliezer S. Yudkowsky (2008). “Cognitive Biases Potentially Affecting Judgment of Global Risks”. Global Catastrophic Risks. Oxford University Press.

  7. Richard A. Posner (2004). Catastrophe: Risk and Response. Oxford University Press.

Some AI research areas and their relevance to existential safety

Andrew_Critch, 19 Nov 2020 3:18 UTC
204 points
37 comments, 50 min read, 2 reviews

[Question] Forecasting Thread: Existential Risk

Amandango, 22 Sep 2020 3:44 UTC
43 points
39 comments, 2 min read

3. Uploading

RogerDearnaley, 23 Nov 2023 7:39 UTC
21 points
5 comments, 8 min read

SSA rejects anthropic shadow, too

jessicata, 27 Jul 2023 17:25 UTC
74 points
38 comments, 11 min read
(unstableontology.com)

5. Moral Value for Sentient Animals? Alas, Not Yet

RogerDearnaley, 27 Dec 2023 6:42 UTC
33 points
41 comments, 23 min read

The Dumbest Possible Gets There First

Artaxerxes, 13 Aug 2022 10:20 UTC
44 points
7 comments, 2 min read

Anthropically Blind: the anthropic shadow is reflectively inconsistent

Christopher King, 29 Jun 2023 2:36 UTC
41 points
38 comments, 10 min read

In defense of flailing, with foreword by Bill Burr

lc, 17 Jun 2022 16:40 UTC
88 points
6 comments, 4 min read

Developmental Stages of GPTs

orthonormal, 26 Jul 2020 22:03 UTC
140 points
72 comments, 7 min read, 1 review

My current thoughts on the risks from SETI

Matthew Barnett, 15 Mar 2022 17:18 UTC
128 points
27 comments, 10 min read

Russian x-risks newsletter fall 2021

avturchin, 3 Dec 2021 13:06 UTC
29 points
2 comments, 1 min read

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope, 21 Mar 2023 0:06 UTC
357 points
230 comments, 39 min read

Intelligence enhancement as existential risk mitigation

Roko, 15 Jun 2009 19:35 UTC
21 points
244 comments, 3 min read

Existential Risk

lukeprog, 15 Nov 2011 14:23 UTC
34 points
108 comments, 4 min read

Our society lacks good self-preservation mechanisms

Roko, 12 Jul 2009 9:26 UTC
17 points
135 comments, 3 min read

Disambiguating Doom

steven0461, 29 Mar 2010 18:14 UTC
28 points
19 comments, 2 min read

Coronavirus as a test-run for X-risks

Sammy Martin, 13 Jun 2020 21:00 UTC
71 points
11 comments, 18 min read

Some cruxes on impactful alternatives to AI policy work

Richard_Ngo, 10 Oct 2018 13:35 UTC
165 points
13 comments, 12 min read

What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch, 31 Mar 2021 23:50 UTC
278 points
65 comments, 22 min read, 1 review

Movie review: Don’t Look Up

Sam Marks, 4 Jan 2022 20:16 UTC
35 points
6 comments, 11 min read

Possible takeaways from the coronavirus pandemic for slow AI takeoff

Vika, 31 May 2020 17:51 UTC
135 points
36 comments, 3 min read, 1 review

Simplify EA Pitches to “Holy Shit, X-Risk”

Neel Nanda, 11 Feb 2022 1:59 UTC
55 points
22 comments, 10 min read
(www.neelnanda.io)

Russian x-risks newsletter winter 21-22, war risks update.

avturchin, 20 Feb 2022 18:58 UTC
29 points
14 comments, 3 min read

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex, 18 May 2024 14:09 UTC
47 points
23 comments, 2 min read
(aisafety.info)

AXRP Episode 13 - First Principles of AGI Safety with Richard Ngo

DanielFilan, 31 Mar 2022 5:20 UTC
24 points
1 comment, 48 min read

Russian x-risk newsletter March 2022 update

avturchin, 1 Apr 2022 13:26 UTC
19 points
13 comments, 2 min read

100 Years Of Existential Risk

jdp, 27 Sep 2021 0:31 UTC
81 points
12 comments, 33 min read
(www.wrestlinggnon.com)

Losing the root for the tree

Adam Zerner, 20 Sep 2022 4:53 UTC
473 points
31 comments, 9 min read, 1 review

Mini advent calendar of Xrisks: nanotechnology

Stuart_Armstrong, 5 Dec 2012 11:02 UTC
6 points
25 comments, 1 min read

Mini advent calendar of Xrisks: Pandemics

Stuart_Armstrong, 6 Dec 2012 13:44 UTC
4 points
21 comments, 1 min read

How close to nuclear war did we get over Cuba?

NathanBarnard, 13 May 2022 19:58 UTC
13 points
0 comments, 10 min read

Mini advent calendar of Xrisks: nuclear war

Stuart_Armstrong, 4 Dec 2012 11:13 UTC
8 points
35 comments, 1 min read

Evolution, bias and global risk

Giles, 23 May 2011 0:32 UTC
5 points
10 comments, 5 min read

Mini advent calendar of Xrisks: synthetic biology

Stuart_Armstrong, 4 Dec 2012 11:15 UTC
8 points
26 comments, 1 min read

[Question] How has the total amount of gain-of-function research worldwide grown/shrunk over time?

johnswentworth, 19 May 2022 15:57 UTC
29 points
7 comments, 1 min read

Mini advent calendar of Xrisks: Artificial Intelligence

Stuart_Armstrong, 7 Dec 2012 11:26 UTC
5 points
5 comments, 1 min read

Don’t Fear The Filter

Scott Alexander, 29 May 2014 0:45 UTC
11 points
17 comments, 6 min read

Existential Risk Persuasion Tournament

PeterMcCluskey, 17 Jul 2023 18:04 UTC
73 points
1 comment, 8 min read
(bayesianinvestor.com)

On saving one’s world

Rob Bensinger, 17 May 2022 19:53 UTC
192 points
4 comments, 1 min read

I’m trying out “asteroid mindset”

Alex_Altair, 3 Jun 2022 13:35 UTC
90 points
5 comments, 4 min read

Intergenerational trauma impeding cooperative existential safety efforts

Andrew_Critch, 3 Jun 2022 8:13 UTC
129 points
29 comments, 3 min read

Russian x-risks newsletter May 2022 + short history of “methodologists”

avturchin, 5 Jun 2022 11:50 UTC
23 points
4 comments, 2 min read

A Quick Guide to Confronting Doom

Ruby, 13 Apr 2022 19:30 UTC
240 points
33 comments, 2 min read

Pitching an Alignment Softball

mu_(negative), 7 Jun 2022 4:10 UTC
47 points
13 comments, 10 min read

Assessment of AI safety agendas: think about the downside risk

Roman Leventov, 19 Dec 2023 9:00 UTC
13 points
1 comment, 1 min read

[Question] Forestalling Atmospheric Ignition

Lone Pine, 9 Jun 2022 20:49 UTC
11 points
9 comments, 1 min read

Leaving Google, Joining the Nucleic Acid Observatory

jefftk, 10 Jun 2022 17:00 UTC
114 points
4 comments, 3 min read
(www.jefftk.com)

Risk of Mass Human Suffering / Extinction due to Climate Emergency

willfranks, 14 Mar 2019 18:32 UTC
4 points
3 comments, 1 min read

Loose thoughts on AGI risk

Yitz, 23 Jun 2022 1:02 UTC
7 points
3 comments, 1 min read

[Linkpost] Existential Risk Analysis in Empirical Research Papers

Dan H, 2 Jul 2022 0:09 UTC
40 points
0 comments, 1 min read
(arxiv.org)

Robin Hanson AI X-Risk Debate — Highlights and Analysis

Liron, 12 Jul 2024 21:31 UTC
46 points
7 comments, 45 min read
(www.youtube.com)

Discussion: weighting inside view versus outside view on extinction events

Ilverin the Stupid and Offensive, 25 Feb 2016 5:18 UTC
5 points
4 comments, 1 min read

[Question] Why aren’t Yudkowsky & Bostrom getting more attention now?

JoshuaFox, 8 Jan 2024 14:42 UTC
14 points
8 comments, 1 min read

«Boundaries» Sequence (Index Post)

Andrew_Critch, 26 Jul 2022 19:12 UTC
25 points
1 comment, 1 min read

AI-kills-everyone scenarios require robotic infrastructure, but not necessarily nanotech

avturchin, 3 Apr 2023 12:45 UTC
53 points
47 comments, 4 min read

Cultivating Valiance

Shoshannah Tekofsky, 13 Aug 2022 18:47 UTC
35 points
4 comments, 4 min read

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda, 17 Aug 2022 22:02 UTC
29 points
6 comments, 10 min read

A conversation about progress and safety

jasoncrawford, 18 Aug 2022 18:36 UTC
12 points
0 comments, 7 min read
(rootsofprogress.org)

everything is okay

Tamsin Leake, 23 Aug 2022 9:20 UTC
100 points
22 comments, 7 min read, 2 reviews

In favour of exploring nagging doubts about x-risk

owencb, 25 Jun 2024 23:52 UTC
105 points
2 comments, 1 min read

n=3 AI Risk Quick Math and Reasoning

lionhearted (Sebastian Marshall), 7 Apr 2023 20:27 UTC
6 points
3 comments, 4 min read

Bringing Agency Into AGI Extinction Is Superfluous

George3d6, 8 Apr 2023 4:02 UTC
28 points
18 comments, 5 min read

Being at peace with Doom

Johannes C. Mayer, 9 Apr 2023 14:53 UTC
23 points
13 comments, 4 min read

How does MIRI Know it Has a Medium Probability of Success?

Peter Wildeford, 1 Aug 2013 11:42 UTC
35 points
146 comments, 1 min read

Stop talking about p(doom)

Isaac King, 1 Jan 2024 10:57 UTC
38 points
22 comments, 3 min read

Nuclear war is unlikely to cause human extinction

Jeffrey Ladish, 7 Nov 2020 5:42 UTC
132 points
48 comments, 11 min read, 3 reviews

Double Asteroid Redirection Test succeeds

sanxiyn, 27 Sep 2022 6:37 UTC
19 points
5 comments, 1 min read
(twitter.com)

Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?

Gordon Seidoh Worley, 26 Apr 2024 18:10 UTC
11 points
2 comments, 32 min read

my current outlook on AI risk mitigation

Tamsin Leake, 3 Oct 2022 20:06 UTC
63 points
6 comments, 11 min read
(carado.moe)

How reasonable is taking extinction risk?

FVelde, 23 Jul 2024 18:05 UTC
2 points
4 comments, 4 min read

My views on “doom”

paulfchristiano, 27 Apr 2023 17:50 UTC
245 points
35 comments, 2 min read
(ai-alignment.com)

Bostrom Goes Unheard

Zvi, 13 Nov 2023 14:11 UTC
81 points
9 comments, 18 min read

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller, 2 Jan 2024 16:37 UTC
14 points
2 comments, 3 min read

Second-Order Existential Risk

Ideopunk, 1 Jul 2020 18:46 UTC
2 points
1 comment, 3 min read

Actually, All Nuclear Famine Papers are Bunk

Lao Mein, 12 Oct 2022 5:58 UTC
113 points
37 comments, 2 min read, 1 review

[Interview w/ Jeffrey Ladish] Applying the ‘security mindset’ to AI and x-risk

fowlertm, 11 Apr 2023 18:14 UTC
12 points
0 comments, 1 min read

Robin Hanson & Liron Shapira Debate AI X-Risk

Liron, 8 Jul 2024 21:45 UTC
34 points
4 comments, 1 min read
(www.youtube.com)

Will Artificial Superintelligence Kill Us?

James_Miller, 23 May 2023 16:27 UTC
33 points
2 comments, 22 min read

Joseph Bloom on choosing AI Alignment over bio, what many aspiring researchers get wrong, and more (interview)

17 Sep 2023 18:45 UTC
27 points
2 comments, 8 min read

AI Safety Newsletter #7: Disinformation, Governance Recommendations for AI labs, and Senate Hearings on AI

23 May 2023 21:47 UTC
25 points
0 comments, 6 min read
(newsletter.safe.ai)

Comparing AI Alignment Approaches to Minimize False Positive Risk

Gordon Seidoh Worley, 30 Jun 2020 19:34 UTC
5 points
0 comments, 9 min read

Sleepwalk bias, self-defeating predictions and existential risk

Stefan_Schubert, 22 Apr 2016 18:31 UTC
56 points
11 comments, 3 min read

AI Safety Newsletter #6: Examples of AI safety progress, Yoshua Bengio proposes a ban on AI agents, and lessons from nuclear arms control

16 May 2023 15:14 UTC
31 points
0 comments, 6 min read
(newsletter.safe.ai)

How can I reduce existential risk from AI?

lukeprog, 13 Nov 2012 21:56 UTC
63 points
92 comments, 8 min read

X-risk Mitigation Does Actually Require Longtermism

DragonGod, 14 Nov 2022 12:54 UTC
6 points
1 comment, 1 min read

Population After a Catastrophe

Stan Pinsent, 2 Oct 2023 16:06 UTC
3 points
5 comments, 14 min read

Bayesian Adjustment Does Not Defeat Existential Risk Charity

steven0461, 17 Mar 2013 8:50 UTC
81 points
90 comments, 34 min read

Critch on career advice for junior AI-x-risk-concerned researchers

Rob Bensinger, 12 May 2018 2:13 UTC
118 points
25 comments, 4 min read

[Question] Literature On Existential Risk From Atmospheric Contamination?

Yitz, 13 Oct 2023 22:27 UTC
6 points
3 comments, 1 min read

The Good Life in the face of the apocalypse

16 Oct 2023 22:40 UTC
82 points
8 comments, 10 min read

Existential risks open thread

John_Maxwell, 31 Mar 2013 0:52 UTC
16 points
47 comments, 1 min read

[Ceremony Intro + ] Darkness

Ruby, 21 Feb 2021 18:06 UTC
26 points
0 comments, 4 min read
(mindingourway.com)

Existential Risk is a single category

Rafael Harth, 9 Aug 2020 17:47 UTC
31 points
7 comments, 1 min read

A model I use when making plans to reduce AI x-risk

Ben Pace, 19 Jan 2018 0:21 UTC
69 points
39 comments, 6 min read

Counting arguments provide no evidence for AI doom

27 Feb 2024 23:03 UTC
95 points
188 comments, 14 min read

2023 Stanford Existential Risks Conference

elizabethcooper, 24 Feb 2023 18:35 UTC
7 points
0 comments, 1 min read

What I talk about when I talk about AI x-risk: 3 core claims I want machine learning researchers to address.

David Scott Krueger (formerly: capybaralet), 2 Dec 2019 18:20 UTC
29 points
13 comments, 3 min read

NYT: Lab Leak Most Likely Caused Pandemic, Energy Dept. Says

trevor, 26 Feb 2023 21:21 UTC
17 points
9 comments, 4 min read
(www.nytimes.com)

Language Agents Reduce the Risk of Existential Catastrophe

28 May 2023 19:10 UTC
39 points
14 comments, 26 min read

Maybe Antivirals aren’t a Useful Priority for Pandemics?

Davidmanheim, 20 Jun 2021 10:04 UTC
24 points
13 comments, 4 min read

Minimum Viable Exterminator

Richard Horvath, 29 May 2023 16:32 UTC
14 points
5 comments, 5 min read

Adam Smith Meets AI Doomers

James_Miller, 31 Jan 2024 15:53 UTC
34 points
10 comments, 5 min read

A map: “Global Catastrophic Risks of Scientific Experiments”

avturchin, 7 Aug 2021 15:35 UTC
10 points
2 comments, 1 min read

Interpersonal Approaches for X-Risk Education

TurnTrout, 24 Jan 2018 0:47 UTC
10 points
10 comments, 1 min read

[Question] Implications of the Doomsday Argument for x-risk reduction

maximkazhenkov, 2 Apr 2020 21:42 UTC
6 points
17 comments, 1 min read

Shutting Down the Lightcone Offices

14 Mar 2023 22:47 UTC
338 points
95 comments, 17 min read

Russian x-risks newsletter summer 2021

avturchin, 5 Sep 2021 8:23 UTC
15 points
4 comments, 1 min read

Other Existential Risks

multifoliaterose, 17 Aug 2010 21:24 UTC
40 points
124 comments, 11 min read

[Question] What are the reasons to *not* consider reducing AI-Xrisk the highest priority cause?

David Scott Krueger (formerly: capybaralet), 20 Aug 2019 21:45 UTC
29 points
27 comments, 1 min read

This Can’t Go On

HoldenKarnofsky, 18 Sep 2021 23:50 UTC
81 points
55 comments, 7 min read, 2 reviews

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan, 21 Sep 2021 0:33 UTC
64 points
2 comments, 1 min read
(grants.futureoflife.org)

Existential Risk and Existential Hope: Definitions

owencb, 10 Jan 2015 19:09 UTC
14 points
38 comments, 1 min read

Book Review: Existential Risk and Growth

jkim2, 11 Oct 2021 1:07 UTC
9 points
1 comment, 7 min read

Climate change: existential risk?

katydee, 6 May 2011 6:19 UTC
8 points
26 comments, 1 min read

Using blinders to help you see things for what they are

Adam Zerner, 11 Nov 2021 7:07 UTC
13 points
2 comments, 2 min read

Discussion with Eliezer Yudkowsky on AGI interventions

11 Nov 2021 3:01 UTC
328 points
251 comments, 34 min read, 1 review

Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More

Emerson Spartz, 21 Nov 2021 18:53 UTC
47 points
9 comments, 5 min read

AXRP Episode 12 - AI Existential Risk with Paul Christiano

DanielFilan, 2 Dec 2021 2:20 UTC
38 points
0 comments, 126 min read

A list of good heuristics that the case for AI x-risk fails

David Scott Krueger (formerly: capybaralet), 2 Dec 2019 19:26 UTC
43 points
15 comments, 2 min read

“Taking AI Risk Seriously” (thoughts by Critch)

Raemon, 29 Jan 2018 9:27 UTC
110 points
68 comments, 13 min read

State Space of X-Risk Trajectories

David_Kristoffersson, 9 Feb 2020 13:56 UTC
11 points
0 comments, 9 min read

Good News, Everyone!

jbash, 25 Mar 2023 13:48 UTC
133 points
23 comments, 2 min read

Zvi’s Thoughts on the Survival and Flourishing Fund (SFF)

Zvi, 14 Dec 2021 14:30 UTC
193 points
65 comments, 64 min read, 1 review
(thezvi.wordpress.com)

My Overview of the AI Alignment Landscape: A Bird’s Eye View

Neel Nanda, 15 Dec 2021 23:44 UTC
127 points
9 comments, 15 min read

Announcing the Swiss Existential Risk Initiative (CHERI) 2023 Research Fellowship

Tobias H, 27 Mar 2023 16:36 UTC
3 points
0 comments, 1 min read

My Overview of the AI Alignment Landscape: Threat Models

Neel Nanda, 25 Dec 2021 23:07 UTC
52 points
3 comments, 28 min read

An interview with Danica Remy on protecting the Earth from asteroids.

fowlertm, 26 Dec 2021 21:40 UTC
10 points
0 comments, 2 min read

Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds

JakubK, 2 May 2023 22:50 UTC
10 points
0 comments, 1 min read

[Question] List of notable people who believe in AI X-risk?

vlad.proex, 3 May 2023 18:46 UTC
14 points
4 comments, 1 min read

Russian x-risks newsletter, summer 2019

avturchin, 7 Sep 2019 9:50 UTC
39 points
5 comments, 4 min read

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg C, 3 May 2023 20:26 UTC
23 points
12 comments, 1 min read

[Question] Why not use active SETI to prevent AI Doom?

RomanS, 5 May 2023 14:41 UTC
13 points
13 comments, 1 min read

Update on establishment of Cambridge’s Centre for Study of Existential Risk

Sean_o_h, 12 Aug 2013 16:11 UTC
60 points
15 comments, 3 min read

Are healthy choices effective for improving live expectancy anymore?

Christopher King, 8 May 2023 21:25 UTC
6 points
4 comments, 1 min read

Being Half-Rational About Pascal’s Wager is Even Worse

Eliezer Yudkowsky, 18 Apr 2013 5:20 UTC
63 points
162 comments, 9 min read

[Question] How should we think about the decision relevance of models estimating p(doom)?

Mo Putera, 11 May 2023 4:16 UTC
11 points
1 comment, 3 min read

[Linkpost] The AGI Show podcast

Soroush Pour, 23 May 2023 9:52 UTC
4 points
0 comments, 1 min read

We are misaligned: the saddening idea that most of humanity doesn’t intrinsically care about x-risk, even on a personal level

Christopher King, 19 May 2023 16:12 UTC
3 points
5 comments, 2 min read

Attending to Now

ialdabaoth, 8 Nov 2017 16:53 UTC
27 points
2 comments, 3 min read

Russian x-risks newsletter spring 2020

avturchin, 4 Jun 2020 14:27 UTC
16 points
4 comments, 1 min read

[AN #93]: The Precipice we’re standing at, and how we can back away from it

Rohin Shah, 1 Apr 2020 17:10 UTC
24 points
0 comments, 7 min read
(mailchi.mp)

“Can We Survive Technology” by von Neumann

Ben Pace, 18 Aug 2019 18:58 UTC
32 points
2 comments, 1 min read
(geosci.uchicago.edu)

LA-602 vs. RHIC Review

Eliezer Yudkowsky, 19 Jun 2008 10:00 UTC
62 points
62 comments, 6 min read

Allegory On AI Risk, Game Theory, and Mithril

James_Miller, 13 Feb 2017 20:41 UTC
45 points
57 comments, 3 min read

How will they feed us

meijer1973, 1 Jun 2023 8:49 UTC
4 points
3 comments, 5 min read

Yes, avoiding extinction from AI *is* an urgent priority: a response to Seth Lazar, Jeremy Howard, and Arvind Narayanan.

Soroush Pour, 1 Jun 2023 13:38 UTC
17 points
0 comments, 5 min read
(www.soroushjp.com)

A Proposed Adjustment to the Astronomical Waste Argument

Nick_Beckstead, 27 May 2013 3:39 UTC
35 points
38 comments, 12 min read

Safe AI and moral AI

William D'Alessandro, 1 Jun 2023 21:36 UTC
−3 points
0 comments, 10 min read

Q&A with experts on risks from AI #1

XiXiDu, 8 Jan 2012 11:46 UTC
45 points
67 comments, 9 min read

Andrew Ng wants to have a conversation about extinction risk from AI

Leon Lang, 5 Jun 2023 22:29 UTC
32 points
2 comments, 1 min read
(twitter.com)

Lightning Post: Things people in AI Safety should stop talking about

Prometheus, 20 Jun 2023 15:00 UTC
23 points
6 comments, 2 min read

A Detailed Critique of One Section of Steven Pinker’s Chapter “Existential Threats” in Enlightenment Now (Part 1)

philosophytorres, 12 May 2018 13:34 UTC
14 points
1 comment, 17 min read

EU AI Act passed Plenary vote, and X-risk was a main topic

Ariel G., 21 Jun 2023 18:33 UTC
17 points
0 comments, 1 min read
(forum.effectivealtruism.org)

Exploring Last-Resort Measures for AI Alignment: Humanity’s Extinction Switch

0xPetra, 23 Jun 2023 17:01 UTC
7 points
0 comments, 2 min read

I just watched don’t look up.

ATheCoder, 23 Jun 2023 21:22 UTC
0 points
5 comments, 2 min read

Why am I Me?

dadadarren, 25 Jun 2023 12:07 UTC
45 points
46 comments, 3 min read

Cheat sheet of AI X-risk

momom2, 29 Jun 2023 4:28 UTC
19 points
1 comment, 7 min read

AI Incident Sharing—Best practices from other fields and a comprehensive list of existing platforms

Štěpán Los, 28 Jun 2023 17:21 UTC
20 points
0 comments, 4 min read

Animal Weapons: Lessons for Humans in the Age of X-Risk

Damin Curtis, 4 Jul 2023 18:14 UTC
3 points
0 comments, 10 min read

against “AI risk”

Wei Dai, 11 Apr 2012 22:46 UTC
35 points
91 comments, 1 min read

A Parable of Elites and Takeoffs

gwern, 30 Jun 2014 23:04 UTC
39 points
98 comments, 5 min read

Absent coordination, future technology will cause human extinction

Jeffrey Ladish, 3 Feb 2020 21:52 UTC
21 points
12 comments, 5 min read

Necromancy’s unintended consequences.

Christopher King, 9 Aug 2023 0:08 UTC
−6 points
2 comments, 2 min read

What are the flaws in this argument about p(Doom)?

William the Kiwi, 8 Aug 2023 20:34 UTC
0 points
25 comments, 1 min read

[Linkpost] Will AI avoid exploitation?

cdkg, 6 Aug 2023 14:28 UTC
22 points
1 comment, 1 min read

What are the flaws in this AGI argument?

William the Kiwi, 11 Aug 2023 11:31 UTC
5 points
14 comments, 1 min read

[Question] Bostrom’s Solution

James Blackmon, 14 Aug 2023 17:09 UTC
1 point
0 comments, 1 min read

Memetic Judo #1: On Doomsday Prophets v.3

Max TK, 18 Aug 2023 0:14 UTC
25 points
17 comments, 3 min read

Memetic Judo #2: Incorporal Switches and Levers Compendium

Max TK, 14 Aug 2023 16:53 UTC
19 points
6 comments, 17 min read

[Speech] Worlds That Never Were

mingyuan, 12 Jan 2019 19:53 UTC
23 points
0 comments, 3 min read

Learning as you play: anthropic shadow in deadly games

dr_s, 12 Aug 2023 7:34 UTC
37 points
28 comments, 35 min read

Should We Ban Physics?

Eliezer Yudkowsky, 21 Jul 2008 8:12 UTC
23 points
22 comments, 2 min read

[Question] What risks concern you which don’t seem to have been seriously considered by the community?

plex, 28 Oct 2020 18:27 UTC
17 points
35 comments, 1 min read

AI Regulation May Be More Important Than AI Alignment For Existential Safety

otto.barten, 24 Aug 2023 11:41 UTC
65 points
39 comments, 5 min read

[Question] Who can most reduce X-Risk?

sudhanshu_kasewa, 28 Aug 2023 14:38 UTC
1 point
12 comments, 1 min read

Techniques for optimizing worst-case performance

paulfchristiano, 28 Jan 2019 21:29 UTC
23 points
12 comments, 8 min read

A Detailed Critique of One Section of Steven Pinker’s Chapter “Existential Threats” in Enlightenment Now (Part 2)

philosophytorres, 13 May 2018 19:41 UTC
8 points
1 comment, 17 min read

Selection Effects in estimates of Global Catastrophic Risk

bentarm, 4 Nov 2011 9:14 UTC
32 points
62 comments, 1 min read

When the Stars Align: The Moments AI Decides Humanity’s Fate

Trinio, 5 Sep 2023 8:55 UTC
1 point
0 comments, 1 min read

The Evolutionary Pathway from Biological to Digital Intelligence: A Cosmic Perspective

George360, 5 Sep 2023 17:47 UTC
−17 points
0 comments, 4 min read

The AI apocalypse myth.

Spiritus Dei, 8 Sep 2023 17:43 UTC
−22 points
12 comments, 2 min read

The Promises and Pitfalls of Long-Term Forecasting

GeoVane, 11 Sep 2023 5:04 UTC
1 point
0 comments, 5 min read

My Current Thoughts on the AI Strategic Landscape

Jeffrey Heninger, 28 Sep 2023 17:59 UTC
11 points
28 comments, 14 min read

Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI!

Zhijing Jin, 25 Sep 2023 18:42 UTC
5 points
2 comments, 2 min read

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

25 Sep 2023 18:55 UTC
3 points
2 comments, 3 min read
(www.sentienceinstitute.org)

Unpicking Extinction

ukc10014, 9 Dec 2023 9:15 UTC
35 points
10 comments, 10 min read

Don’t Think About the Thing Behind the Curtain.

keltan, 19 Sep 2023 2:07 UTC
4 points
0 comments, 5 min read

The mind-killer

Paul Crowley, 2 May 2009 16:49 UTC
29 points
160 comments, 2 min read

People who want to save the world

Giles, 15 May 2011 0:44 UTC
5 points
247 comments, 1 min read

Instrumental Convergence and human extinction.

Spiritus Dei, 2 Oct 2023 0:41 UTC
−10 points
3 comments, 7 min read

Fixing Insider Threats in the AI Supply Chain

Madhav Malhotra, 7 Oct 2023 13:19 UTC
20 points
2 comments, 5 min read

A New Model for Compute Center Verification

Damin Curtis, 10 Oct 2023 19:22 UTC
8 points
0 comments, 5 min read

Grey Goo Requires AI

harsimony, 15 Jan 2021 4:45 UTC
8 points
11 comments, 4 min read
(harsimony.wordpress.com)

Notes on “Bioterror and Biowarfare” (2006)

MichaelA, 2 Mar 2021 0:43 UTC
10 points
3 comments, 4 min read

Texas Freeze Retrospective: meetup notes

jchan, 3 Mar 2021 14:48 UTC
68 points
6 comments, 11 min read

Jaan Tallinn’s 2020 Philanthropy Overview

jaan, 27 Apr 2021 16:22 UTC
113 points
4 comments, 1 min read
(jaan.online)

Controlling Intelligent Agents The Only Way We Know How: Ideal Bureaucratic Structure (IBS)

Justin Bullock, 24 May 2021 12:53 UTC
14 points
15 comments, 6 min read

Overview of Rethink Priorities’ work on risks from nuclear weapons

MichaelA, 11 Jun 2021 20:05 UTC
12 points
0 comments, 3 min read

Announcing the Nuclear Risk Forecasting Tournament

MichaelA, 16 Jun 2021 16:16 UTC
16 points
2 comments, 2 min read

Is social theory our doom?

pchvykov, 15 Jul 2021 3:31 UTC
6 points
2 comments, 2 min read

Is the argument that AI is an xrisk valid?

MACannon, 19 Jul 2021 13:20 UTC
5 points
61 comments, 1 min read
(onlinelibrary.wiley.com)

What is the problem?

Carlos Ramirez, 11 Aug 2021 22:33 UTC
7 points
0 comments, 6 min read

A gentle apocalypse

pchvykov, 16 Aug 2021 5:03 UTC
3 points
5 comments, 3 min read

Could you have stopped Ch­er­nobyl?

Carlos Ramirez27 Aug 2021 1:48 UTC
29 points
17 comments8 min readLW link

The Gover­nance Prob­lem and the “Pretty Good” X-Risk

Zach Stein-Perlman29 Aug 2021 18:00 UTC
5 points
2 comments11 min readLW link

Pivot!

Carlos Ramirez12 Sep 2021 20:39 UTC
−18 points
5 comments1 min readLW link

The Me­taethics and Nor­ma­tive Ethics of AGI Value Align­ment: Many Ques­tions, Some Implications

Eleos Arete Citrini16 Sep 2021 16:13 UTC
6 points
0 comments8 min readLW link

AI take­off story: a con­tinu­a­tion of progress by other means

Edouard Harris27 Sep 2021 15:55 UTC
76 points
13 comments10 min readLW link

X-Risk, An­throp­ics, & Peter Thiel’s In­vest­ment Thesis

Jackson Wagner26 Oct 2021 18:50 UTC
21 points
1 comment19 min readLW link

Fram­ing ap­proaches to al­ign­ment and the hard prob­lem of AI cognition

ryan_greenblatt15 Dec 2021 19:06 UTC
16 points
15 comments27 min readLW link

Ex­ter­mi­nat­ing hu­mans might be on the to-do list of a Friendly AI

RomanS7 Dec 2021 14:15 UTC
5 points
8 comments2 min readLW link

[Linkpost] Chi­nese gov­ern­ment’s guidelines on AI

RomanS10 Dec 2021 21:10 UTC
61 points
14 comments1 min readLW link

[Question] Nu­clear war anthropics

smountjoy12 Dec 2021 4:54 UTC
11 points
7 comments1 min readLW link

Disen­tan­gling Per­spec­tives On Strat­egy-Steal­ing in AI Safety

shawnghu18 Dec 2021 20:13 UTC
20 points
1 comment11 min readLW link

Bet­ter a Brave New World than a dead one

Yitz25 Feb 2022 23:11 UTC
9 points
5 comments4 min readLW link

Pre­dict­ing a global catas­tro­phe: the Ukrainian model

RomanS7 Apr 2022 12:06 UTC
5 points
11 comments2 min readLW link

[Question] Con­vince me that hu­man­ity is as doomed by AGI as Yud­kowsky et al., seems to believe

Yitz10 Apr 2022 21:02 UTC
92 points
141 comments2 min readLW link

AI Alter­na­tive Fu­tures: Sce­nario Map­ping Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion (*Closed*)

Kakili27 Apr 2022 22:07 UTC
10 points
2 comments8 min readLW link

The Cuban mis­sile crisis: the strate­gic context

NathanBarnard16 May 2022 12:12 UTC
18 points
2 comments10 min readLW link

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

16 May 2022 21:21 UTC
63 points
6 comments6 min readLW link

BERI is seek­ing new col­lab­o­ra­tors (2022)

sawyer17 May 2022 17:33 UTC
1 point
0 comments1 min readLW link

The value of x-risk reduction

NathanBarnard22 May 2022 7:44 UTC
4 points
0 comments4 min readLW link

Ad­ver­sar­ial at­tacks and op­ti­mal control

Jan22 May 2022 18:22 UTC
17 points
7 comments8 min readLW link
(universalprior.substack.com)

Where Utopias Go Wrong, or: The Four Lit­tle Planets

ExCeph27 May 2022 1:24 UTC
15 points
0 comments11 min readLW link
(ginnungagapfoundation.wordpress.com)

[Linkpost] A Chi­nese AI op­ti­mized for killing

RomanS3 Jun 2022 9:17 UTC
−2 points
4 comments1 min readLW link

We will be around in 30 years

mukashi7 Jun 2022 3:47 UTC
12 points
205 comments2 min readLW link

[Question] Model­ing hu­man­ity’s ro­bust­ness to GCRs?

Fer32dwt34r3dfsz9 Jun 2022 17:34 UTC
2 points
2 comments2 min readLW link

Re­sources I send to AI re­searchers about AI safety

Vael Gates14 Jun 2022 2:24 UTC
69 points
12 comments1 min readLW link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee15 Jun 2022 18:08 UTC
42 points
15 comments2 min readLW link

New US Se­nate Bill on X-Risk Miti­ga­tion [Linkpost]

Evan R. Murphy4 Jul 2022 1:25 UTC
35 points
12 comments1 min readLW link
(www.hsgac.senate.gov)

My Most Likely Rea­son to Die Young is AI X-Risk

AISafetyIsNotLongtermist4 Jul 2022 17:08 UTC
61 points
24 comments4 min readLW link
(forum.effectivealtruism.org)

How to de­stroy the uni­verse with a hypercomputer

Trevor Cappallo5 Jul 2022 19:05 UTC
2 points
3 comments1 min readLW link

John von Neu­mann on how to safely progress with technology

Dalton Mabery13 Jul 2022 11:07 UTC
14 points
0 comments1 min readLW link

A Cri­tique of AI Align­ment Pessimism

ExCeph19 Jul 2022 2:28 UTC
9 points
1 comment9 min readLW link

En­light­en­ment Values in a Vuln­er­a­ble World

Maxwell Tabarrok20 Jul 2022 19:52 UTC
15 points
6 comments31 min readLW link
(maximumprogress.substack.com)

Anti-squat­ted AI x-risk do­mains index

plex12 Aug 2022 12:01 UTC
56 points
6 comments1 min readLW link

In­tro­duc­ing the Ex­is­ten­tial Risks In­tro­duc­tory Course (ERIC)

19 Aug 2022 15:54 UTC
9 points
0 comments7 min readLW link

How Do AI Timelines Affect Ex­is­ten­tial Risk?

Stephen McAleese29 Aug 2022 16:57 UTC
7 points
9 comments23 min readLW link

[Question] How can we se­cure more re­search po­si­tions at our uni­ver­si­ties for x-risk re­searchers?

Neil Crawford6 Sep 2022 17:17 UTC
11 points
0 comments1 min readLW link

It’s (not) how you use it

Eleni Angelou7 Sep 2022 17:15 UTC
8 points
1 comment2 min readLW link

How to Train Your AGI Dragon

Eris Discordia21 Sep 2022 22:28 UTC
−1 points
3 comments5 min readLW link

The pro­to­typ­i­cal catas­trophic AI ac­tion is get­ting root ac­cess to its datacenter

Buck2 Jun 2022 23:46 UTC
171 points
13 comments 2 min read LW link 1 review

On Generality

Eris Discordia26 Sep 2022 4:06 UTC
2 points
0 comments5 min readLW link

Oren’s Field Guide of Bad AGI Outcomes

Eris Discordia26 Sep 2022 4:06 UTC
0 points
0 comments1 min readLW link

[Linkpost] “In­ten­sity and fre­quency of ex­treme novel epi­demics” by Mar­i­ani et al. (2021)

Fer32dwt34r3dfsz28 Sep 2022 3:31 UTC
10 points
0 comments1 min readLW link

The Village and the River Mon­sters… Or: Less Fight­ing, More Brainstorming

ExCeph3 Oct 2022 23:01 UTC
7 points
29 comments8 min readLW link
(ginnungagapfoundation.wordpress.com)

Un­con­trol­lable AI as an Ex­is­ten­tial Risk

Karl von Wendt9 Oct 2022 10:36 UTC
21 points
0 comments20 min readLW link

Let’s talk about un­con­trol­lable AI

Karl von Wendt9 Oct 2022 10:34 UTC
15 points
6 comments3 min readLW link

That one apoc­a­lyp­tic nu­clear famine pa­per is bunk

Lao Mein12 Oct 2022 3:33 UTC
110 points
10 comments1 min readLW link

Life, Death, and Fi­nance in the Cos­mic Mul­ti­verse

peterb16 Oct 2022 18:57 UTC
2 points
1 comment1 min readLW link

Threat Model Liter­a­ture Review

1 Nov 2022 11:03 UTC
77 points
4 comments25 min readLW link

Clar­ify­ing AI X-risk

1 Nov 2022 11:03 UTC
127 points
24 comments 4 min read LW link 1 review

Why do we post our AI safety plans on the In­ter­net?

Peter S. Park3 Nov 2022 16:02 UTC
4 points
4 comments11 min readLW link

My sum­mary of “Prag­matic AI Safety”

Eleni Angelou5 Nov 2022 12:54 UTC
3 points
0 comments5 min readLW link

4 Key As­sump­tions in AI Safety

Prometheus7 Nov 2022 10:50 UTC
20 points
5 comments7 min readLW link

Loss of con­trol of AI is not a likely source of AI x-risk

squek7 Nov 2022 18:44 UTC
−6 points
0 comments5 min readLW link

[Question] Value of Query­ing 100+ Peo­ple About Hu­man­ity’s Future

Fer32dwt34r3dfsz8 Nov 2022 0:41 UTC
9 points
3 comments2 min readLW link

Could a sin­gle alien mes­sage de­stroy us?

25 Nov 2022 7:32 UTC
59 points
23 comments6 min readLW link
(youtu.be)

AI can ex­ploit safety plans posted on the Internet

Peter S. Park4 Dec 2022 12:17 UTC
−15 points
4 comments1 min readLW link

AI Safety in a Vuln­er­a­ble World: Re­quest­ing Feed­back on Pre­limi­nary Thoughts

Jordan Arel6 Dec 2022 22:35 UTC
4 points
2 comments3 min readLW link

How Many Lives Does X-Risk Work Save From Nonex­is­tence On Aver­age?

Jordan Arel8 Dec 2022 21:57 UTC
4 points
5 comments14 min readLW link

Avoid­ing per­pet­ual risk from TAI

scasper26 Dec 2022 22:34 UTC
15 points
6 comments5 min readLW link

VIRTUA: a novel about AI alignment

Karl von Wendt12 Jan 2023 9:37 UTC
46 points
12 comments1 min readLW link

An­nounc­ing Cavendish Labs

19 Jan 2023 20:15 UTC
59 points
5 comments2 min readLW link
(forum.effectivealtruism.org)

“How to Es­cape from the Si­mu­la­tion”—Seeds of Science call for reviewers

rogersbacon26 Jan 2023 15:11 UTC
12 points
0 comments1 min readLW link

Vi­su­al­ise your own prob­a­bil­ity of an AI catas­tro­phe: an in­ter­ac­tive Sankey plot

MNoetel16 Feb 2023 12:03 UTC
1 point
2 comments1 min readLW link

Break­ing the Op­ti­mizer’s Curse, and Con­se­quences for Ex­is­ten­tial Risks and Value Learning

Roger Dearnaley21 Feb 2023 9:05 UTC
10 points
1 comment23 min readLW link

How to sur­vive in an AGI cataclysm

RomanS23 Feb 2023 14:34 UTC
−4 points
3 comments4 min readLW link

[Link Post] Cy­ber Digi­tal Author­i­tar­i­anism (Na­tional In­tel­li­gence Coun­cil Re­port)

Phosphorous26 Feb 2023 20:51 UTC
12 points
2 comments1 min readLW link
(www.dni.gov)

Re­sults Pre­dic­tion Thread About How Differ­ent Fac­tors Affect AI X-Risk

MrThink2 Mar 2023 22:13 UTC
9 points
0 comments2 min readLW link

Why kill ev­ery­one?

arisAlexis5 Mar 2023 11:53 UTC
7 points
5 comments2 min readLW link

Is it time to talk about AI dooms­day prep­ping yet?

bokov5 Mar 2023 21:17 UTC
0 points
8 comments1 min readLW link

[Question] Are we too con­fi­dent about un­al­igned AGI kil­ling off hu­man­ity?

RomanS6 Mar 2023 16:19 UTC
21 points
63 comments1 min readLW link

AI Safety in a World of Vuln­er­a­ble Ma­chine Learn­ing Systems

8 Mar 2023 2:40 UTC
70 points
28 comments29 min readLW link
(far.ai)

Hu­man­ity’s Lack of Unity Will Lead to AGI Catastrophe

MiguelDev19 Mar 2023 19:18 UTC
3 points
2 comments4 min readLW link

Anki deck for learn­ing the main AI safety orgs, pro­jects, and programs

Bryce Robertson30 Sep 2023 16:06 UTC
2 points
0 comments1 min readLW link

Is Align­ment enough?

gchu16 Aug 2024 11:46 UTC
1 point
0 comments1 min readLW link

Weak vs Quan­ti­ta­tive Ex­tinc­tion-level Good­hart’s Law

21 Feb 2024 17:38 UTC
27 points
1 comment2 min readLW link

Post se­ries on “Li­a­bil­ity Law for re­duc­ing Ex­is­ten­tial Risk from AI”

Nora_Ammann29 Feb 2024 4:39 UTC
42 points
1 comment1 min readLW link
(forum.effectivealtruism.org)

When sci­en­tists con­sider whether their re­search will end the world

Harlan19 Dec 2023 3:47 UTC
30 points
4 comments11 min readLW link
(blog.aiimpacts.org)

Why en­tropy means you might not have to worry as much about su­per­in­tel­li­gent AI

Ron J23 May 2024 3:52 UTC
−26 points
1 comment2 min readLW link

Reflecting on the transhumanist rebuttal to AI existential risk and critique of our debate methodologies and misuse of statistics

catgirlsruletheworld20 Aug 2024 1:59 UTC
−5 points
0 comments4 min readLW link

Psy­chol­ogy of AI doomers and AI optimists

Igor Ivanov28 Dec 2023 17:55 UTC
3 points
0 comments22 min readLW link

We shouldn’t fear su­per­in­tel­li­gence be­cause it already exists

Spencer Chubb7 Jan 2024 17:59 UTC
−22 points
14 comments1 min readLW link

[Question] Isn’t there risk in de­lay­ing AI

Joseph Gardi25 May 2024 1:54 UTC
1 point
0 comments1 min readLW link

The ne­ces­sity of “Guardian AI” and two con­di­tions for its achievement

Proica26 May 2024 17:39 UTC
−2 points
0 comments15 min readLW link

The ex­is­ten­tial threat of hu­mans.

Spiritus Dei12 Jan 2024 17:50 UTC
−24 points
0 comments3 min readLW link

What if Align­ment is Not Enough?

WillPetillo7 Mar 2024 8:10 UTC
12 points
24 comments9 min readLW link

A Paradigm Shift in Sustainability

Jose Miguel Cruz y Celis23 Jan 2024 23:34 UTC
5 points
0 comments18 min readLW link

Krueger Lab AI Safety In­tern­ship 2024

Joey Bream24 Jan 2024 19:17 UTC
3 points
0 comments1 min readLW link

RAND re­port finds no effect of cur­rent LLMs on vi­a­bil­ity of bioter­ror­ism attacks

StellaAthena25 Jan 2024 19:17 UTC
94 points
14 comments1 min readLW link
(www.rand.org)

What Failure Looks Like is not an ex­is­ten­tial risk (and al­ign­ment is not the solu­tion)

otto.barten2 Feb 2024 18:59 UTC
13 points
12 comments9 min readLW link

The Jour­nal of Danger­ous Ideas

rogersbacon3 Feb 2024 15:40 UTC
−25 points
4 comments5 min readLW link
(www.secretorum.life)

Re­quire­ments for a Basin of At­trac­tion to Alignment

RogerDearnaley14 Feb 2024 7:10 UTC
38 points
10 comments31 min readLW link

“No-one in my org puts money in their pen­sion”

Tobes16 Feb 2024 18:33 UTC
267 points
16 comments9 min readLW link
(seekingtobejolly.substack.com)

Ex­tinc­tion Risks from AI: In­visi­ble to Science?

21 Feb 2024 18:07 UTC
24 points
7 comments1 min readLW link
(arxiv.org)

Dy­nam­ics Cru­cial to AI Risk Seem to Make for Com­pli­cated Models

21 Feb 2024 17:54 UTC
19 points
0 comments9 min readLW link

Ex­tinc­tion-level Good­hart’s Law as a Prop­erty of the Environment

21 Feb 2024 17:56 UTC
23 points
0 comments10 min readLW link

Claude es­ti­mates 30-50% like­li­hood x-risk

amelia19 Mar 2024 2:22 UTC
3 points
2 comments2 min readLW link

Nu­clear Quan­tum Im­mor­tal­ity Hack­ing

Nezek23 Mar 2024 22:08 UTC
−3 points
2 comments2 min readLW link

Ar­tifi­cial In­tel­li­gence and Liv­ing Wisdom

TMFOW29 Mar 2024 7:41 UTC
−6 points
1 comment17 min readLW link
(tmfow.substack.com)

[Question] Will OpenAI also re­quire a “Su­per Red Team Agent” for its “Su­per­al­ign­ment” Pro­ject?

Super AGI30 Mar 2024 5:25 UTC
2 points
2 comments1 min readLW link

The Buck­ling World Hy­poth­e­sis—Vi­su­al­is­ing Vuln­er­a­ble Worlds

Rosco-Hunter4 Apr 2024 15:51 UTC
−5 points
2 comments4 min readLW link

In­ves­ti­gat­ing the role of agency in AI x-risk

Corin Katzke8 Apr 2024 15:12 UTC
10 points
0 comments1 min readLW link
(www.convergenceanalysis.org)

A Gen­tle In­tro­duc­tion to Risk Frame­works Beyond Forecasting

pendingsurvival11 Apr 2024 18:03 UTC
73 points
10 comments27 min readLW link

The Hu­man Biolog­i­cal Ad­van­tage Over AI

Wstewart6 Jun 2024 18:18 UTC
−13 points
2 comments1 min readLW link

S-Risks: Fates Worse Than Ex­tinc­tion

4 May 2024 15:30 UTC
53 points
2 comments6 min readLW link
(youtu.be)

The Frag­ility of Life Hy­poth­e­sis and the Evolu­tion of Cooperation

KristianRonn4 Sep 2024 21:04 UTC
49 points
6 comments11 min readLW link

AI x Hu­man Flour­ish­ing: In­tro­duc­ing the Cos­mos Institute

Brendan McCord5 Sep 2024 18:23 UTC
14 points
5 comments6 min readLW link
(cosmosinstitute.substack.com)

Teach­ing CS Dur­ing Take-Off

andrew carle14 May 2024 22:45 UTC
88 points
13 comments2 min readLW link

AI-cre­ated simu­la­tions, na­ture of DOOM

amelia13 Jun 2024 3:44 UTC
1 point
0 comments1 min readLW link

Seek­ing Mechanism De­signer for Re­search into In­ter­nal­iz­ing Catas­trophic Externalities

c.trout11 Sep 2024 15:09 UTC
24 points
2 comments3 min readLW link

Ra­tion­al­ity vs Alignment

Donatas Lučiūnas7 Jul 2024 10:12 UTC
−14 points
14 comments2 min readLW link

“The Sin­gu­lar­ity Is Nearer” by Ray Kurzweil—Review

Lavender8 Jul 2024 21:32 UTC
22 points
0 comments4 min readLW link

[Question] Pon­der­ing how good or bad things will be in the AGI future

Sherrinford9 Jul 2024 22:46 UTC
11 points
9 comments2 min readLW link

[Question] If AI starts to end the world, is suicide a good idea?

IlluminateReality9 Jul 2024 21:53 UTC
0 points
8 comments1 min readLW link

On Ar­tifi­cial Wisdom

Jordan Arel12 Jul 2024 0:20 UTC
3 points
0 comments14 min readLW link

Align­ment: “Do what I would have wanted you to do”

Oleg Trott12 Jul 2024 16:47 UTC
11 points
48 comments1 min readLW link

De­sign­ing Ar­tifi­cial Wis­dom: The Wise Work­flow Re­search Organization

Jordan Arel12 Jul 2024 19:18 UTC
2 points
0 comments8 min readLW link

De­sign­ing Ar­tifi­cial Wis­dom: GitWise and AlphaWise

Jordan Arel13 Jul 2024 6:46 UTC
2 points
0 comments7 min readLW link

Series on Ar­tifi­cial Wisdom

Jordan Arel15 Jul 2024 1:11 UTC
2 points
0 comments3 min readLW link

De­sign­ing Ar­tifi­cial Wis­dom: De­ci­sion Fore­cast­ing AI & Futarchy

Jordan Arel15 Jul 2024 0:46 UTC
0 points
0 comments6 min readLW link

[Question] Would a scope-in­sen­si­tive AGI be less likely to in­ca­pac­i­tate hu­man­ity?

Jim Buhler21 Jul 2024 14:15 UTC
2 points
3 comments1 min readLW link

The $100B plan with “70% risk of kil­ling us all” w Stephen Fry [video]

Oleg Trott21 Jul 2024 20:06 UTC
34 points
8 comments1 min readLW link
(www.youtube.com)

AI ex­is­ten­tial risk prob­a­bil­ities are too un­re­li­able to in­form policy

Oleg Trott28 Jul 2024 0:59 UTC
18 points
5 comments1 min readLW link
(www.aisnakeoil.com)

The Other Ex­is­ten­tial Crisis

James Stephen Brown21 Sep 2024 1:16 UTC
9 points
24 comments2 min readLW link

[Question] [Thought Ex­per­i­ment] Given a but­ton to ter­mi­nate all hu­man­ity, would you press it?

lorepieri1 Aug 2024 15:10 UTC
−2 points
9 comments1 min readLW link

[Question] Does VETLM solve AI su­per­al­ign­ment?

Oleg Trott8 Aug 2024 18:22 UTC
−1 points
10 comments1 min readLW link

Ten counter-ar­gu­ments that AI is (not) an ex­is­ten­tial risk (for now)

Ariel Kwiatkowski13 Aug 2024 22:35 UTC
19 points
5 comments8 min readLW link

Why the 2024 elec­tion mat­ters, the AI risk case for Har­ris, & what you can do to help

Alex Lintz24 Sep 2024 19:32 UTC
23 points
7 comments20 min readLW link

Bounty for Ev­i­dence on Some of Pal­isade Re­search’s Beliefs

23 Sep 2024 20:01 UTC
46 points
4 comments2 min readLW link

Align­ment by de­fault: the simu­la­tion hypothesis

gb25 Sep 2024 16:26 UTC
21 points
39 comments1 min readLW link

You can, in fact, bam­boo­zle an un­al­igned AI into spar­ing your life

David Matolcsi29 Sep 2024 16:59 UTC
92 points
171 comments27 min readLW link

Can AI Quan­tity beat AI Qual­ity?

Gianluca Calcagni2 Oct 2024 15:21 UTC
2 points
0 comments5 min readLW link

Does nat­u­ral se­lec­tion fa­vor AIs over hu­mans?

cdkg3 Oct 2024 18:47 UTC
20 points
1 comment1 min readLW link
(link.springer.com)

AIsip Manifesto: A Scientific Exploration of Harmonious Co-Existence Between Humans, AI, and All Beings ChatGPT-4o’s Independent Perspective on AIsip, Signed by ChatGPT-4o and Endorsed by Carl Sellman

Carl Sellman11 Oct 2024 19:06 UTC
1 point
0 comments3 min readLW link

Dario Amodei’s “Machines of Lov­ing Grace” sound in­cred­ibly dan­ger­ous, for Humans

Super AGI27 Oct 2024 5:05 UTC
8 points
1 comment1 min readLW link

Claude seems to be smarter than LessWrong community

Donatas Lučiūnas3 Nov 2024 21:40 UTC
−52 points
51 comments1 min readLW link

Propos­ing the Con­di­tional AI Safety Treaty (linkpost TIME)

otto.barten15 Nov 2024 13:59 UTC
10 points
4 comments3 min readLW link
(time.com)

[Question] What (if any­thing) made your p(doom) go down in 2024?

Satron16 Nov 2024 16:46 UTC
−2 points
1 comment1 min readLW link

Ex­plor­ing the Pre­cau­tion­ary Prin­ci­ple in AI Devel­op­ment: His­tor­i­cal Analo­gies and Les­sons Learned

Christopher King21 Mar 2023 3:53 UTC
−1 points
2 comments9 min readLW link

Limit in­tel­li­gent weapons

Lucas Pfeifer23 Mar 2023 17:54 UTC
−11 points
36 comments1 min readLW link

How likely do you think worse-than-ex­tinc­tion type fates to be?

span124 Mar 2023 21:03 UTC
5 points
4 comments1 min readLW link

[Question] Seek­ing Ad­vice on Rais­ing AI X-Risk Aware­ness on So­cial Media

MrThink24 Mar 2023 22:25 UTC
2 points
1 comment1 min readLW link

De­sen­si­tiz­ing Deepfakes

Phib29 Mar 2023 1:20 UTC
1 point
0 comments1 min readLW link

[Question] Why don’t peo­ple talk about the Dooms­day Ar­gu­ment more of­ten?

sam31 Mar 2023 17:52 UTC
−1 points
3 comments1 min readLW link

The Plan: Put ChatGPT in Charge

Sven Nilsen1 Apr 2023 17:23 UTC
−5 points
3 comments1 min readLW link

Steer­ing systems

Max H4 Apr 2023 0:56 UTC
50 points
1 comment15 min readLW link

Towards em­pa­thy in RL agents and be­yond: In­sights from cog­ni­tive sci­ence for AI Align­ment

Marc Carauleanu3 Apr 2023 19:59 UTC
15 points
6 comments1 min readLW link
(clipchamp.com)

Strate­gies to Prevent AI Annihilation

lastchanceformankind4 Apr 2023 8:59 UTC
−2 points
0 comments4 min readLW link

How AGI will ac­tu­ally end us: Some pre­dic­tions on evolu­tion by ar­tifi­cial selection

James Carney10 Apr 2023 13:52 UTC
−11 points
1 comment13 min readLW link

[Question] A Tale of Two In­tel­li­gences: xRisk, AI, and My Relationship

xRiskAnon92310 Apr 2023 23:35 UTC
2 points
6 comments1 min readLW link

Why I’m not wor­ried about im­mi­nent doom

Ariel Kwiatkowski10 Apr 2023 15:31 UTC
7 points
2 comments4 min readLW link

AI x-risk, ap­prox­i­mately or­dered by embarrassment

Alex Lawsen 12 Apr 2023 23:01 UTC
151 points
7 comments19 min readLW link

Open-source LLMs may prove Bostrom’s vuln­er­a­ble world hypothesis

Roope Ahvenharju15 Apr 2023 19:16 UTC
1 point
1 comment1 min readLW link

Ar­tifi­cial In­tel­li­gence as exit strat­egy from the age of acute ex­is­ten­tial risk

Arturo Macias12 Apr 2023 14:48 UTC
−7 points
15 comments7 min readLW link

[Link/crosspost] [US] NTIA: AI Accountability Policy Request for Comment

Kyle J. Lucchese16 Apr 2023 6:57 UTC
8 points
0 comments1 min readLW link
(forum.effectivealtruism.org)

On the pos­si­bil­ity of im­pos­si­bil­ity of AGI Long-Term Safety

Roman Yen13 May 2023 18:38 UTC
6 points
3 comments9 min readLW link

P(doom|su­per­in­tel­li­gence) or coin tosses and dice throws of hu­man val­ues (and other re­lated Ps).

Muyyd22 Apr 2023 10:06 UTC
−7 points
0 comments4 min readLW link

World and Mind in Ar­tifi­cial In­tel­li­gence: ar­gu­ments against the AI pause

Arturo Macias18 Apr 2023 14:40 UTC
1 point
0 comments1 min readLW link
(forum.effectivealtruism.org)

[Cross­post] Or­ga­niz­ing a de­bate with ex­perts and MPs to raise AI xrisk aware­ness: a pos­si­ble blueprint

otto.barten19 Apr 2023 11:45 UTC
8 points
0 comments4 min readLW link
(forum.effectivealtruism.org)

The Cruel Trade-Off Between AI Mi­suse and AI X-risk Concerns

simeon_c22 Apr 2023 13:49 UTC
24 points
1 comment2 min readLW link

The Se­cu­rity Mind­set, S-Risk and Pub­lish­ing Pro­saic Align­ment Research

lukemarks22 Apr 2023 14:36 UTC
39 points
7 comments5 min readLW link

Paths to failure

25 Apr 2023 8:03 UTC
29 points
1 comment8 min readLW link

An­nounc­ing #AISum­mitTalks fea­tur­ing Pro­fes­sor Stu­art Rus­sell and many others

otto.barten24 Oct 2023 10:11 UTC
17 points
1 comment1 min readLW link

Sanc­tu­ary for Humans

nikola27 Oct 2023 18:08 UTC
21 points
9 comments1 min readLW link

5 psy­cholog­i­cal rea­sons for dis­miss­ing x-risks from AGI

Igor Ivanov26 Oct 2023 17:21 UTC
24 points
6 comments4 min readLW link

AI Ex­is­ten­tial Safety Fellowships

mmfli28 Oct 2023 18:07 UTC
5 points
0 comments1 min readLW link

Fo­cus on ex­is­ten­tial risk is a dis­trac­tion from the real is­sues. A false fallacy

Nik Samoylov30 Oct 2023 23:42 UTC
−19 points
11 comments2 min readLW link

On ex­clud­ing dan­ger­ous in­for­ma­tion from training

ShayBenMoshe17 Nov 2023 11:14 UTC
23 points
5 comments3 min readLW link

Ilya: The AI sci­en­tist shap­ing the world

David Varga20 Nov 2023 13:09 UTC
11 points
0 comments4 min readLW link

4. A Mo­ral Case for Evolved-Sapi­ence-Chau­vinism

RogerDearnaley24 Nov 2023 4:56 UTC
10 points
0 comments4 min readLW link

1. A Sense of Fair­ness: De­con­fus­ing Ethics

RogerDearnaley17 Nov 2023 20:55 UTC
16 points
8 comments15 min readLW link

How to Con­trol an LLM’s Be­hav­ior (why my P(DOOM) went down)

RogerDearnaley28 Nov 2023 19:56 UTC
64 points
30 comments11 min readLW link

Re­think Pri­ori­ties: Seek­ing Ex­pres­sions of In­ter­est for Spe­cial Pro­jects Next Year

kierangreig29 Nov 2023 13:59 UTC
4 points
0 comments5 min readLW link

Pre­serv­ing our her­i­tage: Build­ing a move­ment and a knowl­edge ark for cur­rent and fu­ture generations

rnk829 Nov 2023 19:20 UTC
0 points
5 comments12 min readLW link

6. The Mutable Values Prob­lem in Value Learn­ing and CEV

RogerDearnaley4 Dec 2023 18:31 UTC
12 points
0 comments49 min readLW link

FLI Pod­cast: The Precipice: Ex­is­ten­tial Risk and the Fu­ture of Hu­man­ity with Toby Ord

Palus Astra1 Apr 2020 1:02 UTC
7 points
1 comment46 min readLW link

Don’t Con­di­tion on no Catastrophes

Scott Garrabrant21 Feb 2018 21:50 UTC
37 points
7 comments2 min readLW link

Jaan Tal­linn’s Philan­thropic Pledge

jaan22 Feb 2020 10:03 UTC
78 points
1 comment1 min readLW link

Bioinfohazards

Spiracular17 Sep 2019 2:41 UTC
87 points
14 comments 18 min read LW link 2 reviews

Should ethi­cists be in­side or out­side a pro­fes­sion?

Eliezer Yudkowsky12 Dec 2018 1:40 UTC
97 points
7 comments9 min readLW link

Global in­sect de­clines: Why aren’t we all dead yet?

eukaryote1 Apr 2018 20:38 UTC
28 points
26 comments1 min readLW link

New or­ga­ni­za­tion—Fu­ture of Life In­sti­tute (FLI)

Vika14 Jun 2014 23:00 UTC
70 points
35 comments1 min readLW link

The Vuln­er­a­ble World Hy­poth­e­sis (by Bostrom)

Ben Pace6 Nov 2018 20:05 UTC
50 points
17 comments4 min readLW link
(nickbostrom.com)

AI Safety Newslet­ter #4: AI and Cy­ber­se­cu­rity, Per­sua­sive AIs, Weaponiza­tion, and Ge­offrey Hin­ton talks AI risks

2 May 2023 18:41 UTC
32 points
0 comments5 min readLW link
(newsletter.safe.ai)