[Question] If alignment problem was unsolvable, would that avoid doom?

Kinrany · 7 May 2023 22:13 UTC
3 points
3 comments · 1 min read · LW link

An artificially structured argument for expecting AGI ruin

Rob Bensinger · 7 May 2023 21:52 UTC
91 points
26 comments · 19 min read · LW link

Where “the Sequences” Are Wrong

Thoth Hermes · 7 May 2023 20:21 UTC
−15 points
5 comments · 14 min read · LW link
(thothhermes.substack.com)

What’s wrong with being dumb?

Adam Zerner · 7 May 2023 18:31 UTC
14 points
17 comments · 2 min read · LW link

Categories of Arguing Style: Why being good among rationalists isn’t enough to argue with everyone

Camille Berger · 7 May 2023 17:45 UTC
16 points
0 comments · 23 min read · LW link

Self-Administered Gell-Mann Amnesia

krs · 7 May 2023 17:44 UTC
1 point
1 comment · 1 min read · LW link

Understanding mesa-optimization using toy models

7 May 2023 17:00 UTC
43 points
2 comments · 10 min read · LW link

How to have Polygenically Screened Children

GeneSmith · 7 May 2023 16:01 UTC
354 points
127 comments · 27 min read · LW link

Statistical models & the irrelevance of rare exceptions

patrissimo · 7 May 2023 15:59 UTC
37 points
6 comments · 2 min read · LW link

Let’s look for coherence theorems

Valdes · 7 May 2023 14:45 UTC
25 points
18 comments · 6 min read · LW link

Graphical Representations of Paul Christiano’s Doom Model

Nathan Young · 7 May 2023 13:03 UTC
7 points
0 comments · 1 min read · LW link

An anthropomorphic AI dilemma

TsviBT · 7 May 2023 12:44 UTC
26 points
0 comments · 7 min read · LW link

Violin Supports

jefftk · 7 May 2023 12:10 UTC
12 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Properties of Good Textbooks

niplav · 7 May 2023 8:38 UTC
50 points
11 comments · 1 min read · LW link

Against sacrificing AI transparency for generality gains

Ape in the coat · 7 May 2023 6:52 UTC
4 points
0 comments · 2 min read · LW link

TED talk by Eliezer Yudkowsky: Unleashing the Power of Artificial Intelligence

bayesed · 7 May 2023 5:45 UTC
49 points
36 comments · 1 min read · LW link
(www.youtube.com)

Thinking of Convenience as an Economic Term

ozziegooen · 7 May 2023 1:21 UTC
6 points
0 comments · 12 min read · LW link
(forum.effectivealtruism.org)

Corrigibility, Much more detail than anyone wants to Read

Logan Zoellner · 7 May 2023 1:02 UTC
26 points
2 comments · 7 min read · LW link

Residual stream norms grow exponentially over the forward pass

7 May 2023 0:46 UTC
76 points
24 comments · 11 min read · LW link

On the Loebner Silver Prize (a Turing test)

hold_my_fish · 7 May 2023 0:39 UTC
18 points
2 comments · 2 min read · LW link

Time and Energy Costs to Erase a Bit

DaemonicSigil · 6 May 2023 23:29 UTC
24 points
32 comments · 7 min read · LW link

How much do you believe your results?

Eric Neyman · 6 May 2023 20:31 UTC
476 points
17 comments · 15 min read · LW link · 3 reviews
(ericneyman.wordpress.com)

Long Covid Risks: 2023 Update

Elizabeth · 6 May 2023 18:20 UTC
70 points
9 comments · 4 min read · LW link
(acesounderglass.com)

Is “red” for GPT-4 the same as “red” for you?

Yusuke Hayashi · 6 May 2023 17:55 UTC
9 points
6 comments · 2 min read · LW link

The Broader Fossil Fuel Community

Jeffrey Heninger · 6 May 2023 14:49 UTC
16 points
1 comment · 3 min read · LW link

Estimating Norovirus Prevalence

jefftk · 6 May 2023 11:40 UTC
16 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Alignment as Function Fitting

A.H. · 6 May 2023 11:38 UTC
7 points
0 comments · 12 min read · LW link

My preferred framings for reward misspecification and goal misgeneralisation

Yi-Yang · 6 May 2023 4:48 UTC
27 points
1 comment · 8 min read · LW link

You don’t need to be a genius to be in AI safety research

Claire Short · 6 May 2023 2:32 UTC
14 points
1 comment · 6 min read · LW link

Naturalist Collection

LoganStrohl · 6 May 2023 0:37 UTC
66 points
7 comments · 15 min read · LW link

Do you work at an AI lab? Please quit

Nik Samoylov · 5 May 2023 23:41 UTC
−29 points
9 comments · 1 min read · LW link

Explaining “Hell is Game Theory Folk Theorems”

electroswing · 5 May 2023 23:33 UTC
57 points
21 comments · 5 min read · LW link

Sleeping Beauty – the Death Hypothesis

Guillaume Charrier · 5 May 2023 23:32 UTC
6 points
8 comments · 5 min read · LW link

Orthogonal’s Formal-Goal Alignment theory of change

Tamsin Leake · 5 May 2023 22:36 UTC
68 points
13 comments · 4 min read · LW link
(carado.moe)

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin · 5 May 2023 20:49 UTC
19 points
16 comments · 8 min read · LW link

What Jason has been reading, May 2023: “Protopia,” complex systems, Daedalus vs. Icarus, and more

jasoncrawford · 5 May 2023 19:54 UTC
25 points
2 comments · 11 min read · LW link
(rootsofprogress.org)

CHAT Diplomacy: LLMs and National Security

JohnBuridan · 5 May 2023 19:45 UTC
25 points
6 comments · 7 min read · LW link

Linkpost for Accursed Farms Discussion / debate with AI expert Eliezer Yudkowsky

gilch · 5 May 2023 18:20 UTC
14 points
2 comments · 1 min read · LW link
(www.youtube.com)

Regulate or Compete? The China Factor in U.S. AI Policy (NAIR #2)

charles_m · 5 May 2023 17:43 UTC
2 points
1 comment · 7 min read · LW link
(navigatingairisks.substack.com)

Kingfisher Live CD Process

jefftk · 5 May 2023 17:00 UTC
13 points
0 comments · 3 min read · LW link
(www.jefftk.com)

What can we learn from Bayes about reasoning?

jasoncrawford · 5 May 2023 15:52 UTC
21 points
11 comments · 1 min read · LW link

[Question] Why not use active SETI to prevent AI Doom?

RomanS · 5 May 2023 14:41 UTC
13 points
13 comments · 1 min read · LW link

Investigating Emergent Goal-Like Behavior in Large Language Models using Experimental Economics

phelps-sg · 5 May 2023 11:15 UTC
6 points
1 comment · 4 min read · LW link

Monthly Shorts 4/23

Celer · 5 May 2023 7:20 UTC
8 points
1 comment · 3 min read · LW link
(keller.substack.com)

[Question] What is it like to be a compatibilist?

tslarm · 5 May 2023 2:56 UTC
8 points
72 comments · 1 min read · LW link

Transcript of a presentation on catastrophic risks from AI

RobertM · 5 May 2023 1:38 UTC
6 points
0 comments · 8 min read · LW link

How to get good at programming

Ulisse Mini · 5 May 2023 1:14 UTC
39 points
3 comments · 2 min read · LW link

An Update On The Campaign For AI Safety Dot Org

yanni kyriacos · 5 May 2023 0:21 UTC
−13 points
2 comments · 1 min read · LW link

A brief collection of Hinton’s recent comments on AGI risk

Kaj_Sotala · 4 May 2023 23:31 UTC
143 points
9 comments · 11 min read · LW link

Robin Hanson and I talk about AI risk

KatjaGrace · 4 May 2023 22:20 UTC
39 points
8 comments · 1 min read · LW link
(worldspiritsockpuppet.com)