Naive Hypotheses on AI Alignment

Shoshannah Tekofsky · Jul 2, 2022, 7:03 PM
98 points
29 comments · 5 min read · LW link

The Tree of Life: Stanford AI Alignment Theory of Change

Gabe M · Jul 2, 2022, 6:36 PM
25 points
0 comments · 14 min read · LW link

Follow along with Columbia EA’s Advanced AI Safety Fellowship!

RohanS · Jul 2, 2022, 5:45 PM
3 points
0 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Welcome to Analogia! (Chapter 7)

Justin Bullock · Jul 2, 2022, 5:04 PM
5 points
0 comments · 11 min read · LW link

[Question] What about transhumans and beyond?

AlignmentMirror · Jul 2, 2022, 1:58 PM
7 points
6 comments · 1 min read · LW link

Goal-directedness: tackling complexity

Morgan_Rogers · Jul 2, 2022, 1:51 PM
8 points
0 comments · 38 min read · LW link

Literature recommendations July 2022

ChristianKl · Jul 2, 2022, 9:14 AM
17 points
9 comments · 1 min read · LW link

Deontological Evil

lsusr · Jul 2, 2022, 6:57 AM
44 points
4 comments · 2 min read · LW link

Could an AI Alignment Sandbox be useful?

Michael Soareverix · Jul 2, 2022, 5:06 AM
2 points
1 comment · 1 min read · LW link

Five views of Bayes’ Theorem

Adam Scherlis · Jul 2, 2022, 2:25 AM
38 points
4 comments · 1 min read · LW link

[Linkpost] Existential Risk Analysis in Empirical Research Papers

Dan H · Jul 2, 2022, 12:09 AM
40 points
0 comments · 1 min read · LW link
(arxiv.org)

Agenty AGI – How Tempting?

PeterMcCluskey · Jul 1, 2022, 11:40 PM
22 points
3 comments · 5 min read · LW link
(www.bayesianinvestor.com)

AXRP Episode 16 - Preparing for Debate AI with Geoffrey Irving

DanielFilan · Jul 1, 2022, 10:20 PM
20 points
0 comments · 37 min read · LW link

[Question] Examples of practical implications of Judea Pearl’s Causality work

ChristianKl · Jul 1, 2022, 8:58 PM
23 points
6 comments · 1 min read · LW link

Minerva

Algon · Jul 1, 2022, 8:06 PM
36 points
6 comments · 2 min read · LW link
(ai.googleblog.com)

Disarming status

sano · Jul 1, 2022, 8:00 PM
−4 points
1 comment · 6 min read · LW link

Paper: Forecasting world events with neural nets

Jul 1, 2022, 7:40 PM
39 points
3 comments · 4 min read · LW link

Reframing the AI Risk

Thane Ruthenis · Jul 1, 2022, 6:44 PM
26 points
7 comments · 6 min read · LW link

Who is this MSRayne person anyway?

MSRayne · Jul 1, 2022, 5:32 PM
32 points
30 comments · 11 min read · LW link

Limerence Messes Up Your Rationality Real Bad, Yo

Raemon · Jul 1, 2022, 4:53 PM
128 points
42 comments · 3 min read · LW link · 2 reviews

[Link] On the paradox of tolerance in relation to fascism and online content moderation – Unstable Ontology

Kenny · Jul 1, 2022, 4:43 PM
5 points
0 comments · 1 min read · LW link

Trends in GPU price-performance

Jul 1, 2022, 3:51 PM
85 points
13 comments · 1 min read · LW link · 1 review
(epochai.org)

[Question] How to deal with non-schedulable one-off stimulus-response-pair-like situations when planning/organising projects?

mikbp · Jul 1, 2022, 3:22 PM
2 points
3 comments · 1 min read · LW link

What Is The True Name of Modularity?

Jul 1, 2022, 2:55 PM
39 points
10 comments · 12 min read · LW link

Defining Optimization in a Deeper Way Part 1

J Bostock · Jul 1, 2022, 2:03 PM
7 points
0 comments · 2 min read · LW link

Safetywashing

Adam Scholl · Jul 1, 2022, 11:56 AM
260 points
20 comments · 1 min read · LW link · 2 reviews

[Question] AGI alignment with what?

AlignmentMirror · Jul 1, 2022, 10:22 AM
6 points
10 comments · 1 min read · LW link

Open & Welcome Thread—July 2022

Kaj_Sotala · Jul 1, 2022, 7:47 AM
20 points
61 comments · 1 min read · LW link

[Question] What is the contrast to counterfactual reasoning?

Dominic Roser · Jul 1, 2022, 7:39 AM
5 points
10 comments · 1 min read · LW link

Meiosis is all you need

Metacelsus · Jul 1, 2022, 7:39 AM
41 points
3 comments · 2 min read · LW link
(denovo.substack.com)

[Question] How to Navigate Evaluating Politicized Research?

Davis_Kingsley · Jul 1, 2022, 5:59 AM
11 points
1 comment · 1 min read · LW link

One is (almost) normal in base π

Adam Scherlis · Jul 1, 2022, 4:05 AM
14 points
0 comments · 1 min read · LW link
(adam.scherlis.com)

AI safety university groups: a promising opportunity to reduce existential risk

mic · Jul 1, 2022, 3:59 AM
14 points
0 comments · 11 min read · LW link

Looking back on my alignment PhD

TurnTrout · Jul 1, 2022, 3:19 AM
334 points
66 comments · 11 min read · LW link

Selection processes for subagents

Ryan Kidd · Jun 30, 2022, 11:57 PM
36 points
2 comments · 9 min read · LW link

[Question] Cryonics-adjacent question

Flaglandbase · Jun 30, 2022, 11:03 PM
12 points
3 comments · 1 min read · LW link

Forecasts are not enough

Ege Erdil · Jun 30, 2022, 10:00 PM
43 points
5 comments · 5 min read · LW link

Murphyjitsu: an Inner Simulator algorithm

CFAR!Duncan · Jun 30, 2022, 9:50 PM
67 points
24 comments · 11 min read · LW link · 2 reviews

GPT-3 Catching Fish in Morse Code

Megan Kinniment · Jun 30, 2022, 9:22 PM
117 points
27 comments · 8 min read · LW link

Metacognition in the Rat

Jacob Falkovich · Jun 30, 2022, 8:53 PM
19 points
0 comments · 6 min read · LW link

On viewquakes

Dalton Mabery · Jun 30, 2022, 8:08 PM
8 points
0 comments · 2 min read · LW link

The Track Record of Futurists Seems … Fine

HoldenKarnofsky · Jun 30, 2022, 7:40 PM
91 points
25 comments · 12 min read · LW link
(www.cold-takes.com)

Quick survey on AI alignment resources

frances_lorenz · Jun 30, 2022, 7:09 PM
14 points
0 comments · 1 min read · LW link

[Linkpost] Solving Quantitative Reasoning Problems with Language Models

Yitz · Jun 30, 2022, 6:58 PM
76 points
15 comments · 2 min read · LW link
(storage.googleapis.com)

Failing to fix a dangerous intersection

alyssavance · Jun 30, 2022, 6:09 PM
110 points
17 comments · 2 min read · LW link

Most Functions Have Undesirable Global Extrema

En Kepeig · Jun 30, 2022, 5:10 PM
8 points
5 comments · 3 min read · LW link

Hedonistic Isotopes:

Trozxzr · Jun 30, 2022, 4:49 PM
1 point
0 comments · 1 min read · LW link

Abadarian Trades

David Udell · Jun 30, 2022, 4:41 PM
17 points
22 comments · 2 min read · LW link

Covid 6/30/22: Vaccine Update Update

Zvi · Jun 30, 2022, 2:00 PM
32 points
6 comments · 12 min read · LW link
(thezvi.wordpress.com)

[Question] How should I talk about optimal but not subgame-optimal play?

JamesFaville · Jun 30, 2022, 1:58 PM
5 points
1 comment · 3 min read · LW link