Power Buys You Distance From The Crime

Elizabeth · 2 Aug 2019 20:50 UTC
208 points
75 comments · 7 min read · LW link · 1 review
(acesounderglass.com)

Why Subagents?

johnswentworth · 1 Aug 2019 22:17 UTC
174 points
48 comments · 7 min read · LW link · 1 review

The Commitment Races problem

Daniel Kokotajlo · 23 Aug 2019 1:58 UTC
152 points
56 comments · 5 min read · LW link

Soft takeoff can still lead to decisive strategic advantage

Daniel Kokotajlo · 23 Aug 2019 16:39 UTC
122 points
47 comments · 8 min read · LW link · 4 reviews

Subagents, trauma and rationality

Kaj_Sotala · 14 Aug 2019 13:14 UTC
111 points
4 comments · 19 min read · LW link

Trauma, Meditation, and a Cool Scar

Logan Riggs · 6 Aug 2019 16:17 UTC
102 points
17 comments · 5 min read · LW link · 1 review

[Question] Can we really prevent all warming for less than 10B$ with the mostly side-effect free geoengineering technique of Marine Cloud Brightening?

mako yass · 5 Aug 2019 0:12 UTC
94 points
55 comments · 2 min read · LW link

Partial summary of debate with Benquo and Jessicata [pt 1]

Raemon · 14 Aug 2019 20:02 UTC
89 points
63 comments · 22 min read · LW link · 3 reviews

Subagents, neural Turing machines, thought selection, and blindspots

Kaj_Sotala · 6 Aug 2019 21:15 UTC
87 points
3 comments · 12 min read · LW link

Troll Bridge

abramdemski · 23 Aug 2019 18:36 UTC
86 points
59 comments · 12 min read · LW link

2-D Robustness

Vlad Mikulik · 30 Aug 2019 20:27 UTC
85 points
8 comments · 2 min read · LW link

Clarifying some key hypotheses in AI alignment

15 Aug 2019 21:29 UTC
79 points
12 comments · 9 min read · LW link

Problems in AI Alignment that philosophers could potentially contribute to

Wei Dai · 17 Aug 2019 17:38 UTC
78 points
14 comments · 2 min read · LW link

Markets are Universal for Logical Induction

johnswentworth · 22 Aug 2019 6:44 UTC
75 points
2 comments · 5 min read · LW link

Classifying specification problems as variants of Goodhart’s Law

Vika · 19 Aug 2019 20:40 UTC
72 points
5 comments · 5 min read · LW link · 1 review

Six AI Risk/Strategy Ideas

Wei Dai · 27 Aug 2019 0:40 UTC
69 points
17 comments · 4 min read · LW link · 1 review

[Question] Does Agent-like Behavior Imply Agent-like Architecture?

Scott Garrabrant · 23 Aug 2019 2:01 UTC
66 points
8 comments · 1 min read · LW link

Response to Glen Weyl on Technocracy and the Rationalist Community

John_Maxwell · 22 Aug 2019 23:14 UTC
66 points
9 comments · 10 min read · LW link

[Question] Why so much variance in human intelligence?

Ben Pace · 22 Aug 2019 22:36 UTC
65 points
28 comments · 4 min read · LW link

Book Review: Secular Cycles

Scott Alexander · 13 Aug 2019 4:10 UTC
62 points
10 comments · 16 min read · LW link · 1 review
(slatestarcodex.com)

Dual Wielding

Zvi · 27 Aug 2019 14:10 UTC
60 points
23 comments · 2 min read · LW link · 3 reviews
(thezvi.wordpress.com)

How to Make Billions of Dollars Reducing Loneliness

John_Maxwell · 30 Aug 2019 17:30 UTC
60 points
32 comments · 7 min read · LW link

Schelling Categories, and Simple Membership Tests

Zack_M_Davis · 26 Aug 2019 2:43 UTC
58 points
10 comments · 8 min read · LW link

Tabooing ‘Agent’ for Prosaic Alignment

Hjalmar_Wijk · 23 Aug 2019 2:55 UTC
57 points
10 comments · 6 min read · LW link

Actually updating

SaraHax · 23 Aug 2019 17:46 UTC
56 points
10 comments · 4 min read · LW link

Intentional Bucket Errors

Scott Garrabrant · 22 Aug 2019 20:02 UTC
55 points
6 comments · 3 min read · LW link

Permissions in Governance

sarahconstantin · 2 Aug 2019 19:50 UTC
53 points
12 comments · 8 min read · LW link
(srconstantin.wordpress.com)

A Personal Rationality Wishlist

DanielFilan · 27 Aug 2019 3:40 UTC
53 points
54 comments · 4 min read · LW link
(danielfilan.com)

Computational Model: Causal Diagrams with Symmetry

johnswentworth · 22 Aug 2019 17:54 UTC
53 points
29 comments · 4 min read · LW link

AI Forecasting Dictionary (Forecasting infrastructure, part 1)

8 Aug 2019 16:10 UTC
50 points
0 comments · 5 min read · LW link

Vaniver’s View on Factored Cognition

Vaniver · 23 Aug 2019 2:54 UTC
48 points
4 comments · 8 min read · LW link

Status 451 on Diagnosis: Russell Aphasia

Zack_M_Davis · 6 Aug 2019 4:43 UTC
48 points
1 comment · 1 min read · LW link
(status451.com)

September Bragging Thread

Raemon · 30 Aug 2019 21:58 UTC
47 points
12 comments · 1 min read · LW link

Towards a mechanistic understanding of corrigibility

evhub · 22 Aug 2019 23:20 UTC
47 points
26 comments · 4 min read · LW link

[Question] How Can People Evaluate Complex Questions Consistently?

Elizabeth · 26 Aug 2019 20:33 UTC
46 points
12 comments · 1 min read · LW link

[Link] Book Review: Reframing Superintelligence (SSC)

ioannes · 28 Aug 2019 22:57 UTC
46 points
9 comments · 2 min read · LW link

New paper: Corrigibility with Utility Preservation

Koen.Holtman · 6 Aug 2019 19:04 UTC
44 points
11 comments · 2 min read · LW link

Zeno walks into a bar

lsusr · 4 Aug 2019 7:00 UTC
43 points
4 comments · 2 min read · LW link

Embedded Agency via Abstraction

johnswentworth · 26 Aug 2019 23:03 UTC
42 points
20 comments · 11 min read · LW link

My recommendations for gratitude exercises

MaxCarpendale · 5 Aug 2019 19:04 UTC
40 points
3 comments · 5 min read · LW link

The Missing Math of Map-Making

johnswentworth · 28 Aug 2019 21:18 UTC
40 points
8 comments · 2 min read · LW link

Cephaloponderings

Jacob Falkovich · 4 Aug 2019 16:45 UTC
39 points
4 comments · 7 min read · LW link

Call for contributors to the Alignment Newsletter

Rohin Shah · 21 Aug 2019 18:21 UTC
39 points
0 comments · 4 min read · LW link

LW Team Updates—September 2019

Ruby · 29 Aug 2019 22:12 UTC
39 points
13 comments · 2 min read · LW link

Epistemic Spot Check: The Fate of Rome (Kyle Harper)

Elizabeth · 24 Aug 2019 21:40 UTC
39 points
3 comments · 5 min read · LW link
(acesounderglass.com)

Unstriving

Jacob Falkovich · 19 Aug 2019 14:31 UTC
38 points
7 comments · 6 min read · LW link

Diana Fleischman and Geoffrey Miller—Audience Q&A

Jacob Falkovich · 10 Aug 2019 22:37 UTC
38 points
6 comments · 9 min read · LW link

Optimization Provenance

Adele Lopez · 23 Aug 2019 20:08 UTC
38 points
5 comments · 5 min read · LW link

Mistake Versus Conflict Theory of Against Billionaire Philanthropy

Zvi · 1 Aug 2019 13:10 UTC
36 points
34 comments · 3 min read · LW link
(thezvi.wordpress.com)

Verification and Transparency

DanielFilan · 8 Aug 2019 1:50 UTC
35 points
6 comments · 2 min read · LW link
(danielfilan.com)