Partial summary of debate with Benquo and Jessicata [pt 1]

Raemon · 14 Aug 2019 20:02 UTC
89 points
63 comments · 22 min read · LW link · 3 reviews

“Designing agent incentives to avoid reward tampering”, DeepMind

gwern · 14 Aug 2019 16:57 UTC
28 points
15 comments · 1 min read · LW link
(medium.com)

Subagents, trauma and rationality

Kaj_Sotala · 14 Aug 2019 13:14 UTC
111 points
4 comments · 19 min read · LW link

Predicted AI alignment event/meeting calendar

rmoehn · 14 Aug 2019 7:14 UTC
29 points
14 comments · 1 min read · LW link

Natural laws should be explicit constraints on strategy space

ryan_b · 13 Aug 2019 20:22 UTC
8 points
6 comments · 1 min read · LW link

Distance Functions are Hard

Grue_Slinky · 13 Aug 2019 17:33 UTC
31 points
19 comments · 6 min read · LW link

Book Review: Secular Cycles

Scott Alexander · 13 Aug 2019 4:10 UTC
62 points
10 comments · 16 min read · LW link · 1 review
(slatestarcodex.com)

A Primer on Matrix Calculus, Part 1: Basic review

Matthew Barnett · 12 Aug 2019 23:44 UTC
25 points
4 comments · 7 min read · LW link

[Question] What explanatory power does Kahneman’s System 2 possess?

Richard_Ngo · 12 Aug 2019 15:23 UTC
31 points
2 comments · 1 min read · LW link

Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)

Davidmanheim · 12 Aug 2019 8:07 UTC
15 points
3 comments · 4 min read · LW link

Adjectives from the Future: The Dangers of Result-based Descriptions

Pradeep_Kumar · 11 Aug 2019 19:19 UTC
19 points
8 comments · 11 min read · LW link

[Question] Could we solve this email mess if we all moved to paid emails?

jacobjacob · 11 Aug 2019 16:31 UTC
29 points
50 comments · 4 min read · LW link

AI Safety Reading Group

Søren Elverlin · 11 Aug 2019 9:01 UTC
16 points
8 comments · 1 min read · LW link

[Question] Does human choice have to be transitive in order to be rational/consistent?

jmh · 11 Aug 2019 1:49 UTC
9 points
6 comments · 1 min read · LW link

Diana Fleischman and Geoffrey Miller - Audience Q&A

Jacob Falkovich · 10 Aug 2019 22:37 UTC
38 points
6 comments · 9 min read · LW link

Intransitive Preferences You Can’t Pump

zulupineapple · 9 Aug 2019 23:10 UTC
0 points
2 comments · 1 min read · LW link

Categorial preferences and utility functions

DavidHolmes · 9 Aug 2019 21:36 UTC
10 points
6 comments · 5 min read · LW link

[Question] What is the state of the ego depletion field?

Eli Tyre · 9 Aug 2019 20:30 UTC
27 points
10 comments · 1 min read · LW link

Why Gradients Vanish and Explode

Matthew Barnett · 9 Aug 2019 2:54 UTC
25 points
9 comments · 3 min read · LW link

AI Forecasting Dictionary (Forecasting infrastructure, part 1)

8 Aug 2019 16:10 UTC
50 points
0 comments · 5 min read · LW link

[Question] Why do humans not have built-in neural i/o channels?

Richard_Ngo · 8 Aug 2019 13:09 UTC
25 points
23 comments · 1 min read · LW link

Which of these five AI alignment research projects ideas are no good?

rmoehn · 8 Aug 2019 7:17 UTC
25 points
13 comments · 1 min read · LW link

Calibrating With Cards

lifelonglearner · 8 Aug 2019 6:44 UTC
32 points
3 comments · 3 min read · LW link

[Question] Is there a source/market for LW-related t-shirts?

jooyous · 8 Aug 2019 4:30 UTC
8 points
3 comments · 1 min read · LW link

Verification and Transparency

DanielFilan · 8 Aug 2019 1:50 UTC
35 points
6 comments · 2 min read · LW link
(danielfilan.com)

Toy model piece #2: Combining short and long range partial preferences

Stuart_Armstrong · 8 Aug 2019 0:11 UTC
14 points
0 comments · 4 min read · LW link

Four Ways An Impact Measure Could Help Alignment

Matthew Barnett · 8 Aug 2019 0:10 UTC
21 points
1 comment · 9 min read · LW link

Nashville August SSC Meetup

friedelcraftiness · 7 Aug 2019 20:11 UTC
1 point
0 comments · 1 min read · LW link

In defense of Oracle (“Tool”) AI research

Steven Byrnes · 7 Aug 2019 19:14 UTC
22 points
11 comments · 4 min read · LW link

Help forecast study replication in this social science prediction market

rosiecam · 7 Aug 2019 18:18 UTC
29 points
3 comments · 1 min read · LW link

[Question] Edit Nickname

Luigi Lotti · 7 Aug 2019 17:42 UTC
5 points
1 comment · 1 min read · LW link

Self-Supervised Learning and AGI Safety

Steven Byrnes · 7 Aug 2019 14:21 UTC
29 points
9 comments · 12 min read · LW link

Emotions are not beliefs

Chris_Leong · 7 Aug 2019 6:27 UTC
25 points
2 comments · 2 min read · LW link

Understanding Recent Impact Measures

Matthew Barnett · 7 Aug 2019 4:57 UTC
16 points
6 comments · 7 min read · LW link

[Site Update] Behind the scenes data-layer and caching improvements

habryka · 7 Aug 2019 0:49 UTC
23 points
3 comments · 1 min read · LW link

Project Proposal: Considerations for trading off capabilities and safety impacts of AI research

David Scott Krueger (formerly: capybaralet) · 6 Aug 2019 22:22 UTC
25 points
11 comments · 2 min read · LW link

Subagents, neural Turing machines, thought selection, and blindspots

Kaj_Sotala · 6 Aug 2019 21:15 UTC
87 points
3 comments · 12 min read · LW link

[Question] Percent reduction of gun-related deaths by color of gun.

Gunnar_Zarncke · 6 Aug 2019 20:28 UTC
8 points
11 comments · 1 min read · LW link

New paper: Corrigibility with Utility Preservation

Koen.Holtman · 6 Aug 2019 19:04 UTC
44 points
11 comments · 2 min read · LW link

Weak foundation of determinism analysis

aiiixiii · 6 Aug 2019 19:03 UTC
14 points
54 comments · 3 min read · LW link

Trauma, Meditation, and a Cool Scar

Logan Riggs · 6 Aug 2019 16:17 UTC
102 points
17 comments · 5 min read · LW link · 1 review

[Question] Why is the nitrogen cycle so under-emphasized compared to climate change

ChristianKl · 6 Aug 2019 9:25 UTC
15 points
4 comments · 1 min read · LW link

[Question] How would a person go about starting a geoengineering startup?

Pee Doom · 6 Aug 2019 7:34 UTC
11 points
5 comments · 1 min read · LW link

Status 451 on Diagnosis: Russell Aphasia

Zack_M_Davis · 6 Aug 2019 4:43 UTC
48 points
1 comment · 1 min read · LW link
(status451.com)

Searle’s Chinese Room and the Meaning of Meaning

Jimdrix_Hendri · 6 Aug 2019 4:09 UTC
0 points
4 comments · 2 min read · LW link

[Question] What are the best resources for examining the evidence for anthropogenic climate change?

Matthew Barnett · 6 Aug 2019 2:53 UTC
10 points
8 comments · 1 min read · LW link

A Survey of Early Impact Measures

Matthew Barnett · 6 Aug 2019 1:22 UTC
29 points
0 comments · 8 min read · LW link

Preferences as an (instinctive) stance

Stuart_Armstrong · 6 Aug 2019 0:43 UTC
18 points
4 comments · 4 min read · LW link

[Question] How to navigate through contradictory (health/fitness) advice?

Sherrinford · 5 Aug 2019 20:58 UTC
14 points
7 comments · 1 min read · LW link

My recommendations for gratitude exercises

MaxCarpendale · 5 Aug 2019 19:04 UTC
40 points
3 comments · 5 min read · LW link