Environments for killing AIs

Douglas_Reay · 17 Mar 2018 15:23 UTC
3 points
1 comment · 9 min read · LW link

You can now log in with your LW1 credentials on LW2

habryka · 17 Mar 2018 5:56 UTC
10 points
5 comments · 1 min read · LW link

CoZE 3: Empiricism

alkjash · 17 Mar 2018 4:10 UTC
20 points
4 comments · 2 min read · LW link
(radimentary.wordpress.com)

Raising funds to establish a new AI Safety charity

null · 17 Mar 2018 0:09 UTC
57 points
9 comments · 5 min read · LW link

The Epsilon Fallacy

johnswentworth · 17 Mar 2018 0:08 UTC
88 points
21 comments · 7 min read · LW link
(medium.com)

AI Summer Fellows Program

colm · 16 Mar 2018 21:57 UTC
20 points
2 comments · 1 min read · LW link

Dating like a Pro

Jacob Falkovich · 16 Mar 2018 21:09 UTC
12 points
10 comments · 1 min read · LW link
(putanumonit.com)

Rigged reward learning

Stuart_Armstrong · 16 Mar 2018 15:39 UTC
1 point
0 comments · 2 min read · LW link

Defect or Cooperate

Douglas_Reay · 16 Mar 2018 14:12 UTC
4 points
5 comments · 6 min read · LW link

Design 3: Intentionality

alkjash · 16 Mar 2018 4:30 UTC
21 points
12 comments · 3 min read · LW link
(radimentary.wordpress.com)

Cryptography/Software Engineering Problem: How to make LW 1.0 logins work on LW 2.0

habryka · 16 Mar 2018 4:01 UTC
8 points
17 comments · 2 min read · LW link

The Costly Coordination Mechanism of Common Knowledge

Ben Pace · 15 Mar 2018 20:20 UTC
198 points
31 comments · 20 min read · LW link · 2 reviews

Using lying to detect human values

Stuart_Armstrong · 15 Mar 2018 11:41 UTC
0 points
0 comments · 1 min read · LW link
(www.lesserwrong.com)

Using lying to detect human values

Stuart_Armstrong · 15 Mar 2018 11:37 UTC
19 points
6 comments · 1 min read · LW link

Upcoming stability of values

Stuart_Armstrong · 15 Mar 2018 11:36 UTC
15 points
15 comments · 2 min read · LW link

Values determined by “stopping” properties

Stuart_Armstrong · 15 Mar 2018 10:53 UTC
12 points
16 comments · 3 min read · LW link

Don’t put all your eggs in one basket

Douglas_Reay · 15 Mar 2018 8:07 UTC
5 points
1 comment · 7 min read · LW link

TAPs 3: Reductionism

alkjash · 15 Mar 2018 5:20 UTC
23 points
8 comments · 2 min read · LW link
(radimentary.wordpress.com)

On Dualities

Chris_Leong · 15 Mar 2018 2:10 UTC
2 points
10 comments · 3 min read · LW link

A Concrete Multi-Step Variant of Double Crux I Have Used Successfully

sapphire · 15 Mar 2018 1:26 UTC
16 points
4 comments · 2 min read · LW link

Shadow

whpearson · 14 Mar 2018 21:13 UTC
−1 points
6 comments · 1 min read · LW link

The Building Blocks of Interpretability

Ben Pace · 14 Mar 2018 20:42 UTC
8 points
1 comment · 1 min read · LW link

LW Update 3/14 – Community, Markdown and More

Raemon · 14 Mar 2018 18:29 UTC
11 points
6 comments · 2 min read · LW link

Expertise Exchange

ChristianKl · 14 Mar 2018 18:04 UTC
21 points
23 comments · 1 min read · LW link

Optimum number of single points of failure

Douglas_Reay · 14 Mar 2018 13:30 UTC
7 points
5 comments · 5 min read · LW link

New Paper Expanding on the Goodhart Taxonomy

Scott Garrabrant · 14 Mar 2018 9:01 UTC
17 points
4 comments · 1 min read · LW link
(arxiv.org)

Strengthening the foundations under the Overton Window without moving it

KatjaGrace · 14 Mar 2018 2:20 UTC
12 points
7 comments · 3 min read · LW link
(meteuphoric.wordpress.com)

Large Mammal BPF Prize Winning Announcement

JohnGreer · 13 Mar 2018 23:48 UTC
3 points
0 comments · 1 min read · LW link
(www.brainpreservation.org)

Request for “Tests” for the MIRI Research Guide

Hazard · 13 Mar 2018 23:22 UTC
28 points
14 comments · 1 min read · LW link

Caring less

eukaryote · 13 Mar 2018 22:53 UTC
72 points
24 comments · 4 min read · LW link · 3 reviews

Looking and the no-self

ChristianKl · 13 Mar 2018 19:39 UTC
16 points
17 comments · 1 min read · LW link

Yoda Timers 3: Speed

alkjash · 13 Mar 2018 18:00 UTC
20 points
12 comments · 2 min read · LW link
(radimentary.wordpress.com)

A Developmental Framework for Rationality

lifelonglearner · 13 Mar 2018 1:36 UTC
23 points
9 comments · 9 min read · LW link

Bug Hunt 3

alkjash · 13 Mar 2018 0:20 UTC
26 points
13 comments · 3 min read · LW link
(radimentary.wordpress.com)

Appropriateness of Discussing Rationalist Discourse of a Political Nature on LW?

Evan_Gaensbauer · 12 Mar 2018 23:21 UTC
13 points
24 comments · 1 min read · LW link

Avoiding AI Races Through Self-Regulation

Gordon Seidoh Worley · 12 Mar 2018 20:53 UTC
7 points
2 comments · 8 min read · LW link
(mapandterritory.org)

AI Alignment Prize: Round 2 due March 31, 2018

Zvi · 12 Mar 2018 12:10 UTC
28 points
2 comments · 3 min read · LW link
(thezvi.wordpress.com)

Multiplicity of “enlightenment” states and contemplative practices

Wei Dai · 12 Mar 2018 8:15 UTC
46 points
4 comments · 2 min read · LW link

Should we remove markdown parsing from the comment editor?

habryka · 12 Mar 2018 5:00 UTC
9 points
14 comments · 1 min read · LW link

A Taxonomy of Weirdness

Evan Clark · 12 Mar 2018 2:33 UTC
6 points
5 comments · 4 min read · LW link

ESPR 2018 Applications Are Open!

lifelonglearner · 12 Mar 2018 0:02 UTC
2 points
0 comments · 1 min read · LW link

Leaving beta: Voting on moving to LessWrong.com

Vaniver · 11 Mar 2018 23:40 UTC
10 points
38 comments · 2 min read · LW link

Leaving beta: Voting on moving to LessWrong.com

Vaniver · 11 Mar 2018 22:53 UTC
57 points
65 comments · 2 min read · LW link

Editor Mini-Guide

Ben Pace · 11 Mar 2018 20:58 UTC
22 points
62 comments · 2 min read · LW link

Murphy’s Quest Postmortem

alkjash · 11 Mar 2018 20:10 UTC
28 points
10 comments · 6 min read · LW link
(radimentary.wordpress.com)

ESPR 2018 Applications Are Open

lifelonglearner · 11 Mar 2018 20:07 UTC
9 points
4 comments · 1 min read · LW link

Types of Confusion Experiences

Hazard · 11 Mar 2018 14:32 UTC
13 points
0 comments · 2 min read · LW link

Murphy’s Quest Ch 13: Existential Risk

alkjash · 11 Mar 2018 7:10 UTC
21 points
5 comments · 2 min read · LW link
(radimentary.wordpress.com)

Kegan and Cultivating Compassion

lifelonglearner · 11 Mar 2018 1:32 UTC
18 points
4 comments · 6 min read · LW link

Misery Pits

Alicorn · 10 Mar 2018 23:50 UTC
47 points
23 comments · 2 min read · LW link