Can corrigibility be learned safely?

Wei Dai, 1 Apr 2018 23:07 UTC
35 points
115 comments, 4 min read

Global insect declines: Why aren’t we all dead yet?

eukaryote, 1 Apr 2018 20:38 UTC
28 points
26 comments, 1 min read

Announcing Rational Newsletter

Alexey Lapitsky, 1 Apr 2018 14:37 UTC
10 points
9 comments, 1 min read

April Fools: Announcing: Karma 2.0

habryka, 1 Apr 2018 10:33 UTC
63 points
56 comments, 1 min read

Life hacks

Jan_Kulveit, 1 Apr 2018 10:29 UTC
4 points
0 comments, 1 min read

One-Year Anniversary Retrospective—Los Angeles

RobertM, 1 Apr 2018 6:34 UTC
12 points
4 comments, 3 min read

My take on agent foundations: formalizing metaphilosophical competence

zhukeepa, 1 Apr 2018 6:33 UTC
21 points
6 comments, 1 min read

Corrigible but misaligned: a superintelligent messiah

zhukeepa, 1 Apr 2018 6:20 UTC
28 points
26 comments, 5 min read

LW Update 3/31 - Post Highlights and Bug Fixes

Raemon, 1 Apr 2018 4:01 UTC
10 points
2 comments, 1 min read

Schelling Shifts During AI Self-Modification

MikailKhan, 1 Apr 2018 1:58 UTC
6 points
3 comments, 6 min read

Reframing misaligned AGI’s: well-intentioned non-neurotypical assistants

zhukeepa, 1 Apr 2018 1:22 UTC
46 points
14 comments, 2 min read

The Regularizing-Reducing Model

RyenKrusinga, 1 Apr 2018 1:16 UTC
3 points
6 comments, 1 min read
(drive.google.com)

Metaphilosophical competence can’t be disentangled from alignment

zhukeepa, 1 Apr 2018 0:38 UTC
46 points
39 comments, 3 min read

Belief alignment

hnowak, 1 Apr 2018 0:13 UTC
1 point
2 comments, 6 min read

A Sketch of Good Communication

Ben Pace, 31 Mar 2018 22:48 UTC
201 points
35 comments, 3 min read, 1 review

Harry Potter and the Method of Entropy 1 [LessWrong version]

habryka, 31 Mar 2018 20:38 UTC
6 points
0 comments, 3 min read

Harry Potter and the Method of Entropy

alkjash, 31 Mar 2018 20:10 UTC
11 points
12 comments, 1 min read
(radimentary.wordpress.com)

Salience

Tueskes, 31 Mar 2018 19:52 UTC
6 points
1 comment, 4 min read

Opportunities for individual donors in AI safety

Alex Flint, 31 Mar 2018 18:37 UTC
30 points
3 comments, 11 min read

Time in Machine Metaethics

Razmęk Massaräinen, 31 Mar 2018 15:02 UTC
2 points
1 comment, 6 min read

Nice Things

Zvi, 31 Mar 2018 12:30 UTC
14 points
0 comments, 2 min read
(thezvi.wordpress.com)

Reducing Agents: When abstractions break

Hazard, 31 Mar 2018 0:03 UTC
13 points
10 comments, 8 min read

Sydney Rationality Dojo—April

luminosity, 30 Mar 2018 14:18 UTC
1 point
0 comments, 1 min read

The Eternal Grind

Zvi, 30 Mar 2018 11:40 UTC
10 points
1 comment, 17 min read
(thezvi.wordpress.com)

Reward hacking and Goodhart’s law by evolutionary algorithms

Jan_Kulveit, 30 Mar 2018 7:57 UTC
18 points
5 comments, 1 min read
(arxiv.org)

Rationalist Lent is over

Qiaochu_Yuan, 30 Mar 2018 5:57 UTC
20 points
16 comments, 1 min read

Resolving human values, completely and adequately

Stuart_Armstrong, 30 Mar 2018 3:35 UTC
32 points
30 comments, 12 min read

Charting Deaths: Reality vs Reported

lifelonglearner, 30 Mar 2018 0:50 UTC
13 points
1 comment, 1 min read
(owenshen24.github.io)

Site search will be down for a few hours

habryka, 30 Mar 2018 0:43 UTC
4 points
0 comments, 1 min read

Hufflepuff Cynicism on Hypocrisy

abramdemski, 29 Mar 2018 21:01 UTC
21 points
78 comments, 5 min read

2018 Prediction Contest—Propositions Needed

jbeshir, 29 Mar 2018 15:02 UTC
7 points
6 comments, 4 min read

A framework for thinking about AI timescales

Tobias_Baumann, 29 Mar 2018 9:29 UTC
7 points
0 comments, 1 min read
(s-risks.org)

Every Implementation of You is You: An Intuition Ladder

lolbifrons, 29 Mar 2018 5:14 UTC
3 points
47 comments, 3 min read

Washington, D.C.: Meta-Meta Meetup

RobinZ, 28 Mar 2018 18:54 UTC
2 points
0 comments, 1 min read

Open-Category Classification

TurnTrout, 28 Mar 2018 14:49 UTC
14 points
6 comments, 10 min read

*Deleted*

Martin Bernstorff, 28 Mar 2018 10:22 UTC
−5 points
21 comments, 1 min read

‘Trivial Inconvenience Day’ Retrospective

namespace, 28 Mar 2018 5:14 UTC
32 points
3 comments, 6 min read

Karnofsky on forecasting and what science does

Rob Bensinger, 28 Mar 2018 1:55 UTC
14 points
0 comments, 8 min read
(80000hours.org)

The fundamental complementarity of consciousness and work

KatjaGrace, 28 Mar 2018 1:20 UTC
16 points
5 comments, 2 min read
(meteuphoric.wordpress.com)

Defining the ways human values are messy

Stuart_Armstrong, 27 Mar 2018 22:42 UTC
9 points
6 comments, 2 min read

Optimal level of hierarchy for effective altruism?

Jan_Kulveit, 27 Mar 2018 22:38 UTC
3 points
0 comments, 2 min read
(effective-altruism.com)

Learn Bayes Nets!

abramdemski, 27 Mar 2018 22:00 UTC
52 points
8 comments, 2 min read

Evaluating Existing Approaches to AGI Alignment

Gordon Seidoh Worley, 27 Mar 2018 19:57 UTC
12 points
0 comments, 4 min read
(mapandterritory.org)

The master skill of matching map and territory

Rafael Harth, 27 Mar 2018 12:06 UTC
14 points
13 comments, 1 min read

[Preprint for commenting] Digital Immortality: Theory and Protocol for Indirect Mind Uploading

avturchin, 27 Mar 2018 11:49 UTC
8 points
5 comments, 1 min read

Problems with Amplification/Distillation

Stuart_Armstrong, 27 Mar 2018 11:12 UTC
29 points
7 comments, 10 min read

GreaterWrong—several new features & enhancements

Said Achmiz, 27 Mar 2018 2:36 UTC
15 points
3 comments, 1 min read

Non-Adversarial Goodhart and AI Risks

Davidmanheim, 27 Mar 2018 1:39 UTC
22 points
11 comments, 6 min read

A Difficulty With Density-Zero Exploration

Diffractor, 27 Mar 2018 1:03 UTC
0 points
1 comment, 2 min read

My Thoughts on Takeoff Speeds

tristanm, 27 Mar 2018 0:05 UTC
11 points
2 comments, 7 min read