How safe “safe” AI development?

Gordon Seidoh Worley · Feb 28, 2018, 11:21 PM
9 points
1 comment · 1 min read · LW link

Beyond algorithmic equivalence: self-modelling

Stuart_Armstrong · Feb 28, 2018, 4:55 PM
10 points
3 comments · 1 min read · LW link

Beyond algorithmic equivalence: algorithmic noise

Stuart_Armstrong · Feb 28, 2018, 4:55 PM
10 points
4 comments · 2 min read · LW link

Using the universal prior for logical uncertainty (retracted)

cousin_it · Feb 28, 2018, 1:07 PM
15 points
13 comments · 2 min read · LW link

2/27/08 Update – Frontpage 3.0

Raemon · Feb 28, 2018, 6:26 AM
15 points
21 comments · 1 min read · LW link

TDT for Humans

alkjash · Feb 28, 2018, 5:40 AM
26 points
7 comments · 5 min read · LW link
(radimentary.wordpress.com)

Set Up for Success: Insights from ‘Naïve Set Theory’

TurnTrout · Feb 28, 2018, 2:01 AM
31 points
40 comments · 3 min read · LW link

Intuition should be applied at the lowest possible level

Rafael Harth · Feb 27, 2018, 10:58 PM
10 points
9 comments · 1 min read · LW link

The sad state of Rationality Zürich—Effective Altruism Zürich included

roland · Feb 27, 2018, 2:51 PM
−8 points
50 comments · 3 min read · LW link

The worst trolley problem in the world

CronoDAS · Feb 27, 2018, 3:56 AM
1 point
1 comment · 1 min read · LW link

Categories of Sacredness

Zvi · Feb 27, 2018, 2:00 AM
26 points
35 comments · 8 min read · LW link
(thezvi.wordpress.com)

More on the Linear Utility Hypothesis and the Leverage Prior

AlexMennen · Feb 26, 2018, 11:53 PM
16 points
4 comments · 9 min read · LW link

Goal Factoring

alkjash · Feb 26, 2018, 11:30 PM
27 points
4 comments · 2 min read · LW link
(radimentary.wordpress.com)

Inconvenience Is Qualitatively Bad

Alicorn · Feb 26, 2018, 11:27 PM
83 points
52 comments · 2 min read · LW link

The Hamming Problem of Group Rationality

PDV · Feb 26, 2018, 6:59 PM
6 points
36 comments · 1 min read · LW link

Focusing

alkjash · Feb 26, 2018, 6:10 AM
20 points
22 comments · 3 min read · LW link
(radimentary.wordpress.com)

Mapping the Archipelago

alkjash · Feb 26, 2018, 5:09 AM
14 points
24 comments · 1 min read · LW link

Experimental Open Threads

Chris_Leong · Feb 26, 2018, 3:13 AM
22 points
5 comments · 1 min read · LW link

Walkthrough of ‘Formalizing Convergent Instrumental Goals’

TurnTrout · Feb 26, 2018, 2:20 AM
13 points
2 comments · 10 min read · LW link

Will AI See Sudden Progress?

KatjaGrace · Feb 26, 2018, 12:41 AM
27 points
11 comments · 1 min read · LW link · 1 review

Self-regulation of safety in AI research

Gordon Seidoh Worley · Feb 25, 2018, 11:17 PM
12 points
6 comments · 2 min read · LW link

The abruptness of nuclear weapons

paulfchristiano · Feb 25, 2018, 5:40 PM
47 points
35 comments · 2 min read · LW link

Likelihood of discontinuous progress around the development of AGI

vedevazz · Feb 25, 2018, 3:13 PM
4 points
2 comments · LW link
(aiimpacts.org)

Open-Source Monasticism

Nathan Rosquist · Feb 25, 2018, 1:52 PM
26 points
7 comments · 4 min read · LW link

Passing Troll Bridge

Diffractor · Feb 25, 2018, 8:21 AM
11 points
2 comments · 10 min read · LW link

Three Miniatures

alkjash · Feb 25, 2018, 5:40 AM
22 points
13 comments · 3 min read · LW link
(radimentary.wordpress.com)

Arguments about fast takeoff

paulfchristiano · Feb 25, 2018, 4:53 AM
97 points
68 comments · 2 min read · LW link · 1 review
(sideways-view.com)

Meta-tations on Moderation: Towards Public Archipelago

Raemon · Feb 25, 2018, 3:59 AM
81 points
176 comments · 23 min read · LW link

Lessons from the Cold War on Information Hazards: Why Internal Communication is Critical

Gentzel · Feb 24, 2018, 11:34 PM
47 points
10 comments · 4 min read · LW link

What we talk about when we talk about maximising utility

Richard_Ngo · Feb 24, 2018, 10:33 PM
14 points
18 comments · 4 min read · LW link

Links with underscores

ShardPhoenix · Feb 24, 2018, 11:32 AM
2 points
3 comments · 1 min read · LW link

A useful level distinction

Charlie Steiner · Feb 24, 2018, 6:39 AM
8 points
4 comments · 2 min read · LW link

CoZE 2

alkjash · Feb 24, 2018, 5:40 AM
16 points
7 comments · 2 min read · LW link
(radimentary.wordpress.com)

On Building Theories of History

Samo Burja · Feb 23, 2018, 11:40 PM
30 points
20 comments · 5 min read · LW link

Mythic Mode

Valentine · Feb 23, 2018, 10:45 PM
68 points
82 comments · 9 min read · LW link

The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

Gordon Seidoh Worley · Feb 23, 2018, 9:42 PM
5 points
8 comments · LW link
(arxiv.org)

Two types of mathematician

drossbucket · Feb 23, 2018, 7:26 PM
64 points
41 comments · 4 min read · LW link

June 2012: 0/33 Turing Award winners predict computers beating humans at go within next 10 years.

betterthanwell · Feb 23, 2018, 11:25 AM
18 points
13 comments · 2 min read · LW link

Design 2

alkjash · Feb 23, 2018, 6:20 AM
18 points
17 comments · 3 min read · LW link
(radimentary.wordpress.com)

AI Alignment and Phenomenal Consciousness

Gordon Seidoh Worley · Feb 23, 2018, 1:21 AM
9 points
0 comments · 6 min read · LW link
(mapandterritory.org)

Explanation vs Rationalization

abramdemski · Feb 22, 2018, 11:46 PM
16 points
11 comments · 4 min read · LW link

The map has gears. They don’t always turn.

abramdemski · Feb 22, 2018, 8:16 PM
24 points
0 comments · 1 min read · LW link

The Intelligent Social Web

Valentine · Feb 22, 2018, 6:55 PM
230 points
112 comments · 12 min read · LW link · 2 reviews

The Three Stages Of Model Development

katerinjo · Feb 22, 2018, 2:33 PM
17 points
7 comments · 2 min read · LW link

Pain, fear, sex, and higher order preferences

Stuart_Armstrong · Feb 22, 2018, 11:30 AM
5 points
3 comments · 1 min read · LW link

TAPs 2

alkjash · Feb 22, 2018, 5:10 AM
25 points
6 comments · 3 min read · LW link
(radimentary.wordpress.com)

Robustness to Scale

Scott Garrabrant · Feb 21, 2018, 10:55 PM
130 points
23 comments · 2 min read · LW link · 1 review

Don’t Condition on no Catastrophes

Scott Garrabrant · Feb 21, 2018, 9:50 PM
37 points
7 comments · 2 min read · LW link

The Logic of Science: 2.2

mpr · Feb 21, 2018, 5:28 PM
9 points
3 comments · 1 min read · LW link
(pulsarcoffee.com)

Yoda Timers 2

alkjash · Feb 21, 2018, 7:40 AM
28 points
27 comments · 3 min read · LW link
(radimentary.wordpress.com)