Secondary Stressors and Tactile Ambition

lionhearted (Sebastian Marshall) 13 Jul 2018 0:26 UTC
16 points
16 comments · 4 min read · LW link

A Sarno-Hanson Synthesis

moridinamael 12 Jul 2018 16:13 UTC
52 points
15 comments · 4 min read · LW link

Probability is a model, frequency is an observation: Why both halfers and thirders are correct in the Sleeping Beauty problem.

Shmi 12 Jul 2018 6:52 UTC
26 points
34 comments · 2 min read · LW link

What does the stock market tell us about AI timelines?

Tobias_Baumann 12 Jul 2018 6:05 UTC
6 points
5 comments · 1 min read · LW link
(s-risks.org)

An Agent is a Worldline in Tegmark V

komponisto 12 Jul 2018 5:12 UTC
24 points
12 comments · 2 min read · LW link

Washington, D.C.: What If

RobinZ 12 Jul 2018 4:30 UTC
9 points
0 comments · 1 min read · LW link

Are pre-specified utility functions about the real world possible in principle?

mlogan 11 Jul 2018 18:46 UTC
24 points
7 comments · 4 min read · LW link

Melatonin: Much More Than You Wanted To Know

Scott Alexander 11 Jul 2018 17:40 UTC
120 points
16 comments · 15 min read · LW link
(slatestarcodex.com)

Monk Treehouse: some problems defining simulation

dranorter 11 Jul 2018 7:35 UTC
6 points
1 comment · 5 min read · LW link

Mathematical Mindset

komponisto 11 Jul 2018 3:03 UTC
54 points
5 comments · 2 min read · LW link

Decision-theoretic problems and Theories; An (Incomplete) comparative list

somervta 11 Jul 2018 2:59 UTC
36 points
0 comments · 1 min read · LW link
(docs.google.com)

Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet

steven0461 11 Jul 2018 2:59 UTC
28 points
11 comments · 1 min read · LW link

On the Role of Counterfactuals in Learning

Max Kanwal 11 Jul 2018 2:45 UTC
11 points
2 comments · 3 min read · LW link

Clarifying Consequentialists in the Solomonoff Prior

Vlad Mikulik 11 Jul 2018 2:35 UTC
20 points
16 comments · 6 min read · LW link

Complete Class: Consequentialist Foundations

abramdemski 11 Jul 2018 1:57 UTC
58 points
35 comments · 13 min read · LW link

Conditions under which misaligned subagents can (not) arise in classifiers

anon1 11 Jul 2018 1:52 UTC
12 points
2 comments · 2 min read · LW link

No, I won’t go there, it feels like you’re trying to Pascal-mug me

Rupert 11 Jul 2018 1:37 UTC
9 points
0 comments · 2 min read · LW link

Conceptual problems with utility functions

Dacyn 11 Jul 2018 1:29 UTC
22 points
12 comments · 2 min read · LW link

Dependent Type Theory and Zero-Shot Reasoning

evhub 11 Jul 2018 1:16 UTC
27 points
3 comments · 5 min read · LW link

A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment

AlexMennen 11 Jul 2018 1:03 UTC
40 points
1 comment · 1 min read · LW link

Bounding Goodhart’s Law

eric_langlois 11 Jul 2018 0:46 UTC
43 points
2 comments · 5 min read · LW link

Mechanistic Transparency for Machine Learning

DanielFilan 11 Jul 2018 0:34 UTC
54 points
9 comments · 4 min read · LW link

An environment for studying counterfactuals

Nisan 11 Jul 2018 0:14 UTC
15 points
6 comments · 3 min read · LW link

A universal score for optimizers

levin 10 Jul 2018 23:52 UTC
15 points
8 comments · 3 min read · LW link

Bayesian Probability is for things that are Space-like Separated from You

Scott Garrabrant 10 Jul 2018 23:47 UTC
86 points
22 comments · 2 min read · LW link

Alignment problems for economists

Chris van Merwijk 10 Jul 2018 23:43 UTC
5 points
2 comments · 2 min read · LW link

Non-resolve as Resolve

Linda Linsefors 10 Jul 2018 23:31 UTC
15 points
1 comment · 2 min read · LW link

A framework for thinking about wireheading

theotherotheralex 10 Jul 2018 23:14 UTC
15 points
4 comments · 1 min read · LW link

Logical Uncertainty and Functional Decision Theory

swordsintoploughshares 10 Jul 2018 23:08 UTC
15 points
4 comments · 2 min read · LW link

Repeated (and improved) Sleeping Beauty problem

Linda Linsefors 10 Jul 2018 22:32 UTC
12 points
5 comments · 2 min read · LW link

Probability is fake, frequency is real

Linda Linsefors 10 Jul 2018 22:32 UTC
12 points
7 comments · 1 min read · LW link

Conditioning, Counterfactuals, Exploration, and Gears

Diffractor 10 Jul 2018 22:11 UTC
28 points
1 comment · 5 min read · LW link

Two agents can have the same source code and optimise different utility functions

Joar Skalse 10 Jul 2018 21:51 UTC
11 points
11 comments · 1 min read · LW link

The Intentional Agency Experiment

Alexander Gietelink Oldenziel 10 Jul 2018 20:32 UTC
13 points
5 comments · 3 min read · LW link

Announcing AlignmentForum.org Beta

Raemon 10 Jul 2018 20:19 UTC
68 points
35 comments · 2 min read · LW link

Choosing to Choose?

Whispermute 10 Jul 2018 20:15 UTC
10 points
7 comments · 5 min read · LW link

Study on what makes people approve or condemn mind upload technology; references LW

Kaj_Sotala 10 Jul 2018 17:14 UTC
22 points
0 comments · 2 min read · LW link
(www.nature.com)

How to parent more predictably

jefftk 10 Jul 2018 15:18 UTC
78 points
1 comment · 4 min read · LW link

Open Thread July 2018

null 10 Jul 2018 14:51 UTC
10 points
9 comments · 1 min read · LW link

Three anchorings: number, attitude, and taste

Stuart_Armstrong 10 Jul 2018 14:21 UTC
14 points
4 comments · 2 min read · LW link

The Dilemma of Worse Than Death Scenarios

arkaeik 10 Jul 2018 9:18 UTC
14 points
18 comments · 4 min read · LW link

Newcomb’s Problem In One Paragraph

Chris_Leong 10 Jul 2018 7:10 UTC
7 points
0 comments · 1 min read · LW link

Letting Go III: Unilateral or GTFO

johnswentworth 10 Jul 2018 6:26 UTC
21 points
3 comments · 2 min read · LW link

Syd­ney Ra­tion­al­ity Dojo—December

Next 10 Jul 2018 4:22 UTC
1 point
0 comments · 1 min read · LW link

Syd­ney Ra­tion­al­ity Dojo—November

Next 10 Jul 2018 4:20 UTC
1 point
0 comments · 1 min read · LW link

Syd­ney Ra­tion­al­ity Dojo—October

Next 10 Jul 2018 4:19 UTC
1 point
0 comments · 1 min read · LW link

Syd­ney Ra­tion­al­ity Dojo—September

Next 10 Jul 2018 4:12 UTC
1 point
0 comments · 1 min read · LW link

Syd­ney Ra­tion­al­ity Dojo—August

Next 10 Jul 2018 4:04 UTC
1 point
0 comments · 1 min read · LW link

Context Windows: A Model of Unproductive Disagreement

Zachary Jacobi 10 Jul 2018 1:40 UTC
4 points
2 comments · 5 min read · LW link

Fundamentals of Formalisation Level 5: Formal Proof

philip_b 9 Jul 2018 20:55 UTC
13 points
0 comments · 1 min read · LW link