16 points

16 comments · 4 min read

A Sarno-Hanson Synthesis

moridinamael · 12 Jul 2018 16:13 UTC

52 points

15 comments · 4 min read

Probability is a model, frequency is an observation: Why both halfers and thirders are correct in the Sleeping Beauty problem.

Shmi · 12 Jul 2018 6:52 UTC

26 points

34 comments · 2 min read

What does the stock market tell us about AI timelines?

Tobias_Baumann · 12 Jul 2018 6:05 UTC

6 points

5 comments · 1 min read

(s-risks.org)

An Agent is a Worldline in Tegmark V

komponisto · 12 Jul 2018 5:12 UTC

24 points

12 comments · 2 min read

Washington, D.C.: What If

RobinZ · 12 Jul 2018 4:30 UTC

9 points

0 comments · 1 min read

Are pre-specified utility functions about the real world possible in principle?

mlogan · 11 Jul 2018 18:46 UTC

24 points

7 comments · 4 min read

Melatonin: Much More Than You Wanted To Know

Scott Alexander · 11 Jul 2018 17:40 UTC

120 points

16 comments · 15 min read

(slatestarcodex.com)

Monk Treehouse: some problems defining simulation

dranorter · 11 Jul 2018 7:35 UTC

6 points

1 comment · 5 min read

Mathematical Mindset

komponisto · 11 Jul 2018 3:03 UTC

54 points

5 comments · 2 min read

Decision-theoretic problems and Theories; An (Incomplete) comparative list

somervta · 11 Jul 2018 2:59 UTC

36 points

0 comments · 1 min read

(docs.google.com)

Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet

steven0461 · 11 Jul 2018 2:59 UTC

28 points

11 comments · 1 min read

On the Role of Counterfactuals in Learning

Max Kanwal · 11 Jul 2018 2:45 UTC

11 points

2 comments · 3 min read

Clarifying Consequentialists in the Solomonoff Prior

Vlad Mikulik · 11 Jul 2018 2:35 UTC

20 points

16 comments · 6 min read

Complete Class: Consequentialist Foundations

abramdemski · 11 Jul 2018 1:57 UTC

58 points

37 comments · 13 min read

Conditions under which misaligned subagents can (not) arise in classifiers

anon1 · 11 Jul 2018 1:52 UTC

12 points

2 comments · 2 min read

No, I won’t go there, it feels like you’re trying to Pascal-mug me

Rupert · 11 Jul 2018 1:37 UTC

9 points

0 comments · 2 min read

Conceptual problems with utility functions

Dacyn · 11 Jul 2018 1:29 UTC

22 points

12 comments · 2 min read

Dependent Type Theory and Zero-Shot Reasoning

evhub · 11 Jul 2018 1:16 UTC

27 points

3 comments · 5 min read

A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment

AlexMennen · 11 Jul 2018 1:03 UTC

40 points

1 comment · 1 min read

Bounding Goodhart’s Law

eric_langlois · 11 Jul 2018 0:46 UTC

43 points

2 comments · 5 min read

Mechanistic Transparency for Machine Learning

DanielFilan · 11 Jul 2018 0:34 UTC

54 points

9 comments · 4 min read

An environment for studying counterfactuals

Nisan · 11 Jul 2018 0:14 UTC

15 points

6 comments · 3 min read

A universal score for optimizers

levin · 10 Jul 2018 23:52 UTC

15 points

8 comments · 3 min read

Bayesian Probability is for things that are Space-like Separated from You

Scott Garrabrant · 10 Jul 2018 23:47 UTC

86 points

22 comments · 2 min read

Alignment problems for economists

Chris van Merwijk · 10 Jul 2018 23:43 UTC

5 points

2 comments · 2 min read

Non-resolve as Resolve

Linda Linsefors · 10 Jul 2018 23:31 UTC

15 points

1 comment · 2 min read

A framework for thinking about wireheading

theotherotheralex · 10 Jul 2018 23:14 UTC

15 points

4 comments · 1 min read

Logical Uncertainty and Functional Decision Theory

swordsintoploughshares · 10 Jul 2018 23:08 UTC

15 points

4 comments · 2 min read

Repeated (and improved) Sleeping Beauty problem

Linda Linsefors · 10 Jul 2018 22:32 UTC

12 points

5 comments · 2 min read

Probability is fake, frequency is real

Linda Linsefors · 10 Jul 2018 22:32 UTC

12 points

7 comments · 1 min read

Conditioning, Counterfactuals, Exploration, and Gears

Diffractor · 10 Jul 2018 22:11 UTC

28 points

1 comment · 5 min read

Two agents can have the same source code and optimise different utility functions

Joar Skalse · 10 Jul 2018 21:51 UTC

11 points

11 comments · 1 min read

The Intentional Agency Experiment

Alexander Gietelink Oldenziel · 10 Jul 2018 20:32 UTC

13 points

5 comments · 3 min read

Announcing AlignmentForum.org Beta

Raemon · 10 Jul 2018 20:19 UTC

68 points

35 comments · 2 min read

Choosing to Choose?

Whispermute · 10 Jul 2018 20:15 UTC

10 points

7 comments · 5 min read

Study on what makes people approve or condemn mind upload technology; references LW

Kaj_Sotala · 10 Jul 2018 17:14 UTC

22 points

0 comments · 2 min read

(www.nature.com)

How to parent more predictably

jefftk · 10 Jul 2018 15:18 UTC

78 points

1 comment · 4 min read

Open Thread July 2018

null · 10 Jul 2018 14:51 UTC

10 points

9 comments · 1 min read

Three anchorings: number, attitude, and taste

Stuart_Armstrong · 10 Jul 2018 14:21 UTC

14 points

4 comments · 2 min read

The Dilemma of Worse Than Death Scenarios

arkaeik · 10 Jul 2018 9:18 UTC

14 points

18 comments · 4 min read

Newcomb’s Problem In One Paragraph

Chris_Leong · 10 Jul 2018 7:10 UTC

7 points

0 comments · 1 min read

Letting Go III: Unilateral or GTFO

johnswentworth · 10 Jul 2018 6:26 UTC

21 points

3 comments · 2 min read

Sydney Rationality Dojo—December

Next · 10 Jul 2018 4:22 UTC

1 point

0 comments · 1 min read

Sydney Rationality Dojo—November

Next · 10 Jul 2018 4:20 UTC

1 point

0 comments · 1 min read

Sydney Rationality Dojo—October

Next · 10 Jul 2018 4:19 UTC

1 point

0 comments · 1 min read

Sydney Rationality Dojo—September

Next · 10 Jul 2018 4:12 UTC

1 point

0 comments · 1 min read

Sydney Rationality Dojo—August

Next · 10 Jul 2018 4:04 UTC

1 point

0 comments · 1 min read

Context Windows: A Model of Unproductive Disagreement

Zachary Jacobi · 10 Jul 2018 1:40 UTC

4 points

2 comments · 5 min read

Fundamentals of Formalisation Level 5: Formal Proof

philip_b · 9 Jul 2018 20:55 UTC

13 points

0 comments · 1 min read