Are pre-specified utility functions about the real world possible in principle?
mlogan
11 Jul 2018 18:46 UTC
24
points
7
comments
4
min read
LW
link
Melatonin: Much More Than You Wanted To Know
Scott Alexander
11 Jul 2018 17:40 UTC
120
points
16
comments
15
min read
LW
link
(slatestarcodex.com)
Monk Treehouse: some problems defining simulation
dranorter
11 Jul 2018 7:35 UTC
6
points
1
comment
5
min read
LW
link
Mathematical Mindset
komponisto
11 Jul 2018 3:03 UTC
54
points
5
comments
2
min read
LW
link
Decision-theoretic problems and Theories; An (Incomplete) comparative list
somervta
11 Jul 2018 2:59 UTC
36
points
0
comments
1
min read
LW
link
(docs.google.com)
Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet
steven0461
11 Jul 2018 2:59 UTC
28
points
11
comments
1
min read
LW
link
On the Role of Counterfactuals in Learning
Max Kanwal
11 Jul 2018 2:45 UTC
11
points
2
comments
3
min read
LW
link
Clarifying Consequentialists in the Solomonoff Prior
Vlad Mikulik
11 Jul 2018 2:35 UTC
20
points
16
comments
6
min read
LW
link
Complete Class: Consequentialist Foundations
abramdemski
11 Jul 2018 1:57 UTC
58
points
35
comments
13
min read
LW
link
Conditions under which misaligned subagents can (not) arise in classifiers
anon1
11 Jul 2018 1:52 UTC
12
points
2
comments
2
min read
LW
link
No, I won’t go there, it feels like you’re trying to Pascal-mug me
Rupert
11 Jul 2018 1:37 UTC
9
points
0
comments
2
min read
LW
link
Conceptual problems with utility functions
Dacyn
11 Jul 2018 1:29 UTC
22
points
12
comments
2
min read
LW
link
Dependent Type Theory and Zero-Shot Reasoning
evhub
11 Jul 2018 1:16 UTC
27
points
3
comments
5
min read
LW
link
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
AlexMennen
11 Jul 2018 1:03 UTC
40
points
1
comment
1
min read
LW
link
Bounding Goodhart’s Law
eric_langlois
11 Jul 2018 0:46 UTC
43
points
2
comments
5
min read
LW
link
Mechanistic Transparency for Machine Learning
DanielFilan
11 Jul 2018 0:34 UTC
54
points
9
comments
4
min read
LW
link
An environment for studying counterfactuals
Nisan
11 Jul 2018 0:14 UTC
15
points
6
comments
3
min read
LW
link
A universal score for optimizers
levin
10 Jul 2018 23:52 UTC
15
points
8
comments
3
min read
LW
link
Bayesian Probability is for things that are Space-like Separated from You
Scott Garrabrant
10 Jul 2018 23:47 UTC
86
points
22
comments
2
min read
LW
link
Alignment problems for economists
Chris van Merwijk
10 Jul 2018 23:43 UTC
5
points
2
comments
2
min read
LW
link
Non-resolve as Resolve
Linda Linsefors
10 Jul 2018 23:31 UTC
15
points
1
comment
2
min read
LW
link
A framework for thinking about wireheading
theotherotheralex
10 Jul 2018 23:14 UTC
15
points
4
comments
1
min read
LW
link
Logical Uncertainty and Functional Decision Theory
swordsintoploughshares
10 Jul 2018 23:08 UTC
15
points
4
comments
2
min read
LW
link
Repeated (and improved) Sleeping Beauty problem
Linda Linsefors
10 Jul 2018 22:32 UTC
12
points
5
comments
2
min read
LW
link
Probability is fake, frequency is real
Linda Linsefors
10 Jul 2018 22:32 UTC
12
points
7
comments
1
min read
LW
link
Conditioning, Counterfactuals, Exploration, and Gears
Diffractor
10 Jul 2018 22:11 UTC
28
points
1
comment
5
min read
LW
link
Two agents can have the same source code and optimise different utility functions
Joar Skalse
10 Jul 2018 21:51 UTC
11
points
11
comments
1
min read
LW
link
The Intentional Agency Experiment
Alexander Gietelink Oldenziel
10 Jul 2018 20:32 UTC
13
points
5
comments
3
min read
LW
link
Announcing AlignmentForum.org Beta
Raemon
10 Jul 2018 20:19 UTC
68
points
35
comments
2
min read
LW
link
Choosing to Choose?
Whispermute
10 Jul 2018 20:15 UTC
10
points
7
comments
5
min read
LW
link
Study on what makes people approve or condemn mind upload technology; references LW
Kaj_Sotala
10 Jul 2018 17:14 UTC
22
points
0
comments
2
min read
LW
link
(www.nature.com)
How to parent more predictably
jefftk
10 Jul 2018 15:18 UTC
78
points
1
comment
4
min read
LW
link
Open Thread July 2018
null
10 Jul 2018 14:51 UTC
10
points
9
comments
1
min read
LW
link
Three anchorings: number, attitude, and taste
Stuart_Armstrong
10 Jul 2018 14:21 UTC
14
points
4
comments
2
min read
LW
link
The Dilemma of Worse Than Death Scenarios
arkaeik
10 Jul 2018 9:18 UTC
14
points
18
comments
4
min read
LW
link
Newcomb’s Problem In One Paragraph
Chris_Leong
10 Jul 2018 7:10 UTC
7
points
0
comments
1
min read
LW
link
Letting Go III: Unilateral or GTFO
johnswentworth
10 Jul 2018 6:26 UTC
21
points
3
comments
2
min read
LW
link
Sydney Rationality Dojo—December
Next
10 Jul 2018 4:22 UTC
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—November
Next
10 Jul 2018 4:20 UTC
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—October
Next
10 Jul 2018 4:19 UTC
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—September
Next
10 Jul 2018 4:12 UTC
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—August
Next
10 Jul 2018 4:04 UTC
1
point
0
comments
1
min read
LW
link
Context Windows: A Model of Unproductive Disagreement
Zachary Jacobi
10 Jul 2018 1:40 UTC
4
points
2
comments
5
min read
LW
link
Fundamentals of Formalisation Level 5: Formal Proof
philip_b
9 Jul 2018 20:55 UTC
13
points
0
comments
1
min read
LW
link
RAISE is looking for full-time content developers
null
9 Jul 2018 17:01 UTC
22
points
5
comments
1
min read
LW
link
Alignment Newsletter #14
Rohin Shah
9 Jul 2018 16:20 UTC
14
points
0
comments
9
min read
LW
link
(mailchi.mp)
Math: Textbooks and the DTP pipeline
Andrew Quinn
9 Jul 2018 15:09 UTC
12
points
3
comments
2
min read
LW
link
The Craft And The Codex
Paperclip Minimizer
9 Jul 2018 10:50 UTC
12
points
7
comments
1
min read
LW
link
(slatestarcodex.com)
The Fermi Paradox: What did Sandberg, Drexler and Ord Really Dissolve?
Shmi
8 Jul 2018 21:18 UTC
46
points
28
comments
5
min read
LW
link
An Exercise in Applied Rationality: A New Apartment
Sable
8 Jul 2018 21:18 UTC
8
points
9
comments
1
min read
LW
link