Melatonin: Much More Than You Wanted To Know
Scott Alexander
Jul 11, 2018, 5:40 PM
122
points
16
comments
15
min read
LW
link
(slatestarcodex.com)
Monk Treehouse: some problems defining simulation
dranorter
Jul 11, 2018, 7:35 AM
6
points
1
comment
5
min read
LW
link
Mathematical Mindset
komponisto
Jul 11, 2018, 3:03 AM
54
points
5
comments
2
min read
LW
link
Decision-theoretic problems and Theories; An (Incomplete) comparative list
somervta
Jul 11, 2018, 2:59 AM
36
points
0
comments
1
min read
LW
link
(docs.google.com)
Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet
steven0461
Jul 11, 2018, 2:59 AM
28
points
11
comments
1
min read
LW
link
On the Role of Counterfactuals in Learning
Max Kanwal
Jul 11, 2018, 2:45 AM
11
points
2
comments
3
min read
LW
link
Clarifying Consequentialists in the Solomonoff Prior
Vlad Mikulik
Jul 11, 2018, 2:35 AM
20
points
16
comments
6
min read
LW
link
Complete Class: Consequentialist Foundations
abramdemski
Jul 11, 2018, 1:57 AM
58
points
37
comments
13
min read
LW
link
Conditions under which misaligned subagents can (not) arise in classifiers
anon1
Jul 11, 2018, 1:52 AM
12
points
2
comments
2
min read
LW
link
No, I won’t go there, it feels like you’re trying to Pascal-mug me
Rupert
Jul 11, 2018, 1:37 AM
9
points
0
comments
2
min read
LW
link
Conceptual problems with utility functions
Dacyn
Jul 11, 2018, 1:29 AM
22
points
12
comments
2
min read
LW
link
Dependent Type Theory and Zero-Shot Reasoning
evhub
Jul 11, 2018, 1:16 AM
27
points
3
comments
5
min read
LW
link
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
AlexMennen
Jul 11, 2018, 1:03 AM
40
points
1
comment
1
min read
LW
link
Bounding Goodhart’s Law
eric_langlois
Jul 11, 2018, 12:46 AM
43
points
2
comments
5
min read
LW
link
Mechanistic Transparency for Machine Learning
DanielFilan
Jul 11, 2018, 12:34 AM
55
points
9
comments
4
min read
LW
link
An environment for studying counterfactuals
Nisan
Jul 11, 2018, 12:14 AM
15
points
6
comments
3
min read
LW
link
A universal score for optimizers
levin
Jul 10, 2018, 11:52 PM
15
points
8
comments
3
min read
LW
link
Bayesian Probability is for things that are Space-like Separated from You
Scott Garrabrant
Jul 10, 2018, 11:47 PM
86
points
22
comments
2
min read
LW
link
Alignment problems for economists
Chris van Merwijk
Jul 10, 2018, 11:43 PM
5
points
2
comments
2
min read
LW
link
Non-resolve as Resolve
Linda Linsefors
Jul 10, 2018, 11:31 PM
15
points
1
comment
2
min read
LW
link
A framework for thinking about wireheading
theotherotheralex
Jul 10, 2018, 11:14 PM
15
points
4
comments
1
min read
LW
link
Logical Uncertainty and Functional Decision Theory
swordsintoploughshares
Jul 10, 2018, 11:08 PM
15
points
4
comments
2
min read
LW
link
Repeated (and improved) Sleeping Beauty problem
Linda Linsefors
Jul 10, 2018, 10:32 PM
12
points
5
comments
2
min read
LW
link
Probability is fake, frequency is real
Linda Linsefors
Jul 10, 2018, 10:32 PM
12
points
7
comments
1
min read
LW
link
Conditioning, Counterfactuals, Exploration, and Gears
Diffractor
Jul 10, 2018, 10:11 PM
28
points
1
comment
5
min read
LW
link
Two agents can have the same source code and optimise different utility functions
Joar Skalse
Jul 10, 2018, 9:51 PM
11
points
11
comments
1
min read
LW
link
The Intentional Agency Experiment
Alexander Gietelink Oldenziel
Jul 10, 2018, 8:32 PM
13
points
5
comments
3
min read
LW
link
Announcing AlignmentForum.org Beta
Raemon
Jul 10, 2018, 8:19 PM
68
points
35
comments
2
min read
LW
link
Choosing to Choose?
Daniel Herrmann
Jul 10, 2018, 8:15 PM
10
points
7
comments
5
min read
LW
link
Study on what makes people approve or condemn mind upload technology; references LW
Kaj_Sotala
Jul 10, 2018, 5:14 PM
22
points
0
comments
2
min read
LW
link
(www.nature.com)
How to parent more predictably
jefftk
Jul 10, 2018, 3:18 PM
78
points
1
comment
4
min read
LW
link
Open Thread July 2018
null
Jul 10, 2018, 2:51 PM
10
points
9
comments
1
min read
LW
link
Three anchorings: number, attitude, and taste
Stuart_Armstrong
Jul 10, 2018, 2:21 PM
14
points
4
comments
2
min read
LW
link
The Dilemma of Worse Than Death Scenarios
arkaeik
Jul 10, 2018, 9:18 AM
14
points
18
comments
4
min read
LW
link
Newcomb’s Problem In One Paragraph
Chris_Leong
Jul 10, 2018, 7:10 AM
7
points
0
comments
1
min read
LW
link
Letting Go III: Unilateral or GTFO
johnswentworth
Jul 10, 2018, 6:26 AM
21
points
3
comments
2
min read
LW
link
Sydney Rationality Dojo—December
Next
Jul 10, 2018, 4:22 AM
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—November
Next
Jul 10, 2018, 4:20 AM
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—October
Next
Jul 10, 2018, 4:19 AM
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—September
Next
Jul 10, 2018, 4:12 AM
1
point
0
comments
1
min read
LW
link
Sydney Rationality Dojo—August
Next
Jul 10, 2018, 4:04 AM
1
point
0
comments
1
min read
LW
link
Context Windows: A Model of Unproductive Disagreement
Zachary Jacobi
Jul 10, 2018, 1:40 AM
4
points
2
comments
5
min read
LW
link
Fundamentals of Formalisation Level 5: Formal Proof
philip_b
Jul 9, 2018, 8:55 PM
13
points
0
comments
1
min read
LW
link
RAISE is looking for full-time content developers
null
Jul 9, 2018, 5:01 PM
22
points
5
comments
1
min read
LW
link
Alignment Newsletter #14
Rohin Shah
Jul 9, 2018, 4:20 PM
14
points
0
comments
9
min read
LW
link
(mailchi.mp)
Math: Textbooks and the DTP pipeline
Andrew Quinn
Jul 9, 2018, 3:09 PM
12
points
3
comments
2
min read
LW
link
The Craft And The Codex
Paperclip Minimizer
Jul 9, 2018, 10:50 AM
12
points
7
comments
LW
link
(slatestarcodex.com)
The Fermi Paradox: What did Sandberg, Drexler and Ord Really Dissolve?
Shmi
Jul 8, 2018, 9:18 PM
47
points
28
comments
5
min read
LW
link
An Exercise in Applied Rationality: A New Apartment
Sable
Jul 8, 2018, 9:18 PM
8
points
9
comments
1
min read
LW
link
Estimating the consequences of device detection tech
Jsevillamol
Jul 8, 2018, 6:25 PM
27
points
4
comments
7
min read
LW
link