Page 2
(AI alignment) Now is special | Andrew Quinn | Jul 17, 2018, 1:50 AM | 2 points | 0 comments | 1 min read | LW link
Look Under the Light Post | Gordon Seidoh Worley | Jul 16, 2018, 10:19 PM | 22 points | 8 comments | 4 min read | LW link
Alignment Newsletter #15: 07/16/18 | Rohin Shah | Jul 16, 2018, 4:10 PM | 42 points | 0 comments | 15 min read | LW link (mailchi.mp)
Compact vs. Wide Models | Vaniver | Jul 16, 2018, 4:09 AM | 31 points | 5 comments | 3 min read | LW link
Probabilistic decision-making as an anxiety-reduction technique | RationallyDense | Jul 16, 2018, 3:51 AM | 8 points | 4 comments | 1 min read | LW link
Buridan’s ass in coordination games | jessicata | Jul 16, 2018, 2:51 AM | 52 points | 26 comments | 10 min read | LW link
Research Debt | Elizabeth | Jul 15, 2018, 7:36 PM | 25 points | 2 comments | LW link (distill.pub)
An optimistic explanation of the outrage epidemic | chaosmage | Jul 15, 2018, 2:35 PM | 18 points | 5 comments | 3 min read | LW link
Announcement: AI alignment prize round 3 winners and next round | cousin_it | Jul 15, 2018, 7:40 AM | 93 points | 7 comments | 1 min read | LW link
Meetup Cookbook | maia | Jul 14, 2018, 10:26 PM | 74 points | 7 comments | 1 min read | LW link (tigrennatenn.neocities.org)
Expected Pain Parameters | Alicorn | Jul 14, 2018, 7:30 PM | 87 points | 12 comments | 2 min read | LW link
Boltzmann Brains and Within-model vs. Between-models Probability | Charlie Steiner | Jul 14, 2018, 9:52 AM | 15 points | 12 comments | 3 min read | LW link
[1607.08289] “Mammalian Value Systems” (as a starting point for human value system model created by IRL agent) | avturchin | Jul 14, 2018, 9:46 AM | 9 points | 9 comments | LW link (arxiv.org)
Generating vs Recognizing | lifelonglearner | Jul 14, 2018, 5:10 AM | 15 points | 3 comments | 4 min read | LW link
LW Update 2018-7-14 – Styling Rework, CommentsItem, Performance | Raemon | Jul 14, 2018, 1:13 AM | 30 points | 0 comments | 1 min read | LW link
Secondary Stressors and Tactile Ambition | lionhearted (Sebastian Marshall) | Jul 13, 2018, 12:26 AM | 16 points | 16 comments | 4 min read | LW link
A Sarno-Hanson Synthesis | moridinamael | Jul 12, 2018, 4:13 PM | 52 points | 15 comments | 4 min read | LW link
Probability is a model, frequency is an observation: Why both halfers and thirders are correct in the Sleeping Beauty problem. | Shmi | Jul 12, 2018, 6:52 AM | 26 points | 34 comments | 2 min read | LW link
What does the stock market tell us about AI timelines? | Tobias_Baumann | Jul 12, 2018, 6:05 AM | 6 points | 5 comments | LW link (s-risks.org)
An Agent is a Worldline in Tegmark V | komponisto | Jul 12, 2018, 5:12 AM | 24 points | 12 comments | 2 min read | LW link
Washington, D.C.: What If | RobinZ | Jul 12, 2018, 4:30 AM | 9 points | 0 comments | 1 min read | LW link
Are pre-specified utility functions about the real world possible in principle? | mlogan | Jul 11, 2018, 6:46 PM | 24 points | 7 comments | 4 min read | LW link
Melatonin: Much More Than You Wanted To Know | Scott Alexander | Jul 11, 2018, 5:40 PM | 122 points | 16 comments | 15 min read | LW link (slatestarcodex.com)
Monk Treehouse: some problems defining simulation | dranorter | Jul 11, 2018, 7:35 AM | 6 points | 1 comment | 5 min read | LW link
Mathematical Mindset | komponisto | Jul 11, 2018, 3:03 AM | 54 points | 5 comments | 2 min read | LW link
Decision-theoretic problems and Theories; An (Incomplete) comparative list | somervta | Jul 11, 2018, 2:59 AM | 36 points | 0 comments | 1 min read | LW link (docs.google.com)
Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet | steven0461 | Jul 11, 2018, 2:59 AM | 28 points | 11 comments | 1 min read | LW link
On the Role of Counterfactuals in Learning | Max Kanwal | Jul 11, 2018, 2:45 AM | 11 points | 2 comments | 3 min read | LW link
Clarifying Consequentialists in the Solomonoff Prior | Vlad Mikulik | Jul 11, 2018, 2:35 AM | 20 points | 16 comments | 6 min read | LW link
Complete Class: Consequentialist Foundations | abramdemski | Jul 11, 2018, 1:57 AM | 58 points | 37 comments | 13 min read | LW link
Conditions under which misaligned subagents can (not) arise in classifiers | anon1 | Jul 11, 2018, 1:52 AM | 12 points | 2 comments | 2 min read | LW link
No, I won’t go there, it feels like you’re trying to Pascal-mug me | Rupert | Jul 11, 2018, 1:37 AM | 9 points | 0 comments | 2 min read | LW link
Conceptual problems with utility functions | Dacyn | Jul 11, 2018, 1:29 AM | 22 points | 12 comments | 2 min read | LW link
Dependent Type Theory and Zero-Shot Reasoning | evhub | Jul 11, 2018, 1:16 AM | 27 points | 3 comments | 5 min read | LW link
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment | AlexMennen | Jul 11, 2018, 1:03 AM | 40 points | 1 comment | 1 min read | LW link
Bounding Goodhart’s Law | eric_langlois | Jul 11, 2018, 12:46 AM | 43 points | 2 comments | 5 min read | LW link
Mechanistic Transparency for Machine Learning | DanielFilan | Jul 11, 2018, 12:34 AM | 55 points | 9 comments | 4 min read | LW link
An environment for studying counterfactuals | Nisan | Jul 11, 2018, 12:14 AM | 15 points | 6 comments | 3 min read | LW link
A universal score for optimizers | levin | Jul 10, 2018, 11:52 PM | 15 points | 8 comments | 3 min read | LW link
Bayesian Probability is for things that are Space-like Separated from You | Scott Garrabrant | Jul 10, 2018, 11:47 PM | 86 points | 22 comments | 2 min read | LW link
Alignment problems for economists | Chris van Merwijk | Jul 10, 2018, 11:43 PM | 5 points | 2 comments | 2 min read | LW link
Non-resolve as Resolve | Linda Linsefors | Jul 10, 2018, 11:31 PM | 15 points | 1 comment | 2 min read | LW link
A framework for thinking about wireheading | theotherotheralex | Jul 10, 2018, 11:14 PM | 15 points | 4 comments | 1 min read | LW link
Logical Uncertainty and Functional Decision Theory | swordsintoploughshares | Jul 10, 2018, 11:08 PM | 15 points | 4 comments | 2 min read | LW link
Repeated (and improved) Sleeping Beauty problem | Linda Linsefors | Jul 10, 2018, 10:32 PM | 12 points | 5 comments | 2 min read | LW link
Probability is fake, frequency is real | Linda Linsefors | Jul 10, 2018, 10:32 PM | 12 points | 7 comments | 1 min read | LW link
Conditioning, Counterfactuals, Exploration, and Gears | Diffractor | Jul 10, 2018, 10:11 PM | 28 points | 1 comment | 5 min read | LW link
Two agents can have the same source code and optimise different utility functions | Joar Skalse | Jul 10, 2018, 9:51 PM | 11 points | 11 comments | 1 min read | LW link
The Intentional Agency Experiment | Alexander Gietelink Oldenziel | Jul 10, 2018, 8:32 PM UTC | 13 points | 5 comments | 3 min read | LW link
Announcing AlignmentForum.org Beta | Raemon | Jul 10, 2018, 8:19 PM UTC | 68 points | 35 comments | 2 min read | LW link