Solving the AI Race Finalists
Gordon Seidoh Worley
19 Jul 2018 21:04 UTC
24
points
0
comments
1
min read
LW
link
(medium.com)
“Artificial Intelligence” (new entry at Stanford Encyclopedia of Philosophy)
fortyeridania
19 Jul 2018 9:48 UTC
5
points
8
comments
1
min read
LW
link
(plato.stanford.edu)
Discussion: Raising the Sanity Waterline
Chriswaterguy
19 Jul 2018 2:12 UTC
2
points
0
comments
1
min read
LW
link
LW Update 2018-07-18 – AlignmentForum Bug Fixes
Raemon
19 Jul 2018 2:10 UTC
13
points
0
comments
1
min read
LW
link
Generalized Kelly betting
Linda Linsefors
19 Jul 2018 1:38 UTC
15
points
5
comments
2
min read
LW
link
Mechanism Design for AI
Tobias_Baumann
18 Jul 2018 16:47 UTC
5
points
3
comments
1
min read
LW
link
(s-risks.org)
A Step-by-step Guide to Finding a (Good!) Therapist
squidious
18 Jul 2018 1:50 UTC
46
points
5
comments
9
min read
LW
link
(opalsandbonobos.blogspot.com)
Simple Metaphor About Compressed Sensing
ryan_b
17 Jul 2018 15:47 UTC
6
points
0
comments
1
min read
LW
link
Figuring out what Alice wants, part II
Stuart_Armstrong
17 Jul 2018 13:59 UTC
17
points
0
comments
5
min read
LW
link
Figuring out what Alice wants, part I
Stuart_Armstrong
17 Jul 2018 13:59 UTC
15
points
8
comments
3
min read
LW
link
How To Use Bureaucracies
Samo Burja
17 Jul 2018 8:10 UTC
63
points
37
comments
9
min read
LW
link
(medium.com)
September CFAR Workshop
CFAR Team
17 Jul 2018 3:16 UTC
20
points
0
comments
1
min read
LW
link
(AI alignment) Now is special
Andrew Quinn
17 Jul 2018 1:50 UTC
2
points
0
comments
1
min read
LW
link
Look Under the Light Post
Gordon Seidoh Worley
16 Jul 2018 22:19 UTC
22
points
8
comments
4
min read
LW
link
Alignment Newsletter #15: 07/16/18
Rohin Shah
16 Jul 2018 16:10 UTC
42
points
0
comments
15
min read
LW
link
(mailchi.mp)
Compact vs. Wide Models
Vaniver
16 Jul 2018 4:09 UTC
31
points
5
comments
3
min read
LW
link
Probabilistic decision-making as an anxiety-reduction technique
RationallyDense
16 Jul 2018 3:51 UTC
8
points
4
comments
1
min read
LW
link
Buridan’s ass in coordination games
jessicata
16 Jul 2018 2:51 UTC
52
points
26
comments
10
min read
LW
link
Research Debt
Elizabeth
15 Jul 2018 19:36 UTC
24
points
2
comments
1
min read
LW
link
(distill.pub)
An optimistic explanation of the outrage epidemic
chaosmage
15 Jul 2018 14:35 UTC
18
points
5
comments
3
min read
LW
link
Announcement: AI alignment prize round 3 winners and next round
cousin_it
15 Jul 2018 7:40 UTC
93
points
7
comments
1
min read
LW
link
Meetup Cookbook
maia
14 Jul 2018 22:26 UTC
74
points
7
comments
1
min read
LW
link
(tigrennatenn.neocities.org)
Expected Pain Parameters
Alicorn
14 Jul 2018 19:30 UTC
87
points
12
comments
2
min read
LW
link
Boltzmann Brains and Within-model vs. Between-models Probability
Charlie Steiner
14 Jul 2018 9:52 UTC
15
points
12
comments
3
min read
LW
link
[1607.08289] “Mammalian Value Systems” (as a starting point for human value system model created by IRL agent)
avturchin
14 Jul 2018 9:46 UTC
9
points
9
comments
1
min read
LW
link
(arxiv.org)
Generating vs Recognizing
lifelonglearner
14 Jul 2018 5:10 UTC
15
points
3
comments
4
min read
LW
link
LW Update 2018-7-14 – Styling Rework, CommentsItem, Performance
Raemon
14 Jul 2018 1:13 UTC
30
points
0
comments
1
min read
LW
link
Secondary Stressors and Tactile Ambition
lionhearted (Sebastian Marshall)
13 Jul 2018 0:26 UTC
16
points
16
comments
4
min read
LW
link
A Sarno-Hanson Synthesis
moridinamael
12 Jul 2018 16:13 UTC
52
points
15
comments
4
min read
LW
link
Probability is a model, frequency is an observation: Why both halfers and thirders are correct in the Sleeping Beauty problem.
Shmi
12 Jul 2018 6:52 UTC
26
points
34
comments
2
min read
LW
link
What does the stock market tell us about AI timelines?
Tobias_Baumann
12 Jul 2018 6:05 UTC
6
points
5
comments
1
min read
LW
link
(s-risks.org)
An Agent is a Worldline in Tegmark V
komponisto
12 Jul 2018 5:12 UTC
24
points
12
comments
2
min read
LW
link
Washington, D.C.: What If
RobinZ
12 Jul 2018 4:30 UTC
9
points
0
comments
1
min read
LW
link
Are pre-specified utility functions about the real world possible in principle?
mlogan
11 Jul 2018 18:46 UTC
24
points
7
comments
4
min read
LW
link
Melatonin: Much More Than You Wanted To Know
Scott Alexander
11 Jul 2018 17:40 UTC
120
points
16
comments
15
min read
LW
link
(slatestarcodex.com)
Monk Treehouse: some problems defining simulation
dranorter
11 Jul 2018 7:35 UTC
6
points
1
comment
5
min read
LW
link
Mathematical Mindset
komponisto
11 Jul 2018 3:03 UTC
54
points
5
comments
2
min read
LW
link
Decision-theoretic problems and Theories; An (Incomplete) comparative list
somervta
11 Jul 2018 2:59 UTC
36
points
0
comments
1
min read
LW
link
(docs.google.com)
Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet
steven0461
11 Jul 2018 2:59 UTC
28
points
11
comments
1
min read
LW
link
On the Role of Counterfactuals in Learning
Max Kanwal
11 Jul 2018 2:45 UTC
11
points
2
comments
3
min read
LW
link
Clarifying Consequentialists in the Solomonoff Prior
Vlad Mikulik
11 Jul 2018 2:35 UTC
20
points
16
comments
6
min read
LW
link
Complete Class: Consequentialist Foundations
abramdemski
11 Jul 2018 1:57 UTC
58
points
37
comments
13
min read
LW
link
Conditions under which misaligned subagents can (not) arise in classifiers
anon1
11 Jul 2018 1:52 UTC
12
points
2
comments
2
min read
LW
link
No, I won’t go there, it feels like you’re trying to Pascal-mug me
Rupert
11 Jul 2018 1:37 UTC
9
points
0
comments
2
min read
LW
link
Conceptual problems with utility functions
Dacyn
11 Jul 2018 1:29 UTC
22
points
12
comments
2
min read
LW
link
Dependent Type Theory and Zero-Shot Reasoning
evhub
11 Jul 2018 1:16 UTC
27
points
3
comments
5
min read
LW
link
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
AlexMennen
11 Jul 2018 1:03 UTC
40
points
1
comment
1
min read
LW
link
Bounding Goodhart’s Law
eric_langlois
11 Jul 2018 0:46 UTC
43
points
2
comments
5
min read
LW
link
Mechanistic Transparency for Machine Learning
DanielFilan
11 Jul 2018 0:34 UTC
54
points
9
comments
4
min read
LW
link
An environment for studying counterfactuals
Nisan
11 Jul 2018 0:14 UTC
15
points
6
comments
3
min read
LW
link