Partial summary of debate with Benquo and Jessicata [pt 1]
Raemon
14 Aug 2019 20:02 UTC
89
points
63
comments
22
min read
LW
link
3
reviews
“Designing agent incentives to avoid reward tampering”, DeepMind
gwern
14 Aug 2019 16:57 UTC
28
points
15
comments
1
min read
LW
link
(medium.com)
Subagents, trauma and rationality
Kaj_Sotala
14 Aug 2019 13:14 UTC
111
points
4
comments
19
min read
LW
link
Predicted AI alignment event/meeting calendar
rmoehn
14 Aug 2019 7:14 UTC
29
points
14
comments
1
min read
LW
link
Natural laws should be explicit constraints on strategy space
ryan_b
13 Aug 2019 20:22 UTC
8
points
6
comments
1
min read
LW
link
Distance Functions are Hard
Grue_Slinky
13 Aug 2019 17:33 UTC
31
points
19
comments
6
min read
LW
link
Book Review: Secular Cycles
Scott Alexander
13 Aug 2019 4:10 UTC
62
points
10
comments
16
min read
LW
link
1
review
(slatestarcodex.com)
A Primer on Matrix Calculus, Part 1: Basic review
Matthew Barnett
12 Aug 2019 23:44 UTC
25
points
4
comments
7
min read
LW
link
[Question]
What explanatory power does Kahneman’s System 2 possess?
Richard_Ngo
12 Aug 2019 15:23 UTC
31
points
2
comments
1
min read
LW
link
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts—Part 4)
Davidmanheim
12 Aug 2019 8:07 UTC
15
points
3
comments
4
min read
LW
link
Adjectives from the Future: The Dangers of Result-based Descriptions
Pradeep_Kumar
11 Aug 2019 19:19 UTC
19
points
8
comments
11
min read
LW
link
[Question]
Could we solve this email mess if we all moved to paid emails?
jacobjacob
11 Aug 2019 16:31 UTC
29
points
50
comments
4
min read
LW
link
AI Safety Reading Group
Søren Elverlin
11 Aug 2019 9:01 UTC
16
points
8
comments
1
min read
LW
link
[Question]
Does human choice have to be transitive in order to be rational/consistent?
jmh
11 Aug 2019 1:49 UTC
9
points
6
comments
1
min read
LW
link
Diana Fleischman and Geoffrey Miller—Audience Q&A
Jacob Falkovich
10 Aug 2019 22:37 UTC
38
points
6
comments
9
min read
LW
link
Intransitive Preferences You Can’t Pump
zulupineapple
9 Aug 2019 23:10 UTC
0
points
2
comments
1
min read
LW
link
Categorial preferences and utility functions
DavidHolmes
9 Aug 2019 21:36 UTC
10
points
6
comments
5
min read
LW
link
[Question]
What is the state of the ego depletion field?
Eli Tyre
9 Aug 2019 20:30 UTC
27
points
10
comments
1
min read
LW
link
Why Gradients Vanish and Explode
Matthew Barnett
9 Aug 2019 2:54 UTC
25
points
9
comments
3
min read
LW
link
AI Forecasting Dictionary (Forecasting infrastructure, part 1)
jacobjacob
and
bgold
8 Aug 2019 16:10 UTC
50
points
0
comments
5
min read
LW
link
[Question]
Why do humans not have built-in neural i/o channels?
Richard_Ngo
8 Aug 2019 13:09 UTC
25
points
23
comments
1
min read
LW
link
Which of these five AI alignment research projects ideas are no good?
rmoehn
8 Aug 2019 7:17 UTC
25
points
13
comments
1
min read
LW
link
Calibrating With Cards
lifelonglearner
8 Aug 2019 6:44 UTC
32
points
3
comments
3
min read
LW
link
[Question]
Is there a source/market for LW-related t-shirts?
jooyous
8 Aug 2019 4:30 UTC
8
points
3
comments
1
min read
LW
link
Verification and Transparency
DanielFilan
8 Aug 2019 1:50 UTC
35
points
6
comments
2
min read
LW
link
(danielfilan.com)
Toy model piece #2: Combining short and long range partial preferences
Stuart_Armstrong
8 Aug 2019 0:11 UTC
14
points
0
comments
4
min read
LW
link
Four Ways An Impact Measure Could Help Alignment
Matthew Barnett
8 Aug 2019 0:10 UTC
21
points
1
comment
9
min read
LW
link
Nashville August SSC Meetup
friedelcraftiness
7 Aug 2019 20:11 UTC
1
point
0
comments
1
min read
LW
link
In defense of Oracle (“Tool”) AI research
Steven Byrnes
7 Aug 2019 19:14 UTC
22
points
11
comments
4
min read
LW
link
Help forecast study replication in this social science prediction market
rosiecam
7 Aug 2019 18:18 UTC
29
points
3
comments
1
min read
LW
link
[Question]
Edit Nickname
Luigi Lotti
7 Aug 2019 17:42 UTC
5
points
1
comment
1
min read
LW
link
Self-Supervised Learning and AGI Safety
Steven Byrnes
7 Aug 2019 14:21 UTC
29
points
9
comments
12
min read
LW
link
Emotions are not beliefs
Chris_Leong
7 Aug 2019 6:27 UTC
25
points
2
comments
2
min read
LW
link
Understanding Recent Impact Measures
Matthew Barnett
7 Aug 2019 4:57 UTC
16
points
6
comments
7
min read
LW
link
[Site Update] Behind the scenes data-layer and caching improvements
habryka
7 Aug 2019 0:49 UTC
23
points
3
comments
1
min read
LW
link
Project Proposal: Considerations for trading off capabilities and safety impacts of AI research
David Scott Krueger (formerly: capybaralet)
6 Aug 2019 22:22 UTC
25
points
11
comments
2
min read
LW
link
Subagents, neural Turing machines, thought selection, and blindspots
Kaj_Sotala
6 Aug 2019 21:15 UTC
87
points
3
comments
12
min read
LW
link
[Question]
Percent reduction of gun-related deaths by color of gun.
Gunnar_Zarncke
6 Aug 2019 20:28 UTC
8
points
11
comments
1
min read
LW
link
New paper: Corrigibility with Utility Preservation
Koen.Holtman
6 Aug 2019 19:04 UTC
44
points
11
comments
2
min read
LW
link
Weak foundation of determinism analysis
aiiixiii
6 Aug 2019 19:03 UTC
14
points
54
comments
3
min read
LW
link
Trauma, Meditation, and a Cool Scar
Logan Riggs
6 Aug 2019 16:17 UTC
102
points
17
comments
5
min read
LW
link
1
review
[Question]
Why is the nitrogen cycle so under-emphasized compared to climate change
ChristianKl
6 Aug 2019 9:25 UTC
15
points
4
comments
1
min read
LW
link
[Question]
How would a person go about starting a geoengineering startup?
Pee Doom
6 Aug 2019 7:34 UTC
11
points
5
comments
1
min read
LW
link
Status 451 on Diagnosis: Russell Aphasia
Zack_M_Davis
6 Aug 2019 4:43 UTC
48
points
1
comment
1
min read
LW
link
(status451.com)
Searle’s Chinese Room and the Meaning of Meaning
Jimdrix_Hendri
6 Aug 2019 4:09 UTC
0
points
4
comments
2
min read
LW
link
[Question]
What are the best resources for examining the evidence for anthropogenic climate change?
Matthew Barnett
6 Aug 2019 2:53 UTC
10
points
8
comments
1
min read
LW
link
A Survey of Early Impact Measures
Matthew Barnett
6 Aug 2019 1:22 UTC
29
points
0
comments
8
min read
LW
link
Preferences as an (instinctive) stance
Stuart_Armstrong
6 Aug 2019 0:43 UTC
18
points
4
comments
4
min read
LW
link
[Question]
How to navigate through contradictory (health/fitness) advice?
Sherrinford
5 Aug 2019 20:58 UTC
14
points
7
comments
1
min read
LW
link
My recommendations for gratitude exercises
MaxCarpendale
5 Aug 2019 19:04 UTC
40
points
3
comments
5
min read
LW
link