Stuart_Armstrong
Karma:
17,986
Consistencies as (meta-)preferences
Stuart_Armstrong
May 3, 2021, 3:10 PM
17
points
0
comments
3
min read
Why unriggable *almost* implies uninfluenceable
Stuart_Armstrong
Apr 9, 2021, 5:07 PM
11
points
0
comments
4
min read
A possible preference algorithm
Stuart_Armstrong
Apr 8, 2021, 6:25 PM
22
points
0
comments
4
min read
If you don’t design for extrapolation, you’ll extrapolate poorly—possibly fatally
Stuart_Armstrong
Apr 8, 2021, 6:10 PM
17
points
0
comments
4
min read
Which counterfactuals should an AI follow?
Stuart_Armstrong
Apr 7, 2021, 4:47 PM
19
points
5
comments
7
min read
Toy model of preference, bias, and extra information
Stuart_Armstrong
Mar 24, 2021, 10:14 AM
9
points
0
comments
4
min read
Preferences and biases, the information argument
Stuart_Armstrong
Mar 23, 2021, 12:44 PM
14
points
5
comments
1
min read
Why sigmoids are so hard to predict
Stuart_Armstrong
Mar 18, 2021, 6:21 PM
56
points
7
comments
5
min read
Connecting the good regulator theorem with semantics and symbol grounding
Stuart_Armstrong
Mar 4, 2021, 2:35 PM
13
points
0
comments
2
min read
Cartesian frames as generalised models
Stuart_Armstrong
Feb 16, 2021, 4:09 PM
20
points
0
comments
5
min read
Generalised models as a category
Stuart_Armstrong
Feb 16, 2021, 4:08 PM
25
points
9
comments
4
min read
Counterfactual control incentives
Stuart_Armstrong
Jan 21, 2021, 4:54 PM
21
points
10
comments
9
min read
Short summary of mAIry’s room
Stuart_Armstrong
Jan 18, 2021, 6:11 PM
26
points
2
comments
4
min read
Syntax, semantics, and symbol grounding, simplified
Stuart_Armstrong
Nov 23, 2020, 4:12 PM
30
points
4
comments
9
min read
The ethics of AI for the Routledge Encyclopedia of Philosophy
Stuart_Armstrong
Nov 18, 2020, 5:55 PM
45
points
8
comments
1
min read
Extortion beats brinksmanship, but the audience matters
Stuart_Armstrong
Nov 16, 2020, 9:13 PM
27
points
15
comments
4
min read
Humans are stunningly rational and stunningly irrational
Stuart_Armstrong
Oct 23, 2020, 2:13 PM
23
points
4
comments
2
min read
Knowledge, manipulation, and free will
Stuart_Armstrong
Oct 13, 2020, 5:47 PM
33
points
15
comments
3
min read
Dehumanisation *errors*
Stuart_Armstrong
Sep 23, 2020, 9:51 AM
13
points
0
comments
1
min read
Anthropomorphisation vs value learning: type 1 vs type 2 errors
Stuart_Armstrong
Sep 22, 2020, 10:46 AM
16
points
10
comments
1
min read