The alignment problem from a deep learning perspective
Richard_Ngo
Aug 10, 2022, 10:46 PM
107
points
15
comments
27
min read
LW
link
1
review
How much alignment data will we need in the long run?
Jacob_Hilton
Aug 10, 2022, 9:39 PM
37
points
15
comments
4
min read
LW
link
On Ego, Reincarnation, Consciousness and The Universe
qmaury
Aug 10, 2022, 8:21 PM
−3
points
6
comments
5
min read
LW
link
Formalizing Alignment
Marv K
Aug 10, 2022, 6:50 PM
4
points
0
comments
2
min read
LW
link
How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It)
Peter S. Park, NickyP and Stephen Fowler
Aug 10, 2022, 6:14 PM
28
points
30
comments
11
min read
LW
link
Emergent Abilities of Large Language Models [Linkpost]
aog
Aug 10, 2022, 6:02 PM
25
points
2
comments
1
min read
LW
link
(arxiv.org)
How To Go From Interpretability To Alignment: Just Retarget The Search
johnswentworth
Aug 10, 2022, 4:08 PM
209
points
34
comments
3
min read
LW
link
1
review
Using GPT-3 to augment human intelligence
Henrik Karlsson
Aug 10, 2022, 3:54 PM
52
points
8
comments
18
min read
LW
link
(escapingflatland.substack.com)
ACX meetup [August]
sallatik
Aug 10, 2022, 9:54 AM
1
point
1
comment
1
min read
LW
link
Dissent Collusion
Screwtape
Aug 10, 2022, 2:43 AM
30
points
7
comments
3
min read
LW
link
The Medium Is The Bandage
party girl
Aug 10, 2022, 1:45 AM
11
points
0
comments
10
min read
LW
link
[Question]
Why is increasing public awareness of AI safety not a priority?
FinalFormal2
Aug 10, 2022, 1:28 AM
−5
points
14
comments
1
min read
LW
link
Manifold x CSPI $25k Forecasting Tournament
David Chee
Aug 9, 2022, 9:13 PM
5
points
0
comments
1
min read
LW
link
(www.cspicenter.com)
Proposal: Consider not using distance-direction-dimension words in abstract discussions
moridinamael
Aug 9, 2022, 8:44 PM
46
points
18
comments
5
min read
LW
link
[Question]
How would two superintelligent AIs interact, if they are unaligned with each other?
Nathan1123
Aug 9, 2022, 6:58 PM
4
points
6
comments
1
min read
LW
link
Disagreements about Alignment: Why, and how, we should try to solve them
ojorgensen
Aug 9, 2022, 6:49 PM
11
points
2
comments
16
min read
LW
link
Progress links and tweets, 2022-08-09
jasoncrawford
Aug 9, 2022, 5:35 PM
11
points
3
comments
1
min read
LW
link
(rootsofprogress.org)
[Question]
Is it possible to find venture capital for AI research org with strong safety focus?
AnonResearch
Aug 9, 2022, 4:12 PM
6
points
1
comment
1
min read
LW
link
[Question]
Many Gods refutation and Instrumental Goals. (Proper one)
aditya malik
Aug 9, 2022, 11:59 AM
0
points
15
comments
1
min read
LW
link
Content generation. Where do we draw the line?
Q Home
Aug 9, 2022, 10:51 AM
6
points
7
comments
2
min read
LW
link
[Question]
What are some alternatives to Shapley values which drop additivity?
eapi
Aug 9, 2022, 9:16 AM
11
points
6
comments
1
min read
LW
link
(math.stackexchange.com)
Radio Bostrom: Audio narrations of papers by Nick Bostrom
PeterH
Aug 9, 2022, 8:56 AM
12
points
0
comments
2
min read
LW
link
(forum.effectivealtruism.org)
Team Shard Status Report
David Udell
Aug 9, 2022, 5:33 AM
38
points
8
comments
3
min read
LW
link
Announcing: Mechanism Design for AI Safety—Reading Group
Rubi J. Hudson
Aug 9, 2022, 4:21 AM
18
points
3
comments
4
min read
LW
link
[Question]
What are some Works that might be useful but are difficult, so forgotten?
TekhneMakre
Aug 9, 2022, 2:22 AM
10
points
5
comments
1
min read
LW
link
Project proposal: Testing the IBP definition of agent
Jeremy Gillen, Thomas Larsen and JamesH
Aug 9, 2022, 1:09 AM
21
points
4
comments
2
min read
LW
link
How (not) to choose a research project
Garrett Baker, CatGoddess and Johannes C. Mayer
Aug 9, 2022, 12:26 AM
79
points
11
comments
7
min read
LW
link
[Question]
Are ya winning, son?
Nathan1123
Aug 9, 2022, 12:06 AM
14
points
13
comments
2
min read
LW
link
General alignment properties
TurnTrout
Aug 8, 2022, 11:40 PM
51
points
2
comments
1
min read
LW
link
Experiment: Be my math tutor?
sudo
Aug 8, 2022, 10:50 PM
12
points
5
comments
1
min read
LW
link
Encultured AI, Part 1 Appendix: Relevant Research Examples
Andrew_Critch and Nick Hay
Aug 8, 2022, 10:44 PM
11
points
1
comment
7
min read
LW
link
Encultured AI Pre-planning, Part 1: Enabling New Benchmarks
Andrew_Critch and Nick Hay
Aug 8, 2022, 10:44 PM
63
points
2
comments
6
min read
LW
link
Broad Basins and Data Compression
Jeremy Gillen, Stephen Fowler and Thomas Larsen
Aug 8, 2022, 8:33 PM
33
points
6
comments
7
min read
LW
link
Interpretability/Tool-ness/Alignment/Corrigibility are not Composable
johnswentworth
Aug 8, 2022, 6:05 PM
143
points
13
comments
3
min read
LW
link
LW Meetup @ DEFCON (Las Vegas) − 5-7pm Thu. Aug. 11 at Forum Food Court (Caesars)
jchan
Aug 8, 2022, 2:57 PM
6
points
0
comments
1
min read
LW
link
A sufficiently paranoid paperclip maximizer
RomanS
Aug 8, 2022, 11:17 AM
18
points
10
comments
2
min read
LW
link
[Question]
Instrumental Goals and Many Gods Refutation
aditya malik
Aug 8, 2022, 10:46 AM
−10
points
4
comments
1
min read
LW
link
Area under the curve, Eat Dirt, Broccoli Errors, Copernicus & Chaos
CFAR!Duncan
Aug 8, 2022, 8:17 AM
41
points
0
comments
7
min read
LW
link
Steganography in Chain of Thought Reasoning
A Ray
8 Aug 2022 3:47 UTC
62
points
13
comments
6
min read
LW
link
How Deadly Will Roughly-Human-Level AGI Be?
David Udell
8 Aug 2022 1:59 UTC
12
points
6
comments
1
min read
LW
link
[Question]
Can we get full audio for Eliezer’s conversation with Sam Harris?
JakubK
7 Aug 2022 20:35 UTC
30
points
8
comments
1
min read
LW
link
Complexity No Bar to AI (Or, why Computational Complexity matters less than you think for real life problems)
Noosphere89
7 Aug 2022 19:55 UTC
17
points
14
comments
3
min read
LW
link
(www.gwern.net)
The lessons of Xanadu
jasoncrawford
7 Aug 2022 17:59 UTC
110
points
20
comments
8
min read
LW
link
(jasoncrawford.org)
Careful with Caching
jefftk
7 Aug 2022 15:20 UTC
15
points
3
comments
1
min read
LW
link
(www.jefftk.com)
[Question]
How would Logical Decision Theories address the Psychopath Button?
Nathan1123
7 Aug 2022 15:19 UTC
5
points
33
comments
1
min read
LW
link
Jack Clark on the realities of AI policy
Kaj_Sotala
7 Aug 2022 8:44 UTC
68
points
3
comments
3
min read
LW
link
(threadreaderapp.com)
Expected (Social) Value
algrthms
7 Aug 2022 8:16 UTC
5
points
2
comments
3
min read
LW
link
Lamentations, Gaza and Empathy
Yair Halberstadt
7 Aug 2022 7:55 UTC
20
points
2
comments
3
min read
LW
link
Paper reading as a Cargo Cult
jem-mosig
7 Aug 2022 7:50 UTC
70
points
10
comments
5
min read
LW
link
Most Ivy-smart students aren’t at Ivy-tier schools
Aaron Bergman
7 Aug 2022 3:18 UTC
82
points
7
comments
8
min read
LW
link
(www.aaronbergman.net)