[Question] How to bet against civilizational adequacy? · Wei Dai · Aug 12, 2022, 11:33 PM · 54 points · 20 comments · 1 min read · LW link
Infant AI Scenario · Nathan1123 · Aug 12, 2022, 9:20 PM · 1 point · 0 comments · 3 min read · LW link
DeepMind alignment team opinions on AGI ruin arguments · Vika · Aug 12, 2022, 9:06 PM · 395 points · 37 comments · 14 min read · LW link · 1 review
Dissolve: The Petty Crimes of Blaise Pascal · SebastianG · Aug 12, 2022, 8:04 PM · 17 points · 4 comments · 6 min read · LW link
The Host Minds of HBO’s Westworld. · Nerret · Aug 12, 2022, 6:53 PM · 1 point · 0 comments · 3 min read · LW link
What is estimational programming? Squiggle in context · Quinn · Aug 12, 2022, 6:39 PM · 14 points · 7 comments · 7 min read · LW link
Oversight Misses 100% of Thoughts The AI Does Not Think · johnswentworth · Aug 12, 2022, 4:30 PM · 110 points · 49 comments · 1 min read · LW link
Timelines explanation post part 1 of ? · Nathan Helm-Burger · Aug 12, 2022, 4:13 PM · 10 points · 1 comment · 2 min read · LW link
A little playing around with Blenderbot3 · Nathan Helm-Burger · Aug 12, 2022, 4:06 PM · 9 points · 0 comments · 1 min read · LW link
Refining the Sharp Left Turn threat model, part 1: claims and mechanisms · Vika, Vikrant Varma, Ramana Kumar and Mary Phuong · Aug 12, 2022, 3:17 PM · 86 points · 4 comments · 3 min read · LW link · 1 review · (vkrakovna.wordpress.com)
Argument by Intellectual Ordeal · lc · Aug 12, 2022, 1:03 PM · 26 points · 5 comments · 5 min read · LW link
Anti-squatted AI x-risk domains index · plex · Aug 12, 2022, 12:01 PM · 59 points · 6 comments · 1 min read · LW link
[Question] Perfect Predictors · aditya malik · Aug 12, 2022, 11:51 AM · 2 points · 5 comments · 1 min read · LW link
[Question] What are some good arguments against building new nuclear power plants? · RomanS · Aug 12, 2022, 7:32 AM · 16 points · 15 comments · 2 min read · LW link
Seeking PCK (Pedagogical Content Knowledge) · CFAR!Duncan · Aug 12, 2022, 4:15 AM · 62 points · 11 comments · 5 min read · LW link
Artificial intelligence wireheading · Big Tony · Aug 12, 2022, 3:06 AM · 5 points · 2 comments · 1 min read · LW link
Dissected boxed AI · Nathan1123 · Aug 12, 2022, 2:37 AM · −8 points · 2 comments · 1 min read · LW link
Troll Timers · Screwtape · Aug 12, 2022, 12:55 AM · 29 points · 13 comments · 4 min read · LW link
[Question] Seriously, what goes wrong with “reward the agent when it makes you smile”? · TurnTrout · Aug 11, 2022, 10:22 PM · 87 points · 43 comments · 2 min read · LW link
Encultured AI Pre-planning, Part 2: Providing a Service · Andrew_Critch and Nick Hay · Aug 11, 2022, 8:11 PM · 33 points · 4 comments · 3 min read · LW link
My summary of the alignment problem · Peter Hroššo · Aug 11, 2022, 7:42 PM · 15 points · 3 comments · 2 min read · LW link · (threadreaderapp.com)
Language models seem to be much better than humans at next-token prediction · Buck, Fabien Roger and LawrenceC · Aug 11, 2022, 5:45 PM · 182 points · 60 comments · 13 min read · LW link · 1 review
Introducing Pastcasting: A tool for forecasting practice · Sage Future · Aug 11, 2022, 5:38 PM · 95 points · 10 comments · 2 min read · LW link · 2 reviews
Pendulums, Policy-Level Decisionmaking, Saving State · CFAR!Duncan · Aug 11, 2022, 4:47 PM · 30 points · 3 comments · 8 min read · LW link
Covid 8/11/22: The End Is Never The End · Zvi · Aug 11, 2022, 4:20 PM · 28 points · 11 comments · 16 min read · LW link · (thezvi.wordpress.com)
Singapore—Small casual dinner in Chinatown #4 · Joe Rocca · Aug 11, 2022, 12:30 PM · 3 points · 3 comments · 1 min read · LW link
Thoughts on the good regulator theorem · JonasMoss · Aug 11, 2022, 12:08 PM · 12 points · 0 comments · 4 min read · LW link
How and why to turn everything into audio · KatWoods and AmberDawn · Aug 11, 2022, 8:55 AM · 55 points · 20 comments · 5 min read · LW link
Shard Theory: An Overview · David Udell · Aug 11, 2022, 5:44 AM · 166 points · 34 comments · 10 min read · LW link
[Question] Do advancements in Decision Theory point towards moral absolutism? · Nathan1123 · Aug 11, 2022, 12:59 AM · 0 points · 4 comments · 4 min read · LW link
The alignment problem from a deep learning perspective · Richard_Ngo · Aug 10, 2022, 10:46 PM · 107 points · 15 comments · 27 min read · LW link · 1 review
How much alignment data will we need in the long run? · Jacob_Hilton · Aug 10, 2022, 9:39 PM · 37 points · 15 comments · 4 min read · LW link
On Ego, Reincarnation, Consciousness and The Universe · qmaury · Aug 10, 2022, 8:21 PM · −3 points · 6 comments · 5 min read · LW link
Formalizing Alignment · Marv K · Aug 10, 2022, 6:50 PM · 4 points · 0 comments · 2 min read · LW link
How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It) · Peter S. Park, NickyP and Stephen Fowler · Aug 10, 2022, 6:14 PM · 28 points · 30 comments · 11 min read · LW link
Emergent Abilities of Large Language Models [Linkpost] · aog · Aug 10, 2022, 6:02 PM · 25 points · 2 comments · 1 min read · LW link · (arxiv.org)
How To Go From Interpretability To Alignment: Just Retarget The Search · johnswentworth · Aug 10, 2022, 4:08 PM · 209 points · 34 comments · 3 min read · LW link · 1 review
Using GPT-3 to augment human intelligence · Henrik Karlsson · Aug 10, 2022, 3:54 PM · 52 points · 8 comments · 18 min read · LW link · (escapingflatland.substack.com)
ACX meetup [August] · sallatik · Aug 10, 2022, 9:54 AM · 1 point · 1 comment · 1 min read · LW link
Dissent Collusion · Screwtape · Aug 10, 2022, 2:43 AM · 30 points · 7 comments · 3 min read · LW link
The Medium Is The Bandage · party girl · Aug 10, 2022, 1:45 AM · 11 points · 0 comments · 10 min read · LW link
[Question] Why is increasing public awareness of AI safety not a priority? · FinalFormal2 · Aug 10, 2022, 1:28 AM · −5 points · 14 comments · 1 min read · LW link
Manifold x CSPI $25k Forecasting Tournament · David Chee · Aug 9, 2022, 9:13 PM · 5 points · 0 comments · 1 min read · LW link · (www.cspicenter.com)
Proposal: Consider not using distance-direction-dimension words in abstract discussions · moridinamael · Aug 9, 2022, 8:44 PM · 46 points · 18 comments · 5 min read · LW link
[Question] How would two superintelligent AIs interact, if they are unaligned with each other? · Nathan1123 · Aug 9, 2022, 6:58 PM · 4 points · 6 comments · 1 min read · LW link
Disagreements about Alignment: Why, and how, we should try to solve them · ojorgensen · Aug 9, 2022, 6:49 PM · 11 points · 2 comments · 16 min read · LW link
Progress links and tweets, 2022-08-09 · jasoncrawford · Aug 9, 2022, 5:35 PM · 11 points · 3 comments · 1 min read · LW link · (rootsofprogress.org)
[Question] Is it possible to find venture capital for AI research org with strong safety focus? · AnonResearch · Aug 9, 2022, 4:12 PM · 6 points · 1 comment · 1 min read · LW link
[Question] Many Gods refutation and Instrumental Goals. (Proper one) · aditya malik · Aug 9, 2022, 11:59 AM · 0 points · 15 comments · 1 min read · LW link
Content generation. Where do we draw the line? · Q Home · Aug 9, 2022, 10:51 AM · 6 points · 7 comments · 2 min read · LW link