Against Active Shooter Drills
Zvi
Jun 16, 2022, 1:40 PM
91
points
30
comments
7
min read
LW
link
(thezvi.wordpress.com)
Announcing the Alignment of Complex Systems Research Group
Jan_Kulveit
and
technicalities
Jun 4, 2022, 4:10 AM
91
points
20
comments
5
min read
LW
link
The “mind-body vicious cycle” model of RSI & back pain
Steven Byrnes
Jun 9, 2022, 12:30 PM
91
points
32
comments
12
min read
LW
link
I’m trying out “asteroid mindset”
Alex_Altair
Jun 3, 2022, 1:35 PM
90
points
5
comments
4
min read
LW
link
In defense of flailing, with foreword by Bill Burr
lc
Jun 17, 2022, 4:40 PM
88
points
6
comments
4
min read
LW
link
I applied for a MIRI job in 2020. Here’s what happened next.
ViktoriaMalyasova
Jun 15, 2022, 7:37 PM
86
points
17
comments
7
min read
LW
link
Causal confusion as an argument against the scaling hypothesis
RobertKirk
and
David Scott Krueger (formerly: capybaralet)
Jun 20, 2022, 10:54 AM
86
points
30
comments
15
min read
LW
link
Transcript of a Twitter Discussion on EA from June 2022
Zvi
Jun 6, 2022, 1:50 PM
85
points
4
comments
1
min read
LW
link
(thezvi.wordpress.com)
Air Conditioner Test Results & Discussion
johnswentworth
Jun 22, 2022, 10:26 PM
82
points
42
comments
6
min read
LW
link
Air Conditioner Repair
Zvi
Jun 27, 2022, 12:40 PM
81
points
34
comments
4
min read
LW
link
(thezvi.wordpress.com)
Reinventing the wheel
jasoncrawford
Jun 4, 2022, 10:39 PM
78
points
13
comments
2
min read
LW
link
(rootsofprogress.org)
AI Training Should Allow Opt-Out
alyssavance
Jun 23, 2022, 1:33 AM
76
points
13
comments
6
min read
LW
link
A Quick List of Some Problems in AI Alignment As A Field
Nicholas / Heather Kross
Jun 21, 2022, 11:23 PM
75
points
12
comments
6
min read
LW
link
(www.thinkingmuchbetter.com)
Worked Examples of Shapley Values
lalaithion
Jun 24, 2022, 5:13 PM
75
points
11
comments
8
min read
LW
link
Some reflections on the LW community after several months of active engagement
M. Y. Zuo
Jun 25, 2022, 5:04 PM
72
points
40
comments
4
min read
LW
link
Feature request: voting buttons at the bottom?
Oliver Sourbut
Jun 24, 2022, 2:41 PM
71
points
12
comments
1
min read
LW
link
Book Review: Talent
Zvi
Jun 3, 2022, 8:10 PM
70
points
19
comments
79
min read
LW
link
(thezvi.wordpress.com)
Eliciting Latent Knowledge (ELK) - Distillation/Summary
Marius Hobbhahn
Jun 8, 2022, 1:18 PM
69
points
2
comments
21
min read
LW
link
How to pursue a career in technical AI alignment
Charlie Rogers-Smith
Jun 4, 2022, 9:11 PM
69
points
1
comment
39
min read
LW
link
Resources I send to AI researchers about AI safety
Vael Gates
Jun 14, 2022, 2:24 AM
69
points
12
comments
1
min read
LW
link
Epistemological Vigilance for Alignment
adamShimi
Jun 6, 2022, 12:27 AM
66
points
11
comments
10
min read
LW
link
Seven ways to become unstoppably agentic
Evie Cottrell
Jun 26, 2022, 5:39 PM
64
points
16
comments
8
min read
LW
link
[Question]
Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?
P.
Jun 8, 2022, 10:26 PM
64
points
51
comments
4
min read
LW
link
Half-baked AI Safety ideas thread
Aryeh Englander
Jun 23, 2022, 4:11 PM
64
points
63
comments
1
min read
LW
link
“Brain enthusiasts” in AI Safety
Jan
and
Samuel Nellessen
Jun 18, 2022, 9:59 AM
63
points
5
comments
10
min read
LW
link
(universalprior.substack.com)
Ten experiments in modularity, which we’d like you to run!
CallumMcDougall
,
Lucius Bushnaq
and
Avery
Jun 16, 2022, 9:17 AM
62
points
3
comments
9
min read
LW
link
[Question]
What’s the contingency plan if we get AGI tomorrow?
Yitz
Jun 23, 2022, 3:10 AM
61
points
23
comments
1
min read
LW
link
Open Problems in AI X-Risk [PAIS #5]
Dan H
and
TW123
Jun 10, 2022, 2:08 AM
61
points
6
comments
36
min read
LW
link
How Do Selection Theorems Relate To Interpretability?
johnswentworth
Jun 9, 2022, 7:39 PM
60
points
14
comments
3
min read
LW
link
A short conceptual explainer of Immanuel Kant’s Critique of Pure Reason
jessicata
Jun 3, 2022, 1:06 AM
57
points
12
comments
16
min read
LW
link
(unstableontology.com)
Covid 6/2/22: Declining to Respond
Zvi
Jun 2, 2022, 1:50 PM
55
points
10
comments
7
min read
LW
link
(thezvi.wordpress.com)
Kurzgesagt – The Last Human (Youtube)
habryka
Jun 29, 2022, 3:28 AM
54
points
7
comments
1
min read
LW
link
(www.youtube.com)
How To: A Workshop (or anything)
Duncan Sabien (Inactive)
Jun 12, 2022, 8:00 AM
53
points
13
comments
37
min read
LW
link
1
review
[Link] OpenAI: Learning to Play Minecraft with Video PreTraining (VPT)
Aryeh Englander
Jun 23, 2022, 4:29 PM
53
points
3
comments
1
min read
LW
link
Paradigms of AI alignment: components and enablers
Vika
Jun 2, 2022, 6:19 AM
53
points
4
comments
8
min read
LW
link
How fast can we perform a forward pass?
jsteinhardt
Jun 10, 2022, 11:30 PM
53
points
9
comments
15
min read
LW
link
(bounded-regret.ghost.io)
The horror of what must, yet cannot, be true
Kaj_Sotala
Jun 2, 2022, 10:20 AM
52
points
18
comments
2
min read
LW
link
(kajsotala.fi)
Latent Adversarial Training
Adam Jermyn
Jun 29, 2022, 8:04 PM
52
points
13
comments
5
min read
LW
link
What’s it like to have sex with Duncan?
Duncan Sabien (Inactive)
Jun 17, 2022, 2:32 AM
52
points
19
comments
17
min read
LW
link
Perils of optimizing in social contexts
owencb
Jun 16, 2022, 5:40 PM
50
points
1
comment
2
min read
LW
link
Our mental building blocks are more different than I thought
Marius Hobbhahn
Jun 15, 2022, 11:07 AM
50
points
11
comments
14
min read
LW
link
Child Contracting
jefftk
Jun 26, 2022, 2:30 AM
48
points
2
comments
1
min read
LW
link
(www.jefftk.com)
Poorly-Aimed Death Rays
Thane Ruthenis
11 Jun 2022 18:29 UTC
48
points
5
comments
4
min read
LW
link
Pitching an Alignment Softball
mu_(negative)
7 Jun 2022 4:10 UTC
47
points
13
comments
10
min read
LW
link
Why so little AI risk on rationalist-adjacent blogs?
Grant Demaree
13 Jun 2022 6:31 UTC
46
points
23
comments
8
min read
LW
link
[Link] Childcare : what the science says
Gunnar_Zarncke
24 Jun 2022 21:45 UTC
46
points
4
comments
1
min read
LW
link
(criticalscience.medium.com)
Summary of “AGI Ruin: A List of Lethalities”
Stephen McAleese
10 Jun 2022 22:35 UTC
45
points
2
comments
8
min read
LW
link
Dagger of Detect Evil
lsusr
21 Jun 2022 6:23 UTC
45
points
22
comments
3
min read
LW
link
Continuity Assumptions
Jan_Kulveit
13 Jun 2022 21:31 UTC
44
points
13
comments
4
min read
LW
link
FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community
Darren McKee
15 Jun 2022 18:08 UTC
43
points
15
comments
2
min read
LW
link