Archive: Page 2
Don’t expect your life partner to be better than your exes in more than one way: a mathematical model · mdd · Oct 29, 2022, 6:47 PM · 7 points · 1 comment · 9 min read · LW link
The Social Recession: By the Numbers · antonomon · Oct 29, 2022, 6:45 PM · 165 points · 29 comments · 8 min read · LW link (novum.substack.com)
Electric Kettle vs Stove · jefftk · Oct 29, 2022, 12:50 PM · 18 points · 7 comments · 1 min read · LW link (www.jefftk.com)
Quantum Immortality, foiled · Ben · Oct 29, 2022, 11:00 AM · 27 points · 4 comments · 2 min read · LW link
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small · RowanWang, Alexandre Variengien, Arthur Conmy, Buck and jsteinhardt · Oct 28, 2022, 11:55 PM · 101 points · 9 comments · 9 min read · LW link · 2 reviews (arxiv.org)
Resources that (I think) new alignment researchers should know about · Orpheus16 · Oct 28, 2022, 10:13 PM · 70 points · 9 comments · 4 min read · LW link
How often does One Person succeed? · Mayank Modi · Oct 28, 2022, 7:32 PM · 1 point · 3 comments · LW link
aisafety.community—A living document of AI safety communities · zeshen and plex · Oct 28, 2022, 5:50 PM · 58 points · 23 comments · 1 min read · LW link
Rapid Test Throat Swabbing? · jefftk · Oct 28, 2022, 4:30 PM · 18 points · 2 comments · 1 min read · LW link (www.jefftk.com)
Join the interpretability research hackathon · Esben Kran · Oct 28, 2022, 4:26 PM · 15 points · 0 comments · LW link
Syncretism · Annapurna · Oct 28, 2022, 4:08 PM · 16 points · 4 comments · 1 min read · LW link (jorgevelez.substack.com)
Pondering computation in the real world · Adam Shai · Oct 28, 2022, 3:57 PM · 24 points · 13 comments · 5 min read · LW link
Ukraine and the Crimea Question · ChristianKl · Oct 28, 2022, 12:26 PM · −2 points · 153 comments · 11 min read · LW link
New book on s-risks · Tobias_Baumann · Oct 28, 2022, 9:36 AM · 68 points · 1 comment · LW link
Cryptic symbols · Adam Scherlis · Oct 28, 2022, 6:44 AM · 6 points · 17 comments · 1 min read · LW link (adam.scherlis.com)
All life’s helpers’ beliefs · Tehdastehdas · Oct 28, 2022, 5:47 AM · −12 points · 1 comment · 5 min read · LW link
Prizes for ML Safety Benchmark Ideas · joshc · Oct 28, 2022, 2:51 AM · 36 points · 5 comments · 1 min read · LW link
Worldview iPeople—Future Fund’s AI Worldview Prize · Toni MUENDEL · Oct 28, 2022, 1:53 AM · −22 points · 4 comments · 9 min read · LW link
Anatomy of change · Jose Miguel Cruz y Celis · Oct 28, 2022, 1:21 AM · 1 point · 0 comments · 1 min read · LW link
Nash equilibria of symmetric zero-sum games · Ege Erdil · Oct 27, 2022, 11:50 PM · 14 points · 0 comments · 14 min read · LW link
[Question] Good psychology books/books that contain good psychological models? · shuffled-cantaloupe · Oct 27, 2022, 11:04 PM · 1 point · 1 comment · 1 min read · LW link
Podcast: The Left and Effective Altruism with Habiba Islam · garrison · Oct 27, 2022, 5:41 PM · 2 points · 2 comments · LW link
Lessons from ‘Famine, Affluence, and Morality’ and its reflection on today. · Mayank Modi · Oct 27, 2022, 5:20 PM · 4 points · 0 comments · LW link
[Question] Is the Orthogonality Thesis true for humans? · Noosphere89 · Oct 27, 2022, 2:41 PM · 12 points · 20 comments · 1 min read · LW link
Historicism in the math-adjacent sciences · mrcbarbier · Oct 27, 2022, 2:38 PM · 3 points · 0 comments · 5 min read · LW link
How Risky Is Trick-or-Treating? · jefftk · Oct 27, 2022, 2:10 PM · 58 points · 18 comments · 2 min read · LW link (www.jefftk.com)
Covid 10/27/22: Another Origin Story · Zvi · Oct 27, 2022, 1:40 PM · 32 points · 1 comment · 13 min read · LW link (thezvi.wordpress.com)
[Question] Why are probabilities represented as real numbers instead of rational numbers? · Yaakov T · Oct 27, 2022, 11:23 AM · 5 points · 9 comments · 1 min read · LW link
Five Areas I Wish EAs Gave More Focus · Prometheus · Oct 27, 2022, 6:13 AM · 13 points · 18 comments · LW link
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley · maxnadeau, Xander Davies, Buck and Nate Thomas · Oct 27, 2022, 1:32 AM · 135 points · 14 comments · 12 min read · LW link
[Question] Quantum Suicide and Aumann’s Agreement Theorem · Isaac King · Oct 27, 2022, 1:32 AM · 14 points · 20 comments · 1 min read · LW link
Reslab Request for Information: EA hardware projects · Joel Becker · Oct 26, 2022, 9:13 PM · 10 points · 0 comments · LW link
A list of Petrov buttons · philh · Oct 26, 2022, 8:50 PM · 19 points · 8 comments · 5 min read · LW link (reasonableapproximation.net)
The Game of Antonyms · Faustify · Oct 26, 2022, 7:26 PM · 4 points · 6 comments · 8 min read · LW link
Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind] · LawrenceC · Oct 26, 2022, 6:45 PM · 29 points · 5 comments · 1 min read · LW link (arxiv.org)
[Question] How to become more articulate? · just_browsing · Oct 26, 2022, 2:43 PM · 19 points · 14 comments · 1 min read · LW link
Open Bands: Leading Rhythm · jefftk · Oct 26, 2022, 2:30 PM UTC · 10 points · 0 comments · 4 min read · LW link (www.jefftk.com)
Signals of war in August 2021 · yieldthought · Oct 26, 2022, 8:11 AM UTC · 70 points · 16 comments · 2 min read · LW link
Trigger-based rapid checklists · VipulNaik · Oct 26, 2022, 4:05 AM UTC · 44 points · 0 comments · 9 min read · LW link
Why some people believe in AGI, but I don’t. · cveres · Oct 26, 2022, 3:09 AM UTC · −15 points · 6 comments · LW link
Intent alignment should not be the goal for AGI x-risk reduction · John Nay · Oct 26, 2022, 1:24 AM UTC · 1 point · 10 comments · 3 min read · LW link
Reinforcement Learning Goal Misgeneralization: Can we guess what kind of goals are selected by default? · StefanHex and Julian_R · Oct 25, 2022, 8:48 PM UTC · 15 points · 2 comments · 4 min read · LW link
A Walkthrough of A Mathematical Framework for Transformer Circuits · Neel Nanda · Oct 25, 2022, 8:24 PM UTC · 52 points · 7 comments · 1 min read · LW link (www.youtube.com)
Nothing. · rogersbacon · Oct 25, 2022, 4:33 PM UTC · −10 points · 4 comments · 6 min read · LW link (www.secretorum.life)
Maps and Blueprint; the Two Sides of the Alignment Equation · Nora_Ammann · Oct 25, 2022, 4:29 PM UTC · 24 points · 1 comment · 5 min read · LW link
Consider Applying to the Future Fellowship at MIT · jefftk · Oct 25, 2022, 3:40 PM UTC · 29 points · 0 comments · 1 min read · LW link (www.jefftk.com)
Beyond Kolmogorov and Shannon · Alexander Gietelink Oldenziel and Adam Shai · Oct 25, 2022, 3:13 PM UTC · 63 points · 22 comments · 5 min read · LW link
What does it take to defend the world against out-of-control AGIs? · Steven Byrnes · Oct 25, 2022, 2:47 PM UTC · 208 points · 49 comments · 30 min read · LW link · 1 review
Refine: what helped me write more? · Alexander Gietelink Oldenziel · Oct 25, 2022, 2:44 PM UTC · 12 points · 0 comments · 2 min read · LW link
Logical Decision Theories: Our final failsafe? · Noosphere89 · Oct 25, 2022, 12:51 PM UTC · −7 points · 8 comments · 1 min read · LW link (www.lesswrong.com)