My hour of memoryless lucidity
Eric Neyman
May 4, 2024, 1:40 AM
369
points
35
comments
5
min read
LW
link
(ericneyman.wordpress.com)
Notifications Received in 30 Minutes of Class
tanagrabeast
May 26, 2024, 5:02 PM
356
points
16
comments
8
min read
LW
link
MIRI 2024 Communications Strategy
Gretta Duleba
May 29, 2024, 7:33 PM
325
points
216
comments
7
min read
LW
link
Non-Disparagement Canaries for OpenAI
aysja
and
Adam Scholl
May 30, 2024, 7:20 PM
288
points
51
comments
2
min read
LW
link
Truthseeking is the ground in which other principles grow
Elizabeth
May 27, 2024, 1:09 AM
248
points
16
comments
16
min read
LW
link
Ilya Sutskever and Jan Leike resign from OpenAI [updated]
Zach Stein-Perlman
May 15, 2024, 12:45 AM
246
points
95
comments
2
min read
LW
link
AI companies aren’t really using external evaluators
Zach Stein-Perlman
May 24, 2024, 4:01 PM
242
points
15
comments
4
min read
LW
link
OpenAI: Fallout
Zvi
May 28, 2024, 1:20 PM
204
points
25
comments
36
min read
LW
link
(thezvi.wordpress.com)
Jaan Tallinn’s 2023 Philanthropy Overview
jaan
May 20, 2024, 12:11 PM
203
points
5
comments
1
min read
LW
link
(jaan.info)
Maybe Anthropic’s Long-Term Benefit Trust is powerless
Zach Stein-Perlman
May 27, 2024, 1:00 PM
202
points
21
comments
2
min read
LW
link
What’s Going on With OpenAI’s Messaging?
ozziegooen
May 21, 2024, 2:22 AM
191
points
13
comments
LW
link
DeepMind’s “Frontier Safety Framework” is weak and unambitious
Zach Stein-Perlman
May 18, 2024, 3:00 AM
159
points
14
comments
4
min read
LW
link
Deep Honesty
Aletheophile
May 7, 2024, 8:31 PM
159
points
25
comments
9
min read
LW
link
Language Models Model Us
eggsyntax
May 17, 2024, 9:00 PM
158
points
55
comments
7
min read
LW
link
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
scasper
May 21, 2024, 8:15 PM
157
points
16
comments
3
min read
LW
link
Dyslucksia
Shoshannah Tekofsky
May 9, 2024, 7:21 PM
154
points
45
comments
6
min read
LW
link
OpenAI: Exodus
Zvi
May 20, 2024, 1:10 PM
153
points
26
comments
44
min read
LW
link
(thezvi.wordpress.com)
Value Claims (In Particular) Are Usually Bullshit
johnswentworth
May 30, 2024, 6:26 AM
144
points
18
comments
2
min read
LW
link
The Pearly Gates
lsusr
May 30, 2024, 4:01 AM
127
points
6
comments
3
min read
LW
link
Awakening
lsusr
May 30, 2024, 7:03 AM
123
points
79
comments
9
min read
LW
link
Do you believe in hundred dollar bills lying on the ground? Consider humming
Elizabeth
May 16, 2024, 12:00 AM
122
points
22
comments
6
min read
LW
link
(acesounderglass.com)
[Question]
Which skincare products are evidence-based?
Vanessa Kosoy
May 2, 2024, 3:22 PM
120
points
48
comments
1
min read
LW
link
Talent Needs of Technical AI Safety Teams
yams
,
Carson Jones
,
McKennaFitzgerald
and
Ryan Kidd
May 24, 2024, 12:36 AM
117
points
65
comments
14
min read
LW
link
introduction to cancer vaccines
bhauth
May 5, 2024, 1:06 AM
113
points
19
comments
5
min read
LW
link
(www.bhauth.com)
Key takeaways from our EA and alignment research surveys
Cameron Berg
,
Judd Rosenblatt
,
florin_pop
and
AE Studio
May 3, 2024, 6:10 PM
111
points
10
comments
21
min read
LW
link
Clarifying METR’s Auditing Role
Beth Barnes
May 30, 2024, 6:41 PM
108
points
1
comment
2
min read
LW
link
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq
,
jake_mendel
,
Dan Braun
,
StefanHex
,
Nicholas Goldowsky-Dill
,
Kaarel
,
Avery
,
Joern Stoehler
,
debrevitatevitae
,
Magdalena Wache
and
Marius Hobbhahn
May 20, 2024, 5:53 PM
105
points
4
comments
3
min read
LW
link
Response to nostalgebraist: proudly waving my moral-antirealist battle flag
Steven Byrnes
May 29, 2024, 4:48 PM
103
points
29
comments
11
min read
LW
link
Advice for Activists from the History of Environmentalism
Jeffrey Heninger
May 16, 2024, 6:40 PM
100
points
8
comments
6
min read
LW
link
(blog.aiimpacts.org)
We might be missing some key feature of AI takeoff; it’ll probably seem like “we could’ve seen this coming”
Lukas_Gloor
May 9, 2024, 3:43 PM
98
points
36
comments
5
min read
LW
link
Explaining a Math Magic Trick
Robert_AIZI
May 5, 2024, 7:41 PM
97
points
10
comments
5
min read
LW
link
Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant
Olli Järviniemi
and
evhub
May 6, 2024, 7:07 AM
95
points
13
comments
1
min read
LW
link
(arxiv.org)
I am the Golden Gate Bridge
Zvi
May 27, 2024, 2:40 PM
95
points
6
comments
27
min read
LW
link
(thezvi.wordpress.com)
[Question]
How to get nerds fascinated about mysterious chronic illness research?
riceissa
May 27, 2024, 10:58 PM
95
points
50
comments
2
min read
LW
link
Apollo Research 1-year update
Marius Hobbhahn
,
Lee Sharkey
,
Lucius Bushnaq
,
Dan Braun
,
Mikita Balesni
,
Jérémy Scheurer
,
Nicholas Goldowsky-Dill
,
StefanHex
,
jake_mendel
,
AlexMeinke
and
rusheb
May 29, 2024, 5:44 PM
93
points
0
comments
7
min read
LW
link
“AI Safety for Fleshy Humans” an AI Safety explainer by Nicky Case
habryka
May 3, 2024, 6:10 PM
90
points
11
comments
4
min read
LW
link
(aisafety.dance)
Teaching CS During Take-Off
andrew carle
May 14, 2024, 10:45 PM
89
points
13
comments
2
min read
LW
link
Hardshipification
Jonathan Moregård
May 28, 2024, 8:02 PM
88
points
17
comments
2
min read
LW
link
(honestliving.substack.com)
Review: Conor Moreton’s “Civilization & Cooperation”
Duncan Sabien (Deactivated)
May 26, 2024, 7:32 PM
88
points
8
comments
38
min read
LW
link
MATS Winter 2023-24 Retrospective
utilistrutil
,
LauraVaughan
,
McKennaFitzgerald
,
Christian Smith
,
Juan Gil
,
Henry Sleight
,
Matthew Wearden
and
Ryan Kidd
May 11, 2024, 12:09 AM
86
points
28
comments
49
min read
LW
link
OpenAI: Helen Toner Speaks
Zvi
May 30, 2024, 9:10 PM
86
points
8
comments
13
min read
LW
link
(thezvi.wordpress.com)
Environmentalism in the United States Is Unusually Partisan
Jeffrey Heninger
May 13, 2024, 9:23 PM
85
points
26
comments
4
min read
LW
link
(blog.aiimpacts.org)
AISafety.com – Resources for AI Safety
Søren Elverlin
,
plex
,
Bryce Robertson
and
Melissa Samworth
May 17, 2024, 3:57 PM
82
points
3
comments
1
min read
LW
link
My thesis (Algorithmic Bayesian Epistemology) explained in more depth
Eric Neyman
May 9, 2024, 7:43 PM
82
points
4
comments
27
min read
LW
link
(ericneyman.wordpress.com)
New voluntary commitments (AI Seoul Summit)
Zach Stein-Perlman
May 21, 2024, 11:00 AM
81
points
17
comments
7
min read
LW
link
(www.gov.uk)
Reward hacking behavior can generalize across tasks
Kei
,
Isaac Dunn
,
Henry Sleight
,
Miles Turpin
,
evhub
,
Carson Denison
and
Ethan Perez
May 28, 2024, 4:33 PM
79
points
5
comments
21
min read
LW
link
LessWrong Community Weekend 2024, open for applications
UnplannedCauliflower
and
jt
1 May 2024 10:18 UTC
79
points
2
comments
7
min read
LW
link
Instruction-following AGI is easier and more likely than value aligned AGI
Seth Herd
15 May 2024 19:38 UTC
79
points
28
comments
12
min read
LW
link
MIRI’s May 2024 Newsletter
Harlan
15 May 2024 0:13 UTC
79
points
1
comment
3
min read
LW
link
(intelligence.org)
ACX Covid Origins Post convinced readers
ErnestScribbler
1 May 2024 13:06 UTC
77
points
7
comments
2
min read
LW
link