Archive: 2022 · Page 3
dalle2 comments · nostalgebraist · Apr 26, 2022, 5:30 AM · 183 points · 14 comments · 13 min read · LW link · (nostalgebraist.tumblr.com)
Look For Principles Which Will Carry Over To The Next Paradigm · johnswentworth · Jan 14, 2022, 8:22 PM · 182 points · 7 comments · 5 min read · LW link · 1 review
Language models seem to be much better than humans at next-token prediction · Buck, Fabien Roger and LawrenceC · Aug 11, 2022, 5:45 PM · 182 points · 60 comments · 13 min read · LW link · 1 review
Conjecture: a retrospective after 8 months of work · Connor Leahy, Sid Black, Gabriel Alfour and Chris Scammell · Nov 23, 2022, 5:10 PM · 180 points · 9 comments · 8 min read · LW link
The prototypical catastrophic AI action is getting root access to its datacenter · Buck · Jun 2, 2022, 11:46 PM · 180 points · 13 comments · 2 min read · LW link · 1 review
Postmortem on DIY Recombinant Covid Vaccine · caffemacchiavelli · Jan 22, 2022, 2:12 PM · 179 points · 27 comments · 5 min read · LW link · 1 review
IMO challenge bet with Eliezer · paulfchristiano · Feb 26, 2022, 4:50 AM · 179 points · 26 comments · 3 min read · LW link
Some conceptual alignment research projects · Richard_Ngo · Aug 25, 2022, 10:51 PM · 177 points · 15 comments · 3 min read · LW link
AGI ruin scenarios are likely (and disjunctive) · So8res · Jul 27, 2022, 3:21 AM · 177 points · 38 comments · 6 min read · LW link
7 traps that (we think) new alignment researchers often fall into · Orpheus16 and Thomas Larsen · Sep 27, 2022, 11:13 PM · 176 points · 10 comments · 4 min read · LW link
Geometric Rationality is Not VNM Rational · Scott Garrabrant · Nov 27, 2022, 7:36 PM · 176 points · 27 comments · 3 min read · LW link
What AI Safety Materials Do ML Researchers Find Compelling? · Vael Gates and Collin · Dec 28, 2022, 2:03 AM · 175 points · 34 comments · 2 min read · LW link
The next decades might be wild · Marius Hobbhahn · Dec 15, 2022, 4:10 PM · 175 points · 42 comments · 41 min read · LW link · 1 review
Russia has Invaded Ukraine · lsusr · Feb 24, 2022, 7:52 AM · 174 points · 268 comments · 3 min read · LW link
Finite Factored Sets in Pictures · Magdalena Wache · Dec 11, 2022, 6:49 PM · 174 points · 35 comments · 12 min read · LW link
The inordinately slow spread of good AGI conversations in ML · Rob Bensinger · Jun 21, 2022, 4:09 PM · 173 points · 62 comments · 8 min read · LW link
What’s Up With Confusingly Pervasive Goal Directedness? · Raemon · Jan 20, 2022, 7:22 PM · 172 points · 89 comments · 4 min read · LW link
Announcing the Inverse Scaling Prize ($250k Prize Pool) · Ethan Perez, Ian McKenzie and Sam Bowman · Jun 27, 2022, 3:58 PM · 171 points · 14 comments · 7 min read · LW link
Decision theory does not imply that we get to have nice things · So8res · Oct 18, 2022, 3:04 AM · 171 points · 73 comments · 26 min read · LW link · 2 reviews
Transcripts of interviews with AI researchers · Vael Gates · May 9, 2022, 5:57 AM · 170 points · 9 comments · 2 min read · LW link
Do bamboos set themselves on fire? · Malmesbury · Sep 19, 2022, 3:34 PM · 170 points · 14 comments · 6 min read · LW link · 1 review
Using GPT-Eliezer against ChatGPT Jailbreaking · Stuart_Armstrong and rgorman · Dec 6, 2022, 7:54 PM · 170 points · 85 comments · 9 min read · LW link
Six (and a half) intuitions for KL divergence · CallumMcDougall · Oct 12, 2022, 9:07 PM · 170 points · 27 comments · 10 min read · LW link · 1 review · (www.perfectlynormal.co.uk)
AI Could Defeat All Of Us Combined · HoldenKarnofsky · Jun 9, 2022, 3:50 PM · 170 points · 42 comments · 17 min read · LW link · (www.cold-takes.com)
Searching for outliers · benkuhn · Mar 21, 2022, 2:40 AM · 169 points · 16 comments · 18 min read · LW link · 1 review · (www.benkuhn.net)
Planes are still decades away from displacing most bird jobs · guzey · Nov 25, 2022, 4:49 PM · 168 points · 13 comments · 3 min read · LW link
Impossibility results for unbounded utilities · paulfchristiano · Feb 2, 2022, 3:52 AM · 167 points · 109 comments · 8 min read · LW link · 1 review
Shard Theory: An Overview · David Udell · Aug 11, 2022, 5:44 AM · 166 points · 34 comments · 10 min read · LW link
Things that can kill you quickly: What everyone should know about first aid · jasoncrawford · Dec 27, 2022, 4:23 PM · 166 points · 21 comments · 2 min read · LW link · 1 review · (jasoncrawford.org)
Playing with DALL·E 2 · Dave Orr · Apr 7, 2022, 6:49 PM · 166 points · 118 comments · 6 min read · LW link
[Beta Feature] Google-Docs-like editing for LessWrong posts · Ruby and jimrandomh · Feb 23, 2022, 1:52 AM · 165 points · 26 comments · 3 min read · LW link
The Social Recession: By the Numbers · antonomon · Oct 29, 2022, 6:45 PM · 165 points · 29 comments · 8 min read · LW link · (novum.substack.com)
Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon · johnswentworth · Apr 15, 2022, 7:05 PM · 165 points · 128 comments · 5 min read · LW link
On A List of Lethalities · Zvi · Jun 13, 2022, 12:30 PM · 165 points · 50 comments · 54 min read · LW link · 1 review · (thezvi.wordpress.com)
Deepmind’s Gato: Generalist Agent · Daniel Kokotajlo · May 12, 2022, 4:01 PM · 165 points · 62 comments · 1 min read · LW link
Most People Start With The Same Few Bad Ideas · johnswentworth · Sep 9, 2022, 12:29 AM UTC · 165 points · 30 comments · 3 min read · LW link
Why I think there’s a one-in-six chance of an imminent global nuclear war · Max Tegmark · Oct 8, 2022, 6:26 AM UTC · 164 points · 169 comments · 4 min read · LW link
A transparency and interpretability tech tree · evhub · Jun 16, 2022, 11:44 PM UTC · 163 points · 11 comments · 18 min read · LW link · 1 review
The Onion Test for Personal and Institutional Honesty · chanamessinger and Andrew_Critch · Sep 27, 2022, 3:26 PM UTC · 163 points · 31 comments · 3 min read · LW link · 3 reviews
Logical induction for software engineers · Alex Flint · Dec 3, 2022, 7:55 PM UTC · 163 points · 8 comments · 27 min read · LW link · 1 review
Be less scared of overconfidence · benkuhn · Nov 30, 2022, 3:20 PM UTC · 163 points · 22 comments · 9 min read · LW link · (www.benkuhn.net)
A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2] · Dan H and TW123 · May 9, 2022, 5:18 PM UTC · 163 points · 8 comments · 35 min read · LW link
ITT-passing and civility are good; “charity” is bad; steelmanning is niche · Rob Bensinger · Jul 5, 2022, 12:15 AM UTC · 163 points · 36 comments · 6 min read · LW link · 1 review
Threat-Resistant Bargaining Megapost: Introducing the ROSE Value · Diffractor · Sep 28, 2022, 1:20 AM UTC · 162 points · 19 comments · 53 min read · LW link · 2 reviews
Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · johnswentworth · Jun 4, 2022, 5:41 AM UTC · 160 points · 55 comments · 2 min read · LW link · 1 review
[Intro to brain-like-AGI safety] 1. What’s the problem & Why work on it now? · Steven Byrnes · Jan 26, 2022, 3:23 PM UTC · 159 points · 19 comments · 26 min read · LW link
The Geometric Expectation · Scott Garrabrant · Nov 23, 2022, 6:05 PM UTC · 159 points · 22 comments · 4 min read · LW link
Godzilla Strategies · johnswentworth · Jun 11, 2022, 3:44 PM UTC · 159 points · 72 comments · 3 min read · LW link
Repeal the Foreign Dredge Act of 1906 · Zvi · May 5, 2022, 3:20 PM UTC · 159 points · 16 comments · 19 min read · LW link · (thezvi.wordpress.com)
Why all the fuss about recursive self-improvement? · So8res · Jun 12, 2022, 8:53 PM UTC · 158 points · 62 comments · 7 min read · LW link · 1 review