AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 5 Jun 2022 22:05 UTC | 921 points | 704 comments | 30 min read | 3 reviews
Where I agree and disagree with Eliezer | paulfchristiano | 19 Jun 2022 19:15 UTC | 890 points | 223 comments | 18 min read | 2 reviews
Eight Short Studies On Excuses | Scott Alexander | 20 Apr 2010 23:01 UTC | 839 points | 253 comments | 10 min read
Preface | Eliezer Yudkowsky | 11 Mar 2015 19:00 UTC | 791 points | 15 comments | 4 min read
The Best Textbooks on Every Subject | lukeprog | 16 Jan 2011 8:30 UTC | 751 points | 414 comments | 7 min read
SolidGoldMagikarp (plus, prompt generation) | Jessica Rumbelow and mwatkins | 5 Feb 2023 22:02 UTC | 677 points | 206 comments | 12 min read | 1 review
What an actually pessimistic containment strategy looks like | lc | 5 Apr 2022 0:19 UTC | 676 points | 138 comments | 6 min read | 2 reviews
The Waluigi Effect (mega-post) | Cleo Nardo | 3 Mar 2023 3:22 UTC | 632 points | 188 comments | 16 min read
Simulators | janus | 2 Sep 2022 12:45 UTC | 614 points | 168 comments | 41 min read | 8 reviews | (generative.ink)
Schelling fences on slippery slopes | Scott Alexander | 16 Mar 2012 23:44 UTC | 599 points | 250 comments | 6 min read
(The) Lightcone is nothing without its people: LW + Lighthaven’s big fundraiser | habryka | 30 Nov 2024 2:55 UTC | 596 points | 243 comments | 42 min read
Rationalism before the Sequences | Eric Raymond | 30 Mar 2021 14:04 UTC | 594 points | 83 comments | 11 min read | 2 reviews
Making Vaccine | johnswentworth | 3 Feb 2021 20:24 UTC | 579 points | 249 comments | 6 min read | 3 reviews
LessWrong’s (first) album: I Have Been A Good Bing | habryka and kave | 1 Apr 2024 7:33 UTC | 566 points | 179 comments | 11 min read
Humans are not automatically strategic | AnnaSalamon | 8 Sep 2010 7:02 UTC | 561 points | 278 comments | 4 min read
Let’s think about slowing down AI | KatjaGrace | 22 Dec 2022 17:40 UTC | 551 points | 182 comments | 38 min read | 3 reviews | (aiimpacts.org)
Pain is not the unit of Effort | alkjash | 24 Nov 2020 20:00 UTC | 550 points | 90 comments | 5 min read | 2 reviews | (radimentary.wordpress.com)
Diseased thinking: dissolving questions about disease | Scott Alexander | 30 May 2010 21:16 UTC | 530 points | 356 comments | 9 min read
What 2026 looks like | Daniel Kokotajlo | 6 Aug 2021 16:14 UTC | 525 points | 156 comments | 16 min read | 1 review
OpenAI Email Archives (from Musk v. Altman and OpenAI blog) | habryka | 16 Nov 2024 6:38 UTC | 523 points | 80 comments | 51 min read
The Talk: a brief explanation of sexual dimorphism | Malmesbury | 18 Sep 2023 16:23 UTC | 508 points | 75 comments | 16 min read | 3 reviews
The Redaction Machine | Ben | 20 Sep 2022 22:03 UTC | 502 points | 48 comments | 27 min read | 1 review
Reason as memetic immune disorder | PhilGoetz | 19 Sep 2009 21:05 UTC | 500 points | 185 comments | 5 min read
Luck based medicine: my resentful story of becoming a medical miracle | Elizabeth | 16 Oct 2022 17:40 UTC | 485 points | 121 comments | 12 min read | 3 reviews | (acesounderglass.com)
How much do you believe your results? | Eric Neyman | 6 May 2023 20:31 UTC | 483 points | 17 comments | 15 min read | 3 reviews | (ericneyman.wordpress.com)
Making Beliefs Pay Rent (in Anticipated Experiences) | Eliezer Yudkowsky | 28 Jul 2007 22:59 UTC | 478 points | 267 comments | 4 min read
Losing the root for the tree | Adam Zerner | 20 Sep 2022 4:53 UTC | 475 points | 31 comments | 9 min read | 1 review
Being the (Pareto) Best in the World | johnswentworth | 24 Jun 2019 18:36 UTC | 462 points | 60 comments | 3 min read | 3 reviews
How To Write Quickly While Maintaining Epistemic Rigor | johnswentworth | 28 Aug 2021 17:52 UTC | 453 points | 38 comments | 4 min read | 3 reviews
Alignment Faking in Large Language Models | ryan_greenblatt, evhub, Carson Denison, Benjamin Wright, Fabien Roger, Monte M, Sam Marks, Johannes Treutlein, Sam Bowman and Buck | 18 Dec 2024 17:19 UTC | 451 points | 53 comments | 10 min read
100 Tips for a Better Life | Ideopunk | 22 Dec 2020 14:30 UTC | 450 points | 130 comments | 9 min read | 1 review
Counter-theses on Sleep | Natália | 21 Mar 2022 23:21 UTC | 446 points | 135 comments | 15 min read | 1 review
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible | GeneSmith and kman | 12 Dec 2023 18:14 UTC | 443 points | 198 comments | 33 min read | 1 review
It’s Probably Not Lithium | Natália | 28 Jun 2022 21:24 UTC | 442 points | 187 comments | 28 min read | 1 review
Focus on the places where you feel shocked everyone’s dropping the ball | So8res | 2 Feb 2023 0:27 UTC | 440 points | 63 comments | 4 min read | 3 reviews
I would have shit in that alley, too | Declan Molony | 18 Jun 2024 4:41 UTC | 440 points | 135 comments | 4 min read
The ants and the grasshopper | Richard_Ngo | 4 Jun 2023 22:00 UTC | 440 points | 38 comments | 5 min read | 2 reviews | (www.narrativeark.xyz)
Welcome to LessWrong! | Ruby, Raemon, RobertM and habryka | 14 Jun 2019 19:42 UTC | 437 points | 63 comments | 2 min read
Steering GPT-2-XL by adding an activation vector | TurnTrout, Monte M, David Udell, lisathiergart and Ulisse Mini | 13 May 2023 18:42 UTC | 437 points | 98 comments | 50 min read | 1 review
Generalizing From One Example | Scott Alexander | 28 Apr 2009 22:00 UTC | 437 points | 423 comments | 6 min read
Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)? | gwern | 3 Jul 2023 0:48 UTC | 425 points | 54 comments | 7 min read | (www.youtube.com)
Bets, Bonds, and Kindergarteners | jefftk | 3 Jan 2021 21:20 UTC | 421 points | 35 comments | 2 min read | 1 review | (www.jefftk.com)
The noncentral fallacy—the worst argument in the world? | Scott Alexander | 27 Aug 2012 3:36 UTC | 421 points | 1,768 comments | 7 min read
chinchilla’s wild implications | nostalgebraist | 31 Jul 2022 1:18 UTC | 420 points | 128 comments | 10 min read | 1 review
What failure looks like | paulfchristiano | 17 Mar 2019 20:18 UTC | 419 points | 54 comments | 8 min read | 2 reviews
Things I Learned by Spending Five Thousand Hours In Non-EA Charities | jenn | 1 Jun 2023 20:48 UTC | 416 points | 35 comments | 8 min read | 1 review | (jenn.site)
(My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen and elifland | 29 Aug 2022 1:23 UTC | 413 points | 90 comments | 37 min read | 1 review
Transformers Represent Belief State Geometry in their Residual Stream | Adam Shai | 16 Apr 2024 21:16 UTC | 412 points | 100 comments | 12 min read
Failures in Kindness | silentbob | 26 Mar 2024 21:30 UTC | 411 points | 60 comments | 9 min read
GPTs are Predictors, not Imitators | Eliezer Yudkowsky | 8 Apr 2023 19:59 UTC | 409 points | 99 comments | 3 min read | 3 reviews