Scaling prediction markets with meta-markets · Dentosal · Oct 10, 2024, 9:17 PM · 1 point · 0 comments · 2 min read
Startup Success Rates Are So Low Because the Rewards Are So Large · AppliedDivinityStudies · Oct 10, 2024, 8:22 PM · 42 points · 6 comments · 2 min read
Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark · ChristianWilliams · Oct 10, 2024, 6:58 PM · 50 points · 2 comments · 1 min read · (www.metaculus.com)
Rationality Quotes—Fall 2024 · Screwtape · Oct 10, 2024, 6:37 PM · 79 points · 26 comments · 1 min read
[Question] why won’t this alignment plan work? · KvmanThinking · Oct 10, 2024, 3:44 PM · 8 points · 7 comments · 1 min read
AI #85: AI Wins the Nobel Prize · Zvi · Oct 10, 2024, 1:40 PM · 30 points · 6 comments · 31 min read · (thezvi.wordpress.com)
Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming · Buck · Oct 10, 2024, 1:36 PM · 100 points · 4 comments · 13 min read
Joshua Achiam Public Statement Analysis · Zvi · Oct 10, 2024, 12:50 PM · 73 points · 14 comments · 21 min read · (thezvi.wordpress.com)
Do you want to do a debate on youtube? I’m looking for polite, truth-seeking participants. · Nathan Young · Oct 10, 2024, 9:32 AM · 12 points · 0 comments · 1 min read
Rationalist Gnosticism · tailcalled · Oct 10, 2024, 9:06 AM · 9 points · 10 comments · 3 min read
The deepest atheist: Sam Altman · Trey Edwin · Oct 10, 2024, 3:27 AM · 14 points · 2 comments · 4 min read
Values Are Real Like Harry Potter · johnswentworth and David Lorell · Oct 9, 2024, 11:42 PM · 83 points · 20 comments · 5 min read
Momentum of Light in Glass · Ben · Oct 9, 2024, 8:19 PM · 143 points · 44 comments · 11 min read
vgillioz’s Shortform · vgillioz · Oct 9, 2024, 7:31 PM · 1 point · 2 comments · 1 min read
Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models · Javier Marin Valenzuela · Oct 9, 2024, 7:14 PM · 3 points · 0 comments · 10 min read
Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan · adamShimi · Oct 9, 2024, 7:13 PM · 8 points · 0 comments · 6 min read · (formethods.substack.com)
Scaffolding for “Noticing Metacognition” · Raemon · Oct 9, 2024, 5:54 PM · 80 points · 4 comments · 17 min read
Safe Predictive Agents with Joint Scoring Rules · Rubi J. Hudson · Oct 9, 2024, 4:38 PM · 55 points · 10 comments · 17 min read
Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes · Anna Gajdova · Oct 9, 2024, 12:56 PM · 48 points · 14 comments · 1 min read
Humans are (mostly) metarational · Yair Halberstadt · Oct 9, 2024, 5:51 AM · 14 points · 6 comments · 3 min read
[Job Ad] MATS is hiring! · Jana, LauraVaughan, yams, Christian Smith and Ryan Kidd · Oct 9, 2024, 2:17 AM · 10 points · 0 comments · 5 min read
Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead · Charlie Rogers-Smith · Oct 9, 2024, 12:04 AM · 11 points · 0 comments · 4 min read
AGI & Consciousness—Joscha Bach · Rahul Chand · Oct 8, 2024, 10:51 PM · 1 point · 0 comments · 10 min read
Video and transcript of presentation on Otherness and control in the age of AGI · Joe Carlsmith · Oct 8, 2024, 10:30 PM · 35 points · 1 comment · 27 min read
From seeded complexity to consciousness—yes, it’s all the same. · eschatail · Oct 8, 2024, 9:31 PM · −23 points · 0 comments · 2 min read
Limits of safe and aligned AI · Shivam · Oct 8, 2024, 9:30 PM · 2 points · 0 comments · 4 min read
[Question] What constitutes an infohazard? · K1r4d4rk.v1 · Oct 8, 2024, 9:29 PM · −4 points · 8 comments · 1 min read
[Question] What makes one a “rationalist”? · mathyouf · Oct 8, 2024, 8:25 PM · 7 points · 5 comments · 3 min read
[Intuitive self-models] 4. Trance · Steven Byrnes · Oct 8, 2024, 1:30 PM · 75 points · 7 comments · 24 min read
Schelling game evaluations for AI control · Olli Järviniemi · Oct 8, 2024, 12:01 PM · 65 points · 5 comments · 11 min read
Thinking About a Pedalboard · jefftk · Oct 8, 2024, 11:50 AM · 9 points · 2 comments · 1 min read · (www.jefftk.com)
Overview of strong human intelligence amplification methods · TsviBT · Oct 8, 2024, 8:37 AM · 271 points · 142 comments · 10 min read
Near-death experiences · Declan Molony · Oct 8, 2024, 6:34 AM · 3 points · 1 comment · 2 min read
The unreasonable effectiveness of plasmid sequencing as a service · Abhishaike Mahajan · Oct 8, 2024, 2:02 AM · 23 points · 2 comments · 13 min read · (www.owlposting.com)
There is a globe in your LLM · jacob_drori · Oct 8, 2024, 12:43 AM · 86 points · 4 comments · 1 min read
MATS AI Safety Strategy Curriculum v2 · DanielFilan and Ryan Kidd · Oct 7, 2024, 10:44 PM · 42 points · 6 comments · 13 min read
2025 Color Trends · sarahconstantin · Oct 7, 2024, 9:20 PM · 40 points · 7 comments · 6 min read · (sarahconstantin.substack.com)
Clarifying Alignment Fundamentals Through the Lens of Ontology · eternal/ephemera · Oct 7, 2024, 8:57 PM · 12 points · 4 comments · 24 min read
Ethics on Cosmic Scale, Outer Space Treaty, Directed Panspermia, Forwards-Contamination, Technology Assessment, Planetary Protection, and Fermi’s Paradox · MrFantastic · Oct 7, 2024, 8:56 PM · −12 points · 0 comments · 1 min read
Domain-specific SAEs · jacob_drori · Oct 7, 2024, 8:15 PM · 27 points · 0 comments · 5 min read
Metaculus Is Open Source · ChristianWilliams · Oct 7, 2024, 7:55 PM · 13 points · 0 comments · 1 min read · (www.metaculus.com)
Research update: Towards a Law of Iterated Expectations for Heuristic Estimators · Eric Neyman · Oct 7, 2024, 7:29 PM · 87 points · 2 comments · 22 min read
AI Model Registries: A Foundational Tool for AI Governance · Elliot Mckernon, Deric Cheng and Gwyn Glasser · Oct 7, 2024, 7:27 PM · 20 points · 1 comment · 4 min read · (www.convergenceanalysis.org)
Evaluating the truth of statements in a world of ambiguous language. · Hastings · Oct 7, 2024, 6:08 PM · 48 points · 19 comments · 2 min read
Advice for journalists · Nathan Young · Oct 7, 2024, 4:46 PM · 100 points · 53 comments · 9 min read · (nathanpmyoung.substack.com)
Time Efficient Resistance Training · romeostevensit · Oct 7, 2024, 3:15 PM · 42 points · 10 comments · 3 min read
A Narrow Path: a plan to deal with AI extinction risk · Andrea_Miotti, davekasten and Tolga · Oct 7, 2024, 1:02 PM · 73 points · 12 comments · 2 min read · (www.narrowpath.co)
Toy Models of Feature Absorption in SAEs · chanind, hrdkbhatnagar, TomasD and Joseph Bloom · Oct 7, 2024, 9:56 AM · 49 points · 8 comments · 10 min read
An argument that consequentialism is incomplete · cousin_it · Oct 7, 2024, 9:45 AM · 32 points · 27 comments · 1 min read
An X-Ray is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation · hugofry, Ahmed Abdulaal, NMontanaBrown and a-ijishakin · Oct 7, 2024, 8:53 AM · 38 points · 0 comments · 5 min read · (arxiv.org)