Page 2
- AI Safety Cheatsheet / Quick Reference · Zohar Jackson · Jul 20, 2022, 9:39 AM · 3 points · 0 comments · 1 min read · LW link (github.com)
- Getting Unstuck on Counterfactuals · Chris_Leong · Jul 20, 2022, 5:31 AM · 7 points · 1 comment · 2 min read · LW link
- Pitfalls with Proofs · scasper · Jul 19, 2022, 10:21 PM · 19 points · 21 comments · 8 min read · LW link
- A daily routine I do for my AI safety research work · scasper · Jul 19, 2022, 9:58 PM · 22 points · 7 comments · 1 min read · LW link
- Progress links and tweets, 2022-07-19 · jasoncrawford · Jul 19, 2022, 8:50 PM · 11 points · 1 comment · 1 min read · LW link (rootsofprogress.org)
- Applications are open for CFAR workshops in Prague this fall! · John Steidley · Jul 19, 2022, 6:29 PM · 64 points · 3 comments · 2 min read · LW link
- Sexual Abuse attitudes might be infohazardous · Pseudonymous Otter · Jul 19, 2022, 6:06 PM · 256 points · 72 comments · 1 min read · LW link
- Spending Update 2022 · jefftk · Jul 19, 2022, 2:10 PM · 28 points · 0 comments · 3 min read · LW link (www.jefftk.com)
- Abram Demski’s ELK thoughts and proposal—distillation · Rubi J. Hudson · Jul 19, 2022, 6:57 AM · 19 points · 8 comments · 16 min read · LW link
- Bounded complexity of solving ELK and its implications · Rubi J. Hudson · Jul 19, 2022, 6:56 AM · 11 points · 4 comments · 18 min read · LW link
- Help ARC evaluate capabilities of current language models (still need people) · Beth Barnes · Jul 19, 2022, 4:55 AM · 95 points · 6 comments · 2 min read · LW link
- A Critique of AI Alignment Pessimism · ExCeph · Jul 19, 2022, 2:28 AM · 9 points · 1 comment · 9 min read · LW link
- Ars D&D.Sci: Mysteries of Mana Evaluation & Ruleset · aphyer · Jul 19, 2022, 2:06 AM · 33 points · 4 comments · 5 min read · LW link
- Marburg Virus Pandemic Prediction Checklist · DirectedEvolution · Jul 18, 2022, 11:15 PM · 30 points · 0 comments · 5 min read · LW link
- At what point will we know if Eliezer’s predictions are right or wrong? · anonymous123456 · Jul 18, 2022, 10:06 PM · 5 points · 6 comments · 1 min read · LW link
- Modelling Deception · Garrett Baker · Jul 18, 2022, 9:21 PM · 15 points · 0 comments · 7 min read · LW link
- Are Intelligence and Generality Orthogonal? · cubefox · Jul 18, 2022, 8:07 PM · 18 points · 16 comments · 1 min read · LW link
- Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · Jul 18, 2022, 7:06 PM · 368 points · 95 comments · 75 min read · LW link · 1 review
- Turning Some Inconsistent Preferences into Consistent Ones · niplav · Jul 18, 2022, 6:40 PM · 23 points · 5 comments · 12 min read · LW link
- Addendum: A non-magical explanation of Jeffrey Epstein · lc · Jul 18, 2022, 5:40 PM · 81 points · 21 comments · 11 min read · LW link
- Launching a new progress institute, seeking a CEO · jasoncrawford · Jul 18, 2022, 4:58 PM · 25 points · 2 comments · 3 min read · LW link (rootsofprogress.org)
- Machine Learning Model Sizes and the Parameter Gap [abridged] · Pablo Villalobos · Jul 18, 2022, 4:51 PM · 20 points · 0 comments · 1 min read · LW link (epochai.org)
- Quantilizers and Generative Models · Adam Jermyn · Jul 18, 2022, 4:32 PM · 24 points · 5 comments · 4 min read · LW link
- AI Hiroshima (Does A Vivid Example Of Destruction Forestall Apocalypse?) · Sable · Jul 18, 2022, 12:06 PM · 4 points · 4 comments · 2 min read · LW link
- How the ---- did Feynman Get Here !? · George3d6 · Jul 18, 2022, 9:43 AM · 8 points · 8 comments · 3 min read · LW link (www.epistem.ink)
- Conditioning Generative Models for Alignment · Jozdien · Jul 18, 2022, 7:11 AM · 60 points · 8 comments · 20 min read · LW link
- Training goals for large language models · Johannes Treutlein · Jul 18, 2022, 7:09 AM · 28 points · 5 comments · 19 min read · LW link
- A distillation of Evan Hubinger’s training stories (for SERI MATS) · Daphne_W · Jul 18, 2022, 3:38 AM · 15 points · 1 comment · 10 min read · LW link
- Forecasting ML Benchmarks in 2023 · jsteinhardt · Jul 18, 2022, 2:50 AM · 36 points · 20 comments · 12 min read · LW link (bounded-regret.ghost.io)
- What should you change in response to an “emergency”? And AI risk · AnnaSalamon · Jul 18, 2022, 1:11 AM · 339 points · 60 comments · 6 min read · LW link · 1 review
- Deception?! I ain’t got time for that! · Paul Colognese · Jul 18, 2022, 12:06 AM · 55 points · 5 comments · 13 min read · LW link
- How Interpretability can be Impactful · Connall Garrod · Jul 18, 2022, 12:06 AM · 18 points · 0 comments · 37 min read · LW link
- Why you might expect homogeneous take-off: evidence from ML research · Andrei Alexandru · Jul 17, 2022, 8:31 PM · 24 points · 0 comments · 10 min read · LW link
- Examples of AI Increasing AI Progress · TW123 · Jul 17, 2022, 8:06 PM · 107 points · 14 comments · 1 min read · LW link
- Four questions I ask AI safety researchers · Orpheus16 · Jul 17, 2022, 5:25 PM · 17 points · 0 comments · 1 min read · LW link
- Why I Think Abrupt AI Takeoff · lincolnquirk · Jul 17, 2022, 5:04 PM · 14 points · 6 comments · 1 min read · LW link
- Culture wars in riddle format · Malmesbury · Jul 17, 2022, 2:51 PM · 7 points · 28 comments · 3 min read · LW link
- Bangalore LW/ACX Meetup in person · Vyakart · Jul 17, 2022, 6:53 AM · 1 point · 0 comments · 1 min read · LW link
- Resolve Cycles · CFAR!Duncan · Jul 16, 2022, 11:17 PM · 140 points · 8 comments · 10 min read · LW link
- Alignment as Game Design · Shoshannah Tekofsky · Jul 16, 2022, 10:36 PM · 11 points · 7 comments · 2 min read · LW link
- Risk Management from a Climbers Perspective · Annapurna · Jul 16, 2022, 9:14 PM · 5 points · 0 comments · 6 min read · LW link (jorgevelez.substack.com)
- Cognitive Instability, Physicalism, and Free Will · dadadarren · Jul 16, 2022, 1:13 PM · 5 points · 27 comments · 2 min read · LW link (www.sleepingbeautyproblem.com)
- All AGI safety questions welcome (especially basic ones) [July 2022] · plex and Robert Miles · Jul 16, 2022, 12:57 PM · 84 points · 132 comments · 3 min read · LW link
- QNR Prospects · PeterMcCluskey · Jul 16, 2022, 2:03 AM UTC · 40 points · 3 comments · 8 min read · LW link (www.bayesianinvestor.com)
- To-do waves · Paweł Sysiak · Jul 16, 2022, 1:19 AM UTC · 3 points · 0 comments · 3 min read · LW link
- Moneypumping Bryan Caplan’s Belief in Free Will · Morpheus · Jul 16, 2022, 12:46 AM UTC · 5 points · 9 comments · 1 min read · LW link
- A summary of every “Highlights from the Sequences” post · Orpheus16 · Jul 15, 2022, 11:01 PM UTC · 98 points · 7 comments · 17 min read · LW link
- Safety Implications of LeCun’s path to machine intelligence · Ivan Vendrov · Jul 15, 2022, 9:47 PM UTC · 102 points · 18 comments · 6 min read · LW link
- Comfort Zone Exploration · CFAR!Duncan · Jul 15, 2022, 9:18 PM UTC · 51 points · 2 comments · 12 min read · LW link
- A time-invariant version of Laplace’s rule · Jsevillamol and Ege Erdil · Jul 15, 2022, 7:28 PM UTC · 72 points · 13 comments · 17 min read · LW link (epochai.org)