Proposed Orthogonality Theses #2-5
rjbg
Jul 14, 2022, 10:59 PM
8
points
0
comments
2
min read
LW
link
Better Quiddler
jefftk
Jul 14, 2022, 5:40 PM
17
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Circumventing interpretability: How to defeat mind-readers
Lee Sharkey
Jul 14, 2022, 4:59 PM
114
points
15
comments
33
min read
LW
link
Covid 7/14/22: BA.2.75 Plus Tax
Zvi
Jul 14, 2022, 2:40 PM
39
points
9
comments
8
min read
LW
link
(thezvi.wordpress.com)
Criticism of EA Criticism Contest
Zvi
Jul 14, 2022, 2:30 PM
108
points
17
comments
31
min read
LW
link
1
review
(thezvi.wordpress.com)
Humans provide an untapped wealth of evidence about alignment
TurnTrout
and
Quintin Pope
Jul 14, 2022, 2:31 AM
212
points
94
comments
9
min read
LW
link
1
review
[Question]
Wacky, risky, anti-inductive intelligence-enhancement methods?
Nicholas / Heather Kross
Jul 14, 2022, 1:40 AM
20
points
30
comments
1
min read
LW
link
[Question]
How to impress students with recent advances in ML?
Charbel-Raphaël
Jul 14, 2022, 12:03 AM
12
points
2
comments
1
min read
LW
link
Notes on Love
David Gross
Jul 13, 2022, 11:35 PM
18
points
3
comments
29
min read
LW
link
Deep learning curriculum for large language model alignment
Jacob_Hilton
Jul 13, 2022, 9:58 PM
57
points
3
comments
1
min read
LW
link
(github.com)
Artificial Sandwiching: When can we test scalable alignment protocols without humans?
Sam Bowman
Jul 13, 2022, 9:14 PM
42
points
6
comments
5
min read
LW
link
[Question]
Any tips for eliciting one’s own latent knowledge?
MSRayne
Jul 13, 2022, 9:12 PM
16
points
20
comments
2
min read
LW
link
Goal Alignment Is Robust To the Sharp Left Turn
Thane Ruthenis
Jul 13, 2022, 8:23 PM
43
points
16
comments
4
min read
LW
link
Making decisions using multiple worldviews
Richard_Ngo
Jul 13, 2022, 7:15 PM
50
points
10
comments
11
min read
LW
link
[Question]
App idea to help with reading STEM textbooks (feedback request)
DirectedEvolution
Jul 13, 2022, 6:28 PM
16
points
8
comments
2
min read
LW
link
MIRI Conversations: Technology Forecasting & Gradualism (Distillation)
CallumMcDougall
Jul 13, 2022, 3:55 PM
31
points
1
comment
20
min read
LW
link
Passing Up Pay
jefftk
Jul 13, 2022, 2:10 PM
29
points
8
comments
5
min read
LW
link
(www.jefftk.com)
[Question]
How could the universe be infinitely large?
amarai
Jul 13, 2022, 1:45 PM
0
points
8
comments
1
min read
LW
link
John von Neumann on how to safely progress with technology
Dalton Mabery
Jul 13, 2022, 11:07 AM
14
points
0
comments
1
min read
LW
link
Everyone is an Imposter
Tharin
Jul 13, 2022, 8:46 AM
19
points
1
comment
9
min read
LW
link
(echoesandchimes.com)
[Question]
Which AI Safety research agendas are the most promising?
Chris_Leong
Jul 13, 2022, 7:54 AM
27
points
5
comments
1
min read
LW
link
Straw-Steelmanning
Chris van Merwijk
Jul 13, 2022, 5:48 AM
29
points
2
comments
1
min read
LW
link
Alien Message Contest: Solution
DaemonicSigil
Jul 13, 2022, 4:07 AM
29
points
2
comments
4
min read
LW
link
[Question]
What is wrong with this approach to corrigibility?
Rafael Cosman
Jul 12, 2022, 10:55 PM
7
points
8
comments
1
min read
LW
link
Acceptability Verification: A Research Agenda
David Udell
and
evhub
Jul 12, 2022, 8:11 PM
50
points
0
comments
1
min read
LW
link
(docs.google.com)
Progress links and tweets, 2022-07-12
jasoncrawford
Jul 12, 2022, 3:30 PM
12
points
0
comments
1
min read
LW
link
(rootsofprogress.org)
Response to Blake Richards: AGI, generality, alignment, & loss functions
Steven Byrnes
Jul 12, 2022, 1:56 PM
62
points
9
comments
15
min read
LW
link
Three Minimum Pivotal Acts Possible by Narrow AI
Michael Soareverix
Jul 12, 2022, 9:51 AM
0
points
4
comments
2
min read
LW
link
Mosaic and Palimpsests: Two Shapes of Research
adamShimi
Jul 12, 2022, 9:05 AM
39
points
3
comments
9
min read
LW
link
[Question]
How do you concisely communicate & navigate the politics / culture at your job working at a large corporation or institution?
Willa
Jul 12, 2022, 3:22 AM
10
points
6
comments
1
min read
LW
link
On how various plans miss the hard bits of the alignment challenge
So8res
Jul 12, 2022, 2:49 AM
313
points
89
comments
29
min read
LW
link
3
reviews
Rainmaking
WalterL
Jul 12, 2022, 12:42 AM
26
points
5
comments
1
min read
LW
link
(www.youtube.com)
Book Review: Neal Stephenson’s “Termination Shock”
Tyler Simmons
Jul 12, 2022, 12:07 AM
13
points
0
comments
30
min read
LW
link
(www.words-and-dirt.com)
Announcing Future Forum—Apply Now
wANIEL
and
freemany
Jul 11, 2022, 10:57 PM
8
points
0
comments
4
min read
LW
link
(forum.effectivealtruism.org)
Defining Optimization in a Deeper Way Part 2
J Bostock
Jul 11, 2022, 8:29 PM
7
points
0
comments
4
min read
LW
link
Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments
Jeffrey Ladish
Jul 11, 2022, 7:38 PM
98
points
27
comments
6
min read
LW
link
1
review
Systemization
CFAR!Duncan
Jul 11, 2022, 6:39 PM
42
points
5
comments
12
min read
LW
link
[Question]
How do AI timelines affect how you live your life?
Quadratic Reciprocity
Jul 11, 2022, 1:54 PM
80
points
50
comments
1
min read
LW
link
Cambridge LW Meetup: Free Speech
Darmani
Jul 11, 2022, 4:36 AM
7
points
0
comments
1
min read
LW
link
Checksum Sensor Alignment
lsusr
Jul 11, 2022, 3:31 AM
12
points
2
comments
1
min read
LW
link
The Alignment Problem
lsusr
Jul 11, 2022, 3:03 AM
47
points
18
comments
3
min read
LW
link
Immanuel Kant and the Decision Theory App Store
Daniel Kokotajlo
Jul 10, 2022, 4:04 PM
92
points
12
comments
5
min read
LW
link
Metaculus is seeking experienced leaders, researchers & operators for high-impact roles
ChristianWilliams
Jul 10, 2022, 2:27 PM
9
points
0
comments
1
min read
LW
link
(apply.workable.com)
Avoid the abbreviation “FLOPs” – use “FLOP” or “FLOP/s” instead
Daniel_Eth
Jul 10, 2022, 10:44 AM
70
points
13
comments
1
min read
LW
link
My Opportunity Costs
abstractapplic
Jul 10, 2022, 10:14 AM
22
points
3
comments
3
min read
LW
link
Why Portland
Adam Zerner
Jul 10, 2022, 7:20 AM
25
points
18
comments
9
min read
LW
link
Hessian and Basin volume
Vivek Hebbar
Jul 10, 2022, 6:59 AM
35
points
10
comments
4
min read
LW
link
Taste & Shaping
CFAR!Duncan
Jul 10, 2022, 5:50 AM
67
points
1
comment
16
min read
LW
link
Comment on “Propositions Concerning Digital Minds and Society”
Zack_M_Davis
Jul 10, 2022, 5:48 AM
99
points
12
comments
8
min read
LW
link
Heaven: The last part of dystopia
Existism
Jul 9, 2022, 10:36 PM
−1
points
1
comment
6
min read
LW
link