Page 2
The Slippery Slope from DALLE-2 to Deepfake Anarchy · scasper · Nov 5, 2022, 2:53 PM · 17 points · 9 comments · 11 min read
When can a mimic surprise you? Why generative models handle seemingly ill-posed problems · David Johnston · Nov 5, 2022, 1:19 PM · 8 points · 4 comments · 16 min read
My summary of “Pragmatic AI Safety” · Eleni Angelou · Nov 5, 2022, 12:54 PM · 3 points · 0 comments · 5 min read
Review of the Challenge · SD Marlow · Nov 5, 2022, 6:38 AM · −14 points · 5 comments · 2 min read
Spectrum of Independence · jefftk · Nov 5, 2022, 2:40 AM · 43 points · 7 comments · 1 min read · (www.jefftk.com)
[paper link] Interpreting systems as solving POMDPs: a step towards a formal understanding of agency · the gears to ascension · Nov 5, 2022, 1:06 AM · 13 points · 2 comments · 1 min read · (www.semanticscholar.org)
Metaculus is seeking Software Engineers · dschwarz · Nov 5, 2022, 12:42 AM · 18 points · 0 comments · 1 min read · (apply.workable.com)
Should we “go against nature”? · jasoncrawford · Nov 4, 2022, 10:14 PM · 10 points · 3 comments · 2 min read · (rootsofprogress.org)
How much should we care about non-human animals? · bokov · Nov 4, 2022, 9:36 PM · 16 points · 8 comments · 2 min read · (www.lesswrong.com)
For ELK truth is mostly a distraction · c.trout · Nov 4, 2022, 9:14 PM · 44 points · 0 comments · 21 min read
Toy Models and Tegum Products · Adam Jermyn · Nov 4, 2022, 6:51 PM · 28 points · 7 comments · 5 min read
Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement · Michaël Trazzi and Ethan Caballero · Nov 4, 2022, 6:09 PM · 16 points · 11 comments · 10 min read · (theinsideview.ai)
Follow up to medical miracle · Elizabeth · Nov 4, 2022, 6:00 PM · 76 points · 5 comments · 6 min read · (acesounderglass.com)
Cross-Void Optimization · pneumynym · Nov 4, 2022, 5:47 PM · 1 point · 1 comment · 8 min read
Monthly Shorts 10/22 · Celer · Nov 4, 2022, 4:30 PM · 12 points · 0 comments · 6 min read · (keller.substack.com)
Weekly Roundup #4 · Zvi · Nov 4, 2022, 3:00 PM · 42 points · 1 comment · 6 min read · (thezvi.wordpress.com)
A new place to discuss cognitive science, ethics and human alignment · Daniel_Friedrich · Nov 4, 2022, 2:34 PM · 3 points · 4 comments
A newcomer’s guide to the technical AI safety field · zeshen · Nov 4, 2022, 2:29 PM · 42 points · 3 comments · 10 min read
[Question] Are alignment researchers devoting enough time to improving their research capacity? · Carson Jones · Nov 4, 2022, 12:58 AM · 13 points · 3 comments · 3 min read
[Question] Don’t you think RLHF solves outer alignment? · Charbel-Raphaël · Nov 4, 2022, 12:36 AM · 9 points · 23 comments · 1 min read
Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”) · David Scott Krueger (formerly: capybaralet) · Nov 3, 2022, 11:19 PM · 28 points · 3 comments · 1 min read
[Question] Could a Supreme Court suit work to solve NEPA problems? · ChristianKl · Nov 3, 2022, 9:10 PM · 15 points · 0 comments · 1 min read
[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament—Veritaserum · mako yass · Nov 3, 2022, 9:04 PM · 17 points · 1 comment
Further considerations on the Evidentialist’s Wager · Martín Soto · Nov 3, 2022, 8:06 PM · 3 points · 9 comments · 8 min read
AI as a Civilizational Risk Part 6/6: What can be done · PashaKamyshev · Nov 3, 2022, 7:48 PM · 2 points · 4 comments · 4 min read
A Mystery About High Dimensional Concept Encoding · Fabien Roger · Nov 3, 2022, 5:05 PM · 46 points · 13 comments · 7 min read
Why do we post our AI safety plans on the Internet? · Peter S. Park · Nov 3, 2022, 4:02 PM · 4 points · 4 comments · 11 min read
Multiple Deploy-Key Repos · jefftk · Nov 3, 2022, 3:10 PM · 15 points · 0 comments · 1 min read · (www.jefftk.com)
Covid 11/3/22: Asking Forgiveness · Zvi · Nov 3, 2022, 1:50 PM · 23 points · 3 comments · 6 min read · (thezvi.wordpress.com)
Adversarial Policies Beat Professional-Level Go AIs · sanxiyn · Nov 3, 2022, 1:27 PM · 31 points · 35 comments · 1 min read · (goattack.alignmentfund.org)
K-types vs T-types — what priors do you have? · Cleo Nardo · Nov 3, 2022, 11:29 AM · 74 points · 25 comments · 7 min read
Information Markets 2: Optimally Shaped Reward Bets · eva_ · Nov 3, 2022, 11:08 AM · 9 points · 0 comments · 3 min read
The Rational Utilitarian Love Movement (A Historical Retrospective) · Caleb Biddulph · Nov 3, 2022, 7:11 AM · 3 points · 0 comments
The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter · mako yass · Nov 3, 2022, 6:47 AM · 30 points · 13 comments · 10 min read
Open Letter Against Reckless Nuclear Escalation and Use · Max Tegmark · Nov 3, 2022, 5:34 AM · 27 points · 25 comments · 1 min read
Lazy Python Argument Parsing · jefftk · Nov 3, 2022, 2:20 AM · 20 points · 3 comments · 1 min read · (www.jefftk.com)
AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk · PashaKamyshev · Nov 3, 2022, 2:19 AM · 2 points · 0 comments · 7 min read
[Question] Is there a good way to award a fixed prize in a prediction contest? · jchan · Nov 2, 2022, 9:37 PM · 18 points · 5 comments · 1 min read
“Are Experiments Possible?” Seeds of Science call for reviewers · rogersbacon · Nov 2, 2022, 8:05 PM · 8 points · 0 comments · 1 min read
Humans do acausal coordination all the time · Adam Jermyn · Nov 2, 2022, 2:40 PM · 57 points · 35 comments · 3 min read
Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) · Davidmanheim · Nov 2, 2022, 12:57 PM · 73 points · 27 comments · 4 min read · (twitter.com)
Housing and Transit Thoughts #1 · Zvi · Nov 2, 2022, 12:10 PM · 35 points · 5 comments · 16 min read · (thezvi.wordpress.com)
Mind is uncountable · Filip Sondej · Nov 2, 2022, 11:51 AM · 18 points · 22 comments
AI Safety Needs Great Product Builders · goodgravy · Nov 2, 2022, 11:33 AM · 14 points · 2 comments
Why is fiber good for you? · braces · Nov 2, 2022, 2:04 AM · 18 points · 2 comments · 2 min read
Information Markets · eva_ · Nov 2, 2022, 1:24 AM · 46 points · 6 comments · 12 min read
Sequence Reread: Fake Beliefs [plus sequence spotlight meta] · Raemon · Nov 2, 2022, 12:09 AM · 27 points · 3 comments · 1 min read
Real-Time Research Recording: Can a Transformer Re-Derive Positional Info? · Neel Nanda · Nov 1, 2022, 11:56 PM · 69 points · 16 comments · 1 min read · (youtu.be)
All AGI Safety questions welcome (especially basic ones) [~monthly thread] · Robert Miles · Nov 1, 2022, 11:23 PM · 68 points · 105 comments · 2 min read
[Question] Which Issues in Conceptual Alignment have been Formalised or Observed (or not)? · ojorgensen · Nov 1, 2022, 10:32 PM · 4 points · 0 comments · 1 min read