Page 2
Playing Without Affordances (Alex Hollow, Aug 18, 2022, 11:53 AM) · 11 points · 0 comments · 1 min read · LW link (alexhollow.wordpress.com)
Goal-directedness: relativising complexity (Morgan_Rogers, Aug 18, 2022, 9:48 AM) · 3 points · 0 comments · 11 min read · LW link
What’s up with the bad Meta projects? (Yitz, Aug 18, 2022, 5:34 AM) · 42 points · 29 comments · 1 min read · LW link
Announcing Encultured AI: Building a Video Game (Andrew_Critch and Nick Hay, Aug 18, 2022, 2:16 AM) · 103 points · 26 comments · 4 min read · LW link
Detroit ACX September Meetup (MattArnold, Aug 18, 2022, 12:48 AM) · 1 point · 0 comments · 1 min read · LW link
Matt Yglesias on AI Policy (Grant Demaree, Aug 17, 2022, 11:57 PM) · 25 points · 1 comment · 1 min read · LW link (www.slowboring.com)
Spoons and Myofascial Trigger Points (vitaliya, Aug 17, 2022, 10:54 PM) · 5 points · 3 comments · 1 min read · LW link
Concrete Advice for Forming Inside Views on AI Safety (Neel Nanda, Aug 17, 2022, 10:02 PM) · 30 points · 6 comments · 10 min read · LW link
Progress links and tweets, 2022-08-17 (jasoncrawford, Aug 17, 2022, 9:27 PM) · 11 points · 0 comments · 2 min read · LW link (rootsofprogress.org)
Conditioning, Prompts, and Fine-Tuning (Adam Jermyn, Aug 17, 2022, 8:52 PM) · 38 points · 9 comments · 4 min read · LW link
The Core of the Alignment Problem is... (Thomas Larsen, Jeremy Gillen and JamesH, Aug 17, 2022, 8:07 PM) · 76 points · 10 comments · 9 min read · LW link
[Question] Could the simulation argument also apply to dreams? (Nathan1123, Aug 17, 2022, 7:55 PM) · 6 points · 4 comments · 3 min read · LW link
Interpretability Tools Are an Attack Channel (Thane Ruthenis, Aug 17, 2022, 6:47 PM) · 42 points · 14 comments · 1 min read · LW link
Human Mimicry Mainly Works When We’re Already Close (johnswentworth, Aug 17, 2022, 6:41 PM) · 82 points · 16 comments · 5 min read · LW link
Thoughts on ‘List of Lethalities’ (Alex Lawsen, Aug 17, 2022, 6:33 PM) · 27 points · 0 comments · 10 min read · LW link
The longest training run (Jsevillamol, Tamay, Owen D and anson.ho, Aug 17, 2022, 5:18 PM) · 71 points · 12 comments · 9 min read · LW link (epochai.org)
Spoiler-Free Review: Across the Obelisk (Zvi, Aug 17, 2022, 2:30 PM) · 17 points · 0 comments · 6 min read · LW link (thezvi.wordpress.com)
Autonomy as taking responsibility for reference maintenance (Ramana Kumar, Aug 17, 2022, 12:50 PM) · 61 points · 3 comments · 5 min read · LW link
Duplicating Raspberry Pi Images (jefftk, Aug 17, 2022, 12:10 PM) · 9 points · 4 comments · 4 min read · LW link (www.jefftk.com)
ACX Meetup—Amsterdam (Pierre Vandenberghe, Aug 17, 2022, 9:56 AM) · 2 points · 1 comment · 1 min read · LW link
Insufficient awareness of how everything sucks (Flaglandbase, Aug 17, 2022, 8:01 AM) · −13 points · 5 comments · 1 min read · LW link
Mesa-optimization for goals defined only within a training environment is dangerous (Rubi J. Hudson, Aug 17, 2022, 3:56 AM) · 6 points · 2 comments · 4 min read · LW link
ACX / SSC Meetup Singapore (DG, Aug 17, 2022, 2:08 AM) · 2 points · 1 comment · 1 min read · LW link
That-time-of-year Astral Codex Ten Meetup (Ben Smith, Aug 17, 2022, 12:02 AM) · 3 points · 2 comments · 1 min read · LW link
SSC Reno Meetup (Steven, Aug 16, 2022, 11:37 PM) · 1 point · 3 comments · 1 min read · LW link
My thoughts on direct work (and joining LessWrong) (RobertM, Aug 16, 2022, 6:53 PM) · 58 points · 4 comments · 6 min read · LW link
We can make the future a million years from now go better [video] (Writer, Aug 16, 2022, 1:03 PM) · 7 points · 1 comment · 6 min read · LW link (youtu.be)
The Open Society and Its Enemies: Summary and Thoughts (matto, Aug 16, 2022, 11:44 AM) · 12 points · 4 comments · 17 min read · LW link
An introduction to signalling theory (Mvolz, Aug 16, 2022, 9:37 AM) · 17 points · 1 comment · 5 min read · LW link
Understanding differences between humans and intelligence-in-general to build safe AGI (Florian_Dietz, Aug 16, 2022, 8:27 AM) · 7 points · 8 comments · 1 min read · LW link
Against population ethics (jasoncrawford, Aug 16, 2022, 5:19 AM) · 29 points · 39 comments · 3 min read · LW link
Deception as the optimal: mesa-optimizers and inner alignment (Eleni Angelou, Aug 16, 2022, 4:49 AM) · 11 points · 0 comments · 5 min read · LW link
Crowdsourcing Anki Decks (Arden, Aug 16, 2022, 2:53 AM) · 1 point · 0 comments · 1 min read · LW link
What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas. (NickyP, Peter S. Park and Stephen Fowler, Aug 16, 2022, 2:09 AM) · 21 points · 2 comments · 16 min read · LW link
Dwarves & D.Sci: Data Fortress Evaluation & Ruleset (aphyer, Aug 16, 2022, 12:15 AM) · 26 points · 10 comments · 8 min read · LW link
I’m mildly skeptical that blindness prevents schizophrenia (Steven Byrnes, Aug 15, 2022, 11:36 PM) · 83 points · 9 comments · 4 min read · LW link
What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? (johnswentworth, Aug 15, 2022, 10:48 PM) · 156 points · 18 comments · 10 min read · LW link
“What Mistakes Are You Making Right Now?” (David Udell, Aug 15, 2022, 9:19 PM) · 13 points · 2 comments · 1 min read · LW link
On Preference Manipulation in Reward Learning Processes (Felix Hofstätter, Aug 15, 2022, 7:32 PM) · 8 points · 0 comments · 4 min read · LW link
Cambist Booking: Discussing What We Value (Screwtape, Aug 15, 2022, 6:24 PM) · 5 points · 1 comment · 1 min read · LW link
Capital and inequality (NathanBarnard, Aug 15, 2022, 5:23 PM) · 7 points · 2 comments · 5 min read · LW link
[Question] Are there practical exercises for developing the Scout mindset? (ChristianKl, Aug 15, 2022, 5:23 PM) · 15 points · 2 comments · 1 min read · LW link
[Question] How do you get a job as a software developer? (lsusr, Aug 15, 2022, 2:45 PM) · 22 points · 24 comments · 1 min read · LW link
The Parable of the Boy Who Cried 5% Chance of Wolf (KatWoods, Aug 15, 2022, 2:33 PM) · 140 points · 24 comments · 2 min read · LW link
And the Revenues Are So Small (Zvi, Aug 15, 2022, 1:00 PM) · 19 points · 5 comments · 11 min read · LW link (thezvi.wordpress.com)
Extreme Security (lc, Aug 15, 2022, 12:11 PM) · 38 points · 6 comments · 5 min read · LW link
No shortcuts to knowledge: Why AI needs to ease up on scaling and learn how to code (Yldedly, Aug 15, 2022, 8:42 AM) · 5 points · 0 comments · 1 min read · LW link (deoxyribose.github.io)
Seeking Interns/RAs for Mechanistic Interpretability Projects (Neel Nanda, Aug 15, 2022, 7:11 AM) · 61 points · 0 comments · 2 min read · LW link
A Mechanistic Interpretability Analysis of Grokking (Neel Nanda and Tom Lieberum, Aug 15, 2022, 2:41 AM) · 373 points · 48 comments · 36 min read · 1 review · LW link (colab.research.google.com)
[Question] If a nuke is coming towards SF Bay can people bunker in BART tunnels? (Pee Doom, Aug 15, 2022, 1:56 AM) · 15 points · 2 comments · 1 min read · LW link