Gliders in Language Models
Alexandre Variengien
Nov 25, 2022, 12:38 AM
30
points
11
comments
10
min read
LW
link
On Kelly and altruism
philh
Nov 24, 2022, 11:40 PM
17
points
6
comments
12
min read
LW
link
(reasonableapproximation.net)
Open technical problem: A Quinean proof of Löb’s theorem, for an easier cartoon guide
Andrew_Critch
Nov 24, 2022, 9:16 PM
58
points
35
comments
3
min read
LW
link
1
review
[Question]
Historical examples of people gaining unusual cognitive abilities?
Nicholas / Heather Kross
Nov 24, 2022, 7:01 PM
8
points
2
comments
1
min read
LW
link
Corrigibility Via Thought-Process Deference
Thane Ruthenis
Nov 24, 2022, 5:06 PM
17
points
5
comments
9
min read
LW
link
Geometric Exploration, Arithmetic Exploitation
Scott Garrabrant
Nov 24, 2022, 3:36 PM
126
points
5
comments
7
min read
LW
link
What I Learned Running Refine
adamShimi
Nov 24, 2022, 2:49 PM
108
points
5
comments
4
min read
LW
link
Covid 11/24/22: Thanks for Good Health
Zvi
Nov 24, 2022, 1:00 PM
26
points
4
comments
8
min read
LW
link
(thezvi.wordpress.com)
[Question]
Dumb and ill-posed question: Is conceptual research like this MIRI paper on the shutdown problem/Corrigibility “real”
joraine
Nov 24, 2022, 5:08 AM
26
points
11
comments
1
min read
LW
link
Clarifying wireheading terminology
leogao
Nov 24, 2022, 4:53 AM
66
points
6
comments
1
min read
LW
link
LW Beta Feature: Side-Comments
jimrandomh
Nov 24, 2022, 1:55 AM
103
points
47
comments
1
min read
LW
link
Against “Classic Style”
Cleo Nardo
Nov 23, 2022, 10:10 PM
67
points
30
comments
4
min read
LW
link
South Bay ACX/LW Meetup
IS
Nov 23, 2022, 10:05 PM
2
points
0
comments
1
min read
LW
link
Meme Dialects
jefftk
Nov 23, 2022, 9:30 PM
26
points
1
comment
2
min read
LW
link
(www.jefftk.com)
[Question]
When do you visualize (or not) while doing math?
Alex_Altair
Nov 23, 2022, 8:15 PM
21
points
9
comments
1
min read
LW
link
When AI solves a game, focus on the game’s mechanics, not its theme.
Cleo Nardo
Nov 23, 2022, 7:16 PM
89
points
7
comments
2
min read
LW
link
The Geometric Expectation
Scott Garrabrant
Nov 23, 2022, 6:05 PM
159
points
22
comments
4
min read
LW
link
“Far Coordination”
DragonGod
Nov 23, 2022, 5:14 PM
6
points
17
comments
9
min read
LW
link
Conjecture Second Hiring Round
Connor Leahy
,
Sid Black
,
Gabriel Alfour
and
Chris Scammell
Nov 23, 2022, 5:11 PM
92
points
0
comments
1
min read
LW
link
Conjecture: a retrospective after 8 months of work
Connor Leahy
,
Sid Black
,
Gabriel Alfour
and
Chris Scammell
Nov 23, 2022, 5:10 PM
180
points
9
comments
8
min read
LW
link
Against a General Factor of Doom
Jeffrey Heninger
Nov 23, 2022, 4:50 PM
61
points
19
comments
4
min read
LW
link
1
review
(aiimpacts.org)
Injecting some numbers into the AGI debate—by Boaz Barak
Jsevillamol
Nov 23, 2022, 4:10 PM
12
points
0
comments
3
min read
LW
link
(windowsontheory.org)
Notes on an Experiment with Markets
Jeffrey Heninger
Nov 23, 2022, 4:10 PM
8
points
0
comments
4
min read
LW
link
(aiimpacts.org)
Announcing AI safety Mentors and Mentees
Marius Hobbhahn
Nov 23, 2022, 3:21 PM
62
points
7
comments
10
min read
LW
link
Ex nihilo
Hopkins Stanley
Nov 23, 2022, 2:38 PM
1
point
0
comments
1
min read
LW
link
Human-level Diplomacy was my fire alarm
Lao Mein
Nov 23, 2022, 10:05 AM
54
points
15
comments
3
min read
LW
link
Sets of objectives for a multi-objective RL agent to optimize
Ben Smith
and
Roland Pihlakas
Nov 23, 2022, 6:49 AM
13
points
0
comments
8
min read
LW
link
Simulators, constraints, and goal agnosticism: porbynotes vol. 1
porby
Nov 23, 2022, 4:22 AM
37
points
2
comments
35
min read
LW
link
Rationalist Town Hall: FTX Fallout Edition (RSVP Required)
Ben Pace
Nov 23, 2022, 1:38 AM
43
points
13
comments
2
min read
LW
link
Feeling Old: Leaving your 20s in the 2020s
squidious
Nov 22, 2022, 10:50 PM
37
points
3
comments
1
min read
LW
link
(opalsandbonobos.blogspot.com)
Brute-forcing the universe: a non-standard shot at diamond alignment
Martín Soto
Nov 22, 2022, 10:36 PM
9
points
2
comments
20
min read
LW
link
Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility
Orpheus16
and
OliviaJ
Nov 22, 2022, 10:19 PM
73
points
20
comments
4
min read
LW
link
ACX Zurich November Meetup
MB
Nov 22, 2022, 9:41 PM
1
point
0
comments
1
min read
LW
link
Human-level Full-Press Diplomacy (some bare facts).
Cleo Nardo
Nov 22, 2022, 8:59 PM
50
points
7
comments
3
min read
LW
link
[Question]
How does late-2022 COVID transmissibility drop over time?
Daniel Dewey
Nov 22, 2022, 7:54 PM
8
points
2
comments
1
min read
LW
link
AI will change the world, but won’t take it over by playing “3-dimensional chess”.
boazbarak
and
benedelman
Nov 22, 2022, 6:57 PM
134
points
97
comments
24
min read
LW
link
Progress links and tweets, 2022-11-22
jasoncrawford
Nov 22, 2022, 5:39 PM
17
points
0
comments
1
min read
LW
link
(rootsofprogress.org)
Tyranny of the Epistemic Majority
Scott Garrabrant
Nov 22, 2022, 5:19 PM
192
points
13
comments
9
min read
LW
link
1
review
A Walkthrough of In-Context Learning and Induction Heads (w/ Charles Frye) Part 1 of 2
Neel Nanda
Nov 22, 2022, 5:12 PM
20
points
0
comments
1
min read
LW
link
(www.youtube.com)
Simple Improvement to College Football Overtime Rules
Zvi
Nov 22, 2022, 5:00 PM
10
points
0
comments
1
min read
LW
link
(thezvi.wordpress.com)
Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue)
Jacy Reese Anthis
Nov 22, 2022, 4:50 PM
93
points
64
comments
1
min read
LW
link
(www.science.org)
Austin LW meetup notes: The FTX Affair
jchan
Nov 22, 2022, 2:01 PM
20
points
3
comments
16
min read
LW
link
Motivated Cognition and the Multiverse of Truth
Q Home
Nov 22, 2022, 12:51 PM
8
points
16
comments
24
min read
LW
link
LessWrong readers are invited to apply to the Lurkshop
Jonas V
and
GradientDissenter
Nov 22, 2022, 9:19 AM
101
points
41
comments
3
min read
LW
link
Gaoxing Guy
Alok Singh
Nov 22, 2022, 1:50 AM
3
points
1
comment
1
min read
LW
link
(alok.github.io)
Miscellaneous First-Pass Alignment Thoughts
NickGabs
Nov 21, 2022, 9:23 PM
12
points
4
comments
10
min read
LW
link
[Hebbian Natural Abstractions] Introduction
Samuel Nellessen
and
Jan
Nov 21, 2022, 8:34 PM
34
points
3
comments
4
min read
LW
link
(www.snellessen.com)
Utilitarianism Meets Egalitarianism
Scott Garrabrant
Nov 21, 2022, 7:00 PM
121
points
16
comments
6
min read
LW
link
1
review
Interview with Matt Freeman
Evenflair
Nov 21, 2022, 6:17 PM
15
points
0
comments
1
min read
LW
link
(overcast.fm)
Here’s the exit.
Valentine
Nov 21, 2022, 6:07 PM
115
points
180
comments
10
min read
LW
link
5
reviews