Seattle Winter Solstice · a7x · Dec 20, 2023, 8:30 PM · 6 points · 1 comment · 1 min read · LW link
How Would an Utopia-Maximizer Look Like? · Thane Ruthenis · Dec 20, 2023, 8:01 PM · 31 points · 23 comments · 10 min read · LW link
Succession · Richard_Ngo · Dec 20, 2023, 7:25 PM · 159 points · 48 comments · 11 min read · LW link (www.narrativeark.xyz)
Metaculus Introduces Multiple Choice Questions · ChristianWilliams · Dec 20, 2023, 7:00 PM · 4 points · 0 comments · 1 min read · LW link (www.metaculus.com)
Brighter Than Today Versions · jefftk · Dec 20, 2023, 6:20 PM · 16 points · 2 comments · 2 min read · LW link (www.jefftk.com)
Gaia Network: a practical, incremental pathway to Open Agency Architecture · Roman Leventov and Rafael Kaufmann Nedal · Dec 20, 2023, 5:11 PM · 22 points · 8 comments · 16 min read · LW link
On the future of language models · owencb · Dec 20, 2023, 4:58 PM · 105 points · 17 comments · 1 min read · LW link
[Valence series] Appendix A: Hedonic tone / (dis)pleasure / (dis)liking · Steven Byrnes · Dec 20, 2023, 3:54 PM · 18 points · 0 comments · 13 min read · LW link
Matrix completion prize results · paulfchristiano · Dec 20, 2023, 3:40 PM · 41 points · 0 comments · 2 min read · LW link (www.alignment.org)
[Question] What’s the minimal additive constant for Kolmogorov Complexity that a programming language can achieve? · Noosphere89 · Dec 20, 2023, 3:36 PM · 11 points · 15 comments · 1 min read · LW link
Legalize butanol? · bhauth · Dec 20, 2023, 2:24 PM · 39 points · 20 comments · 5 min read · LW link (www.bhauth.com)
A short dialogue on comparability of values · cousin_it · Dec 20, 2023, 2:08 PM · 27 points · 7 comments · 1 min read · LW link
Inside View, Outside View… And Opposing View · chaosmage · Dec 20, 2023, 12:35 PM · 21 points · 1 comment · 5 min read · LW link
Heuristics for preventing major life mistakes · SK2 · Dec 20, 2023, 8:01 AM · 28 points · 2 comments · 3 min read · LW link
What should be reified? · herschel · Dec 20, 2023, 4:52 AM · 4 points · 2 comments · 2 min read · LW link (brothernin.substack.com)
(In)appropriate (De)reification · herschel · Dec 20, 2023, 4:51 AM · 10 points · 1 comment · 4 min read · LW link (brothernin.substack.com)
Escaping Skeuomorphism · Stuart Johnson · Dec 20, 2023, 3:51 AM · 28 points · 0 comments · 8 min read · LW link
Ronny and Nate discuss what sorts of minds humanity is likely to find by Machine Learning · So8res and Ronny Fernandez · Dec 19, 2023, 11:39 PM · 40 points · 30 comments · 25 min read · LW link
[Question] What are the best Siderea posts? · mike_hawke · Dec 19, 2023, 11:07 PM · 17 points · 2 comments · 1 min read · LW link
Meaning & Agency · abramdemski · Dec 19, 2023, 10:27 PM · 91 points · 17 comments · 14 min read · LW link
s/acc: Safe Accelerationism Manifesto · lorepieri · Dec 19, 2023, 10:19 PM · −4 points · 5 comments · 2 min read · LW link (lorenzopieri.com)
Don’t Share Information Exfohazardous on Others’ AI-Risk Models · Thane Ruthenis · Dec 19, 2023, 8:09 PM · 66 points · 11 comments · 1 min read · LW link
Paper: Tell, Don’t Show- Declarative facts influence how LLMs generalize · Owain_Evans and AlexMeinke · Dec 19, 2023, 7:14 PM · 45 points · 4 comments · 6 min read · LW link (arxiv.org)
Interview: Applications w/ Alice Rigg · jacobhaimes · Dec 19, 2023, 7:03 PM · 12 points · 0 comments · 1 min read · LW link (into-ai-safety.github.io)
How does a toy 2 digit subtraction transformer predict the sign of the output? · Evan Anders · Dec 19, 2023, 6:56 PM · 14 points · 0 comments · 8 min read · LW link (evanhanders.blog)
Incremental AI Risks from Proxy-Simulations · kmenou · Dec 19, 2023, 6:56 PM · 2 points · 0 comments · 1 min read · LW link (individual.utoronto.ca)
A proposition for the modification of our epistemology · JacobBowden · Dec 19, 2023, 6:55 PM · −4 points · 2 comments · 4 min read · LW link
Goal-Completeness is like Turing-Completeness for AGI · Liron · Dec 19, 2023, 6:12 PM · 50 points · 26 comments · 3 min read · LW link
SociaLLM: proposal for a language model design for personalised apps, social science, and AI safety research · Roman Leventov · Dec 19, 2023, 4:49 PM · 17 points · 5 comments · 3 min read · LW link
Chording “The Next Right Thing” · jefftk · Dec 19, 2023, 3:40 PM · 11 points · 0 comments · 2 min read · LW link (www.jefftk.com)
Monthly Roundup #13: December 2023 · Zvi · Dec 19, 2023, 3:10 PM · 32 points · 5 comments · 26 min read · LW link (thezvi.wordpress.com)
Effective Aspersions: How the Nonlinear Investigation Went Wrong · TracingWoodgrains · Dec 19, 2023, 12:00 PM · 188 points · 171 comments · 1 min read · LW link · 1 review
A Universal Emergent Decomposition of Retrieval Tasks in Language Models · Alexandre Variengien and Eric Winsor · Dec 19, 2023, 11:52 AM · 84 points · 3 comments · 10 min read · LW link (arxiv.org)
Assessment of AI safety agendas: think about the downside risk · Roman Leventov · Dec 19, 2023, 9:00 AM · 13 points · 1 comment · 1 min read · LW link
Constellations are Younger than Continents · Jeffrey Heninger · Dec 19, 2023, 6:12 AM · 261 points · 22 comments · 2 min read · LW link
The Dark Arts · lsusr and Lyrongolem · Dec 19, 2023, 4:41 AM · 132 points · 49 comments · 9 min read · LW link
When scientists consider whether their research will end the world · Harlan · Dec 19, 2023, 3:47 AM · 30 points · 4 comments · 11 min read · LW link (blog.aiimpacts.org)
Is the far future inevitably zero sum? · Srdjan Miletic · Dec 19, 2023, 1:45 AM · 8 points · 2 comments · 2 min read · LW link (dissent.blog)
The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda · Cameron Berg, Judd Rosenblatt, AE Studio and Marc Carauleanu · Dec 18, 2023, 8:35 PM · 168 points · 21 comments · 12 min read · LW link
The Shortest Path Between Scylla and Charybdis · Thane Ruthenis · Dec 18, 2023, 8:08 PM · 50 points · 8 comments · 5 min read · LW link
OpenAI: Preparedness framework · Zach Stein-Perlman · Dec 18, 2023, 6:30 PM · 70 points · 23 comments · 4 min read · LW link (openai.com)
[Valence series] 5. “Valence Disorders” in Mental Health & Personality · Steven Byrnes · Dec 18, 2023, 3:26 PM · 43 points · 12 comments · 13 min read · LW link
Discussion: Challenges with Unsupervised LLM Knowledge Discovery · Seb Farquhar, Vikrant Varma, zac_kenton, gasteigerjo, Vlad Mikulik and Rohin Shah · Dec 18, 2023, 11:58 AM · 147 points · 21 comments · 10 min read · LW link
Interpreting the Learning of Deceit · RogerDearnaley · Dec 18, 2023, 8:12 AM · 30 points · 14 comments · 9 min read · LW link
Talk: “AI Would Be A Lot Less Alarming If We Understood Agents” · johnswentworth · Dec 17, 2023, 11:46 PM · 58 points · 3 comments · 1 min read · LW link (www.youtube.com)
∀: a story · Richard_Ngo · Dec 17, 2023, 10:42 PM · 37 points · 1 comment · 8 min read · LW link (www.narrativeark.xyz)
Reviving a 2015 MacBook · jefftk · Dec 17, 2023, 9:00 PM · 11 points · 0 comments · 1 min read · LW link (www.jefftk.com)
A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans · Thane Ruthenis · Dec 17, 2023, 8:28 PM · 29 points · 7 comments · 11 min read · LW link
The Limits of Artificial Consciousness: A Biology-Based Critique of Chalmers’ Fading Qualia Argument · Štěpán Los · Dec 17, 2023, 7:11 PM · −6 points · 9 comments · 17 min read · LW link
What makes teaching math special · Viliam · Dec 17, 2023, 2:15 PM · 41 points · 27 comments · 11 min read · LW link