Some reasons to not say “Doomer”
Ruby | 9 Jul 2023 21:05 UTC | 46 points | 18 comments | 4 min read
The Seeker’s Game – Vignettes from the Bay
Yulia | 9 Jul 2023 19:32 UTC | 137 points | 19 comments | 16 min read
[Question] Why have exposure notification apps been (mostly) discontinued?
VipulNaik | 9 Jul 2023 19:07 UTC | 10 points | 5 comments | 2 min read
[Question] The Necessity of Privacy: A Condition for Social Change and Experimentation?
Blake | 9 Jul 2023 18:42 UTC | −8 points | 1 comment | 1 min read
Attempting to Deconstruct “Real”
herschel | 9 Jul 2023 16:40 UTC | 21 points | 23 comments | 2 min read
Quick proposal: Decision market regrantor using manifund (please improve)
Nathan Young | 9 Jul 2023 12:49 UTC | 10 points | 5 comments | 5 min read
[Question] Where are the people building AGI in the non-dumb way?
Johannes C. Mayer | 9 Jul 2023 11:39 UTC | 10 points | 19 comments | 2 min read
[Question] What to read on the “informal multi-world model”?
mishka | 9 Jul 2023 4:48 UTC | 13 points | 23 comments | 1 min read
Whether LLMs “understand” anything is mostly a terminological dispute
RobertM | 9 Jul 2023 3:31 UTC | 10 points | 1 comment | 1 min read
Taboo Truth
Tomás B. | 8 Jul 2023 23:23 UTC | 36 points | 16 comments | 2 min read
“View”
herschel | 8 Jul 2023 23:19 UTC | 6 points | 0 comments | 2 min read
[Question] H5N1. Just how bad is the situation?
Q Home | 8 Jul 2023 22:09 UTC | 16 points | 8 comments | 1 min read
A Two-Part System for Practical Self-Care
Jonathan Moregård | 8 Jul 2023 21:23 UTC | 11 points | 0 comments | 3 min read | (honestliving.substack.com)
Really Strong Features Found in Residual Stream
Logan Riggs | 8 Jul 2023 19:40 UTC | 69 points | 6 comments | 2 min read
Eight Strategies for Tackling the Hard Part of the Alignment Problem
scasper | 8 Jul 2023 18:55 UTC | 42 points | 11 comments | 7 min read
“Concepts of Agency in Biology” (Okasha, 2023) - Brief Paper Summary
Nora_Ammann | 8 Jul 2023 18:22 UTC | 40 points | 3 comments | 7 min read
Blanchard’s Dangerous Idea and the Plight of the Lucid Crossdreamer
Zack_M_Davis | 8 Jul 2023 18:03 UTC | 38 points | 135 comments | 72 min read | (unremediatedgender.space)
Continuous Adversarial Quality Assurance: Extending RLHF and Constitutional AI
Benaya Koren | 8 Jul 2023 17:32 UTC | 6 points | 0 comments | 9 min read
Commentless downvoting is not a good way to fight infohazards
DirectedEvolution | 8 Jul 2023 17:29 UTC | 6 points | 9 comments | 3 min read
[Question] Why does anxiety (?) make me dumb?
TeaTieAndHat | 8 Jul 2023 16:13 UTC | 18 points | 14 comments | 3 min read
Economic Time Bomb: An Overlooked Employment Bubble Threatening the US Economy
Glenn Clayton | 8 Jul 2023 15:19 UTC | 4 points | 10 comments | 6 min read
What is everyone doing in AI governance
Igor Ivanov | 8 Jul 2023 15:16 UTC | 10 points | 0 comments | 5 min read
LLM misalignment can probably be found without manual prompt engineering
ProgramCrafter | 8 Jul 2023 14:35 UTC | 1 point | 0 comments | 1 min read
You must not fool yourself, and you are the easiest person to fool
Richard_Ngo | 8 Jul 2023 14:05 UTC | 33 points | 5 comments | 4 min read
Fixed Point: a love story
Richard_Ngo | 8 Jul 2023 13:56 UTC | 95 points | 2 comments | 7 min read
Announcing AI Alignment workshop at the ALIFE 2023 conference
rorygreig | 8 Jul 2023 13:52 UTC | 16 points | 0 comments | 1 min read | (humanvaluesandartificialagency.com)
3D Printed Talkbox Cap
jefftk | 8 Jul 2023 13:00 UTC | 9 points | 0 comments | 1 min read | (www.jefftk.com)
Writing this post as rationality case study
Ben Amitay | 8 Jul 2023 12:24 UTC | 10 points | 8 comments | 2 min read
[Question] What Does LessWrong/EA Think of Human Intelligence Augmentation as of mid-2023?
lukemarks | 8 Jul 2023 11:42 UTC | 84 points | 28 comments | 2 min read
[Question] Request for feedback—infohazards in testing LLMs for causal reasoning?
DirectedEvolution | 8 Jul 2023 9:01 UTC | 16 points | 0 comments | 2 min read
Views on when AGI comes and on strategy to reduce existential risk
TsviBT | 8 Jul 2023 9:00 UTC | 103 points | 33 comments | 14 min read
Weekday Evening Beach Picnics
jefftk | 8 Jul 2023 2:20 UTC | 2 points | 4 comments | 1 min read | (www.jefftk.com)
ACI#4: Seed AI is the new Perpetual Motion Machine
Akira Pyinya | 8 Jul 2023 1:17 UTC | −7 points | 0 comments | 6 min read
[Question] Links to discussions on social equilibrium and human value after (aligned) super-AI?
Michael Tontchev | 8 Jul 2023 1:01 UTC | 7 points | 3 comments | 1 min read
Notes from the Qatar Center for Global Banking and Finance 3rd Annual Conference
PixelatedPenguin | 7 Jul 2023 23:48 UTC | 2 points | 0 comments | 1 min read
Introducing bayescalc.io
Adele Lopez | 7 Jul 2023 16:11 UTC | 114 points | 29 comments | 1 min read | (bayescalc.io)
Meetup Tip: Ask Attendees To Explain It
Screwtape | 7 Jul 2023 16:08 UTC | 10 points | 0 comments | 4 min read
Interpreting Modular Addition in MLPs
Bart Bussmann | 7 Jul 2023 9:22 UTC | 19 points | 0 comments | 6 min read
Internal independent review for language model agent alignment
Seth Herd | 7 Jul 2023 6:54 UTC | 55 points | 30 comments | 11 min read
[Question] Can LessWrong provide me with something I find obviously highly useful to my own practical life?
agrippa | 7 Jul 2023 3:08 UTC | 32 points | 4 comments | 1 min read
ask me about technology
bhauth | 7 Jul 2023 2:03 UTC | 23 points | 42 comments | 1 min read
Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research
mako yass | 7 Jul 2023 1:20 UTC | 41 points | 5 comments | 2 min read | (www.defense.gov)
What are the best non-LW places to read on alignment progress?
Raemon | 7 Jul 2023 0:57 UTC | 50 points | 14 comments | 1 min read
Two paths to win the AGI transition
Nathan Helm-Burger | 6 Jul 2023 21:59 UTC | 11 points | 8 comments | 4 min read
Empirical Evidence Against “The Longest Training Run”
NickGabs | 6 Jul 2023 18:32 UTC | 24 points | 0 comments | 14 min read
Progress Studies Fellowship looking for members
jay ram | 6 Jul 2023 17:41 UTC | 3 points | 0 comments | 1 min read
BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?
Peter Berggren | 6 Jul 2023 17:32 UTC | 17 points | 6 comments | 2 min read
Layering and Technical Debt in the Global Wayfinding Model
herschel | 6 Jul 2023 17:30 UTC | 14 points | 0 comments | 3 min read
Localizing goal misgeneralization in a maze-solving policy network
jan betley | 6 Jul 2023 16:21 UTC | 37 points | 2 comments | 7 min read
Jesse Hoogland on Developmental Interpretability and Singular Learning Theory
Michaël Trazzi | 6 Jul 2023 15:46 UTC | 42 points | 2 comments | 4 min read | (theinsideview.ai)