Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

RowanWang, Alexandre Variengien, Arthur Conmy, Buck and jsteinhardt

Oct 28, 2022, 11:55 PM

101 points

9 comments 9 min read LW link 2 reviews

(arxiv.org)

Resources that (I think) new alignment researchers should know about

Orpheus16 Oct 28, 2022, 10:13 PM

70 points

9 comments 4 min read LW link

How often does One Person succeed?

Mayank Modi Oct 28, 2022, 7:32 PM

1 point

3 comments LW link

aisafety.community—A living document of AI safety communities

zeshen and plex

Oct 28, 2022, 5:50 PM

58 points

23 comments 1 min read LW link

Rapid Test Throat Swabbing?

jefftk Oct 28, 2022, 4:30 PM

18 points

2 comments 1 min read LW link

(www.jefftk.com)

Join the interpretability research hackathon

Esben Kran Oct 28, 2022, 4:26 PM

15 points

0 comments LW link

Syncretism

Annapurna Oct 28, 2022, 4:08 PM

16 points

4 comments 1 min read LW link

(jorgevelez.substack.com)

Pondering computation in the real world

Adam Shai Oct 28, 2022, 3:57 PM

24 points

13 comments 5 min read LW link

Ukraine and the Crimea Question

ChristianKl Oct 28, 2022, 12:26 PM

−2 points

153 comments 11 min read LW link

New book on s-risks

Tobias_Baumann Oct 28, 2022, 9:36 AM

68 points

1 comment LW link

Cryptic symbols

Adam Scherlis Oct 28, 2022, 6:44 AM

6 points

17 comments 1 min read LW link

(adam.scherlis.com)

All life’s helpers’ beliefs

Tehdastehdas Oct 28, 2022, 5:47 AM

−12 points

1 comment 5 min read LW link

Prizes for ML Safety Benchmark Ideas

joshc Oct 28, 2022, 2:51 AM

36 points

5 comments 1 min read LW link

Worldview iPeople—Future Fund’s AI Worldview Prize

Toni MUENDEL Oct 28, 2022, 1:53 AM

−22 points

4 comments 9 min read LW link

Anatomy of change

Jose Miguel Cruz y Celis Oct 28, 2022, 1:21 AM

1 point

0 comments 1 min read LW link

Nash equilibria of symmetric zero-sum games

Ege Erdil Oct 27, 2022, 11:50 PM

14 points

0 comments 14 min read LW link

[Question] Good psychology books/books that contain good psychological models?

shuffled-cantaloupe Oct 27, 2022, 11:04 PM

1 point

1 comment 1 min read LW link

Podcast: The Left and Effective Altruism with Habiba Islam

garrison Oct 27, 2022, 5:41 PM

2 points

2 comments LW link

Lessons from ‘Famine, Affluence, and Morality’ and its reflection on today.

Mayank Modi Oct 27, 2022, 5:20 PM

4 points

0 comments LW link

[Question] Is the Orthogonality Thesis true for humans?

Noosphere89 Oct 27, 2022, 2:41 PM

12 points

20 comments 1 min read LW link

Historicism in the math-adjacent sciences

mrcbarbier Oct 27, 2022, 2:38 PM

3 points

0 comments 5 min read LW link

How Risky Is Trick-or-Treating?

jefftk Oct 27, 2022, 2:10 PM

58 points

18 comments 2 min read LW link

(www.jefftk.com)

Covid 10/27/22: Another Origin Story

Zvi Oct 27, 2022, 1:40 PM

32 points

1 comment 13 min read LW link

(thezvi.wordpress.com)

[Question] Why are probabilities represented as real numbers instead of rational numbers?

Yaakov T Oct 27, 2022, 11:23 AM

5 points

9 comments 1 min read LW link

Five Areas I Wish EAs Gave More Focus

Prometheus Oct 27, 2022, 6:13 AM

13 points

18 comments LW link

Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

maxnadeau, Xander Davies, Buck and Nate Thomas

Oct 27, 2022, 1:32 AM

135 points

14 comments 12 min read LW link

[Question] Quantum Suicide and Aumann’s Agreement Theorem

Isaac King Oct 27, 2022, 1:32 AM

16 points

20 comments 1 min read LW link

Reslab Request for Information: EA hardware projects

Joel Becker Oct 26, 2022, 9:13 PM

10 points

0 comments LW link

A list of Petrov buttons

philh Oct 26, 2022, 8:50 PM

19 points

8 comments 5 min read LW link

(reasonableapproximation.net)

The Game of Antonyms

Faustify Oct 26, 2022, 7:26 PM

4 points

6 comments 8 min read LW link

Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind]

LawrenceC Oct 26, 2022, 6:45 PM

29 points

5 comments 1 min read LW link

(arxiv.org)

[Question] How to become more articulate?

just_browsing Oct 26, 2022, 2:43 PM

19 points

14 comments 1 min read LW link

Open Bands: Leading Rhythm

jefftk Oct 26, 2022, 2:30 PM

10 points

0 comments 4 min read LW link

(www.jefftk.com)

Signals of war in August 2021

yieldthought Oct 26, 2022, 8:11 AM

70 points

16 comments 2 min read LW link

Trigger-based rapid checklists

VipulNaik Oct 26, 2022, 4:05 AM

44 points

0 comments 9 min read LW link

Why some people believe in AGI, but I don’t.

cveres Oct 26, 2022, 3:09 AM

−15 points

6 comments LW link

Intent alignment should not be the goal for AGI x-risk reduction

John Nay Oct 26, 2022, 1:24 AM

1 point

10 comments 3 min read LW link

Reinforcement Learning Goal Misgeneralization: Can we guess what kind of goals are selected by default?

StefanHex and Julian_R

Oct 25, 2022, 8:48 PM

15 points

2 comments 4 min read LW link

A Walkthrough of A Mathematical Framework for Transformer Circuits

Neel Nanda Oct 25, 2022, 8:24 PM

52 points

7 comments 1 min read LW link

(www.youtube.com)

Nothing.

rogersbacon Oct 25, 2022, 4:33 PM

−10 points

4 comments 6 min read LW link

(www.secretorum.life)

Maps and Blueprint; the Two Sides of the Alignment Equation

Nora_Ammann Oct 25, 2022, 4:29 PM

24 points

1 comment 5 min read LW link

Consider Applying to the Future Fellowship at MIT

jefftk Oct 25, 2022, 3:40 PM

29 points

0 comments 1 min read LW link

(www.jefftk.com)

Beyond Kolmogorov and Shannon

Alexander Gietelink Oldenziel and Adam Shai

Oct 25, 2022, 3:13 PM

63 points

22 comments 5 min read LW link

What does it take to defend the world against out-of-control AGIs?

Steven Byrnes Oct 25, 2022, 2:47 PM

212 points

51 comments 30 min read LW link 1 review

Refine: what helped me write more?

Alexander Gietelink Oldenziel Oct 25, 2022, 2:44 PM

12 points

0 comments 2 min read LW link

Logical Decision Theories: Our final failsafe?

Noosphere89 Oct 25, 2022, 12:51 PM

−7 points

8 comments 1 min read LW link

(www.lesswrong.com)

What will the scaled up GATO look like? (Updated with questions)

Amal Oct 25, 2022, 12:44 PM

34 points

22 comments 1 min read LW link

Mechanism Design for AI Safety—Reading Group Curriculum

Rubi J. Hudson Oct 25, 2022, 3:54 AM

15 points

3 comments LW link

Furry Rationalists & Effective Anthropomorphism both exist

agentydragon Oct 25, 2022, 3:37 AM

42 points

3 comments 1 min read LW link

EA & LW Forums Weekly Summary (17 − 23 Oct 22′)

Zoe Williams Oct 25, 2022, 2:57 AM

10 points

0 comments LW link
