
Follow up to medical miracle

Elizabeth Nov 4, 2022, 6:00 PM

76 points

5 comments 6 min read LW link

(acesounderglass.com)

Cross-Void Optimization

pneumynym Nov 4, 2022, 5:47 PM

1 point

1 comment 8 min read LW link

Monthly Shorts 10/22

Celer Nov 4, 2022, 4:30 PM

12 points

0 comments 6 min read LW link

(keller.substack.com)

Weekly Roundup #4

Zvi Nov 4, 2022, 3:00 PM

42 points

1 comment 6 min read LW link

(thezvi.wordpress.com)

A new place to discuss cognitive science, ethics and human alignment

Daniel_Friedrich Nov 4, 2022, 2:34 PM

3 points

4 comments LW link

A newcomer’s guide to the technical AI safety field

zeshen Nov 4, 2022, 2:29 PM

42 points

3 comments 10 min read LW link

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones Nov 4, 2022, 12:58 AM

13 points

3 comments 3 min read LW link

[Question] Don’t you think RLHF solves outer alignment?

Charbel-Raphaël Nov 4, 2022, 12:36 AM

9 points

23 comments 1 min read LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”)

David Scott Krueger (formerly: capybaralet) Nov 3, 2022, 11:19 PM

28 points

3 comments 1 min read LW link

[Question] Could a Supreme Court suit work to solve NEPA problems?

ChristianKl Nov 3, 2022, 9:10 PM

15 points

0 comments 1 min read LW link

[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament—Veritaserum

mako yass Nov 3, 2022, 9:04 PM

17 points

1 comment LW link

Further considerations on the Evidentialist’s Wager

Martín Soto Nov 3, 2022, 8:06 PM

3 points

9 comments 8 min read LW link

AI as a Civilizational Risk Part 6/6: What can be done

PashaKamyshev Nov 3, 2022, 7:48 PM

2 points

4 comments 4 min read LW link

A Mystery About High Dimensional Concept Encoding

Fabien Roger Nov 3, 2022, 5:05 PM

46 points

13 comments 7 min read LW link

Why do we post our AI safety plans on the Internet?

Peter S. Park Nov 3, 2022, 4:02 PM

4 points

4 comments 11 min read LW link

Multiple Deploy-Key Repos

jefftk Nov 3, 2022, 3:10 PM

15 points

0 comments 1 min read LW link

(www.jefftk.com)

Covid 11/3/22: Asking Forgiveness

Zvi Nov 3, 2022, 1:50 PM

23 points

3 comments 6 min read LW link

(thezvi.wordpress.com)

Adversarial Policies Beat Professional-Level Go AIs

sanxiyn Nov 3, 2022, 1:27 PM

31 points

35 comments 1 min read LW link

(goattack.alignmentfund.org)

K-types vs T-types — what priors do you have?

Cleo Nardo Nov 3, 2022, 11:29 AM

74 points

25 comments 7 min read LW link

Information Markets 2: Optimally Shaped Reward Bets

eva_ Nov 3, 2022, 11:08 AM

9 points

0 comments 3 min read LW link

The Rational Utilitarian Love Movement (A Historical Retrospective)

Caleb Biddulph Nov 3, 2022, 7:11 AM

3 points

0 comments LW link

The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter

mako yass Nov 3, 2022, 6:47 AM

30 points

13 comments 10 min read LW link

Open Letter Against Reckless Nuclear Escalation and Use

Max Tegmark Nov 3, 2022, 5:34 AM

27 points

25 comments 1 min read LW link

Lazy Python Argument Parsing

jefftk Nov 3, 2022, 2:20 AM

20 points

3 comments 1 min read LW link

(www.jefftk.com)

AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk

PashaKamyshev Nov 3, 2022, 2:19 AM

2 points

0 comments 7 min read LW link

[Question] Is there a good way to award a fixed prize in a prediction contest?

jchan Nov 2, 2022, 9:37 PM

18 points

5 comments 1 min read LW link

“Are Experiments Possible?” Seeds of Science call for reviewers

rogersbacon Nov 2, 2022, 8:05 PM

8 points

0 comments 1 min read LW link

Humans do acausal coordination all the time

Adam Jermyn Nov 2, 2022, 2:40 PM

57 points

35 comments 3 min read LW link

Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)

Davidmanheim Nov 2, 2022, 12:57 PM

73 points

27 comments 4 min read LW link

(twitter.com)

Housing and Transit Thoughts #1

Zvi Nov 2, 2022, 12:10 PM

35 points

5 comments 16 min read LW link

(thezvi.wordpress.com)

Mind is uncountable

Filip Sondej Nov 2, 2022, 11:51 AM

18 points

22 comments LW link

AI Safety Needs Great Product Builders

goodgravy Nov 2, 2022, 11:33 AM

14 points

2 comments LW link

Why is fiber good for you?

braces Nov 2, 2022, 2:04 AM

18 points

2 comments 2 min read LW link

Information Markets

eva_ Nov 2, 2022, 1:24 AM

46 points

6 comments 12 min read LW link

Sequence Reread: Fake Beliefs [plus sequence spotlight meta]

Raemon Nov 2, 2022, 12:09 AM

27 points

3 comments 1 min read LW link

Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?

Neel Nanda Nov 1, 2022, 11:56 PM

69 points

16 comments 1 min read LW link

(youtu.be)

All AGI Safety questions welcome (especially basic ones) [~monthly thread]

Robert Miles Nov 1, 2022, 11:23 PM

68 points

105 comments 2 min read LW link

[Question] Which Issues in Conceptual Alignment have been Formalised or Observed (or not)?

ojorgensen Nov 1, 2022, 10:32 PM

4 points

0 comments 1 min read LW link

AI as a Civilizational Risk Part 4/6: Bioweapons and Philosophy of Modification

PashaKamyshev Nov 1, 2022, 8:50 PM

7 points

1 comment 8 min read LW link

Open & Welcome Thread—November 2022

MondSemmel Nov 1, 2022, 6:47 PM

14 points

46 comments 1 min read LW link

Mildly Against Donor Lotteries

jefftk Nov 1, 2022, 6:10 PM

10 points

9 comments 3 min read LW link

(www.jefftk.com)

Progress links and tweets, 2022-11-01

jasoncrawford Nov 1, 2022, 5:48 PM

16 points

4 comments 3 min read LW link

(rootsofprogress.org)

On the correspondence between AI-misalignment and cognitive dissonance using a behavioral economics model

Stijn Bruers Nov 1, 2022, 5:39 PM

4 points

0 comments 6 min read LW link

Threat Model Literature Review

zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar and Elliot Catt

1 Nov 2022 11:03 UTC

78 points

4 comments 25 min read LW link

Clarifying AI X-risk

zac_kenton, Rohin Shah, David Lindner, Vikrant Varma, Vika, Mary Phuong, Ramana Kumar and Elliot Catt

1 Nov 2022 11:03 UTC

127 points

24 comments 4 min read LW link 1 review

Auditing games for high-level interpretability

Paul Colognese 1 Nov 2022 10:44 UTC

33 points

1 comment 7 min read LW link

Remember to translate your thoughts back again

brook 1 Nov 2022 8:49 UTC

25 points

11 comments 3 min read LW link

(forum.effectivealtruism.org)

Conversations on Alcohol Consumption

Annapurna 1 Nov 2022 5:09 UTC

20 points

6 comments 9 min read LW link

ML Safety Scholars Summer 2022 Retrospective

TW123 1 Nov 2022 3:09 UTC

29 points

0 comments LW link

EA & LW Forums Weekly Summary (24 − 30th Oct 22′)

Zoe Williams 1 Nov 2022 2:58 UTC

13 points

1 comment LW link