All 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 202220232024 2025

All Jan Feb Mar Apr May Jun Jul Aug SepOctNov Dec

All 1 234 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

My Mid-Career Transition into Biosecurity

jefftk2 Oct 2023 21:20 UTC

26 points

4 comments2 min readLW link

(www.jefftk.com)

Dall-E 3

p.b.2 Oct 2023 20:33 UTC

37 points

9 comments1 min readLW link

(openai.com)

Thomas Kwa’s MIRI research experience

Thomas Kwa, peterbarnett, Vivek Hebbar, Jeremy Gillen, Bird Concept and Raemon

2 Oct 2023 16:42 UTC

172 points

53 comments1 min readLW link

Population After a Catastrophe

Stan Pinsent2 Oct 2023 16:06 UTC

3 points

5 comments14 min readLW link

Expectations for Gemini: hopefully not a big deal

Maxime Riché2 Oct 2023 15:38 UTC

15 points

5 comments1 min readLW link

A counterexample for measurable factor spaces

Matthias G. Mayer2 Oct 2023 15:16 UTC

14 points

0 comments3 min readLW link

Will early transformative AIs primarily use text? [Manifold question]

Fabien Roger2 Oct 2023 15:05 UTC

16 points

0 comments3 min readLW link

energy landscapes of experts

bhauth2 Oct 2023 14:08 UTC

45 points

2 comments3 min readLW link

(www.bhauth.com)

Direction of Fit

NicholasKees2 Oct 2023 12:34 UTC

34 points

0 comments3 min readLW link

The 99% principle for personal problems

Kaj_Sotala2 Oct 2023 8:20 UTC

137 points

20 comments2 min readLW link

(kajsotala.fi)

Linkpost: They Studied Dishonesty. Was Their Work a Lie?

Linch2 Oct 2023 8:10 UTC

91 points

12 comments2 min readLW link

(www.newyorker.com)

Why I got the smallpox vaccine in 2023

joec2 Oct 2023 5:11 UTC

25 points

6 comments4 min readLW link

Instrumental Convergence and human extinction.

Spiritus Dei2 Oct 2023 0:41 UTC

−10 points

3 comments7 min readLW link

Revisiting the Manifold Hypothesis

Aidan Rocke1 Oct 2023 23:55 UTC

13 points

19 comments4 min readLW link

AI Alignment Breakthroughs this Week [new substack]

Logan Zoellner1 Oct 2023 22:13 UTC

0 points

8 comments2 min readLW link

[Question] Looking for study

Robert Feinstein1 Oct 2023 19:52 UTC

4 points

0 comments1 min readLW link

Join AISafety.info’s Distillation Hackathon (Oct 6-9th)

smallsilo1 Oct 2023 18:43 UTC

21 points

0 comments2 min readLW link

(forum.effectivealtruism.org)

Fifty Flips

abstractapplic1 Oct 2023 15:30 UTC

33 points

15 comments1 min readLW link 1 review

(h-b-p.github.io)

AI Safety Impact Markets: Your Charity Evaluator for AI Safety

Dawn Drescher1 Oct 2023 10:47 UTC

16 points

5 comments1 min readLW link

(impactmarkets.substack.com)

“Absence of Evidence is Not Evidence of Absence” As a Limit

transhumanist_atom_understander1 Oct 2023 8:15 UTC

16 points

1 comment2 min readLW link

New Tool: the Residual Stream Viewer

AdamYedidia1 Oct 2023 0:49 UTC

32 points

7 comments4 min readLW link

(tinyurl.com)

My Effortless Weightloss Story: A Quick Runthrough

CuoreDiVetro30 Sep 2023 23:02 UTC

123 points

78 comments9 min readLW link

Arguments for moral indefinability

Richard_Ngo30 Sep 2023 22:40 UTC

47 points

16 comments7 min readLW link

(www.thinkingcomplete.com)

Conditionals All The Way Down

lunatic_at_large30 Sep 2023 21:06 UTC

33 points

2 comments3 min readLW link

Focusing your impact on short vs long TAI timelines

kuhanj30 Sep 2023 19:34 UTC

4 points

0 comments10 min readLW link

How model editing could help with the alignment problem

Michael Ripa30 Sep 2023 17:47 UTC

12 points

1 comment15 min readLW link

My submission to the ALTER Prize

Lorxus30 Sep 2023 16:07 UTC

6 points

0 comments1 min readLW link

(www.docdroid.net)

Anki deck for learning the main AI safety orgs, projects, and programs

Bryce Robertson30 Sep 2023 16:06 UTC

2 points

0 comments1 min readLW link

The Lighthaven Campus is open for bookings

habryka30 Sep 2023 1:08 UTC

209 points

18 comments5 min readLW link

(www.lighthaven.space)

Headphones hook

philh29 Sep 2023 22:50 UTC

21 points

1 comment3 min readLW link

(reasonableapproximation.net)

Paul Christiano’s views on “doom” (video explainer)

Michaël Trazzi29 Sep 2023 21:56 UTC

15 points

0 comments1 min readLW link

(youtu.be)

The Retroactive Funding Landscape: Innovations for Donors and Grantmakers

Dawn Drescher29 Sep 2023 17:39 UTC

13 points

0 comments1 min readLW link

(impactmarkets.substack.com)

Bids To Defer On Value Judgements

johnswentworth29 Sep 2023 17:07 UTC

58 points

6 comments3 min readLW link

Announcing FAR Labs, an AI safety coworking space

Ben Goldhaber29 Sep 2023 16:52 UTC

95 points

0 comments1 min readLW link

A tool for searching rationalist & EA webs

Daniel_Friedrich29 Sep 2023 15:23 UTC

4 points

0 comments1 min readLW link

(ratsearch.blogspot.com)

Basic Mathematics of Predictive Coding

Adam Shai29 Sep 2023 14:38 UTC

49 points

6 comments9 min readLW link

“Diamondoid bacteria” nanobots: deadly threat or dead-end? A nanotech investigation

titotal29 Sep 2023 14:01 UTC

160 points

79 comments1 min readLW link

(titotal.substack.com)

Steering subsystems: capabilities, agency, and alignment

Seth Herd29 Sep 2023 13:45 UTC

31 points

0 comments8 min readLW link

Apply to Usable Security Prize by September 30

Allison Duettmann29 Sep 2023 13:39 UTC

4 points

0 comments1 min readLW link

List of how people have become more hard-working

Chi Nguyen29 Sep 2023 11:30 UTC

65 points

7 comments1 min readLW link

Resolving moral uncertainty with randomization

B Jacobs and Jobst Heitzig

29 Sep 2023 11:23 UTC

7 points

1 comment11 min readLW link

EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem

Elizabeth28 Sep 2023 23:30 UTC

317 points

250 comments22 min readLW link 2 reviews

(acesounderglass.com)

Competitive, Cooperative, and Cohabitive

Screwtape28 Sep 2023 23:25 UTC

49 points

13 comments5 min readLW link 1 review

The Coming Wave

PeterMcCluskey28 Sep 2023 22:59 UTC

25 points

1 comment6 min readLW link

(bayesianinvestor.com)

High-level interpretability: detecting an AI’s objectives

Paul Colognese and Jozdien

28 Sep 2023 19:30 UTC

71 points

4 comments21 min readLW link

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

JanB, Owain_Evans and SoerenMind

28 Sep 2023 18:53 UTC

187 points

39 comments3 min readLW link 1 review

Responsible scaling policy TLDR

lemonhope28 Sep 2023 18:51 UTC

9 points

0 comments1 min readLW link

Alignment Workshop talks

Richard_Ngo28 Sep 2023 18:26 UTC

37 points

1 comment1 min readLW link

(www.alignment-workshop.com)

My Current Thoughts on the AI Strategic Landscape

Jeffrey Heninger28 Sep 2023 17:59 UTC

11 points

28 comments14 min readLW link

My Arrogant Plan for Alignment

MrArrogant28 Sep 2023 17:51 UTC

2 points

6 comments6 min readLW link