Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

garrison, 23 Oct 2024 23:40 UTC

118 points

1 comment, 7 min read, LW link

(garrisonlovely.substack.com)

A metaphor: what “green lights” for AGI would look like

Lorec, 23 Oct 2024 23:24 UTC

−1 points

6 comments, 2 min read, LW link

Motte-and-Bailey: a Short Explanation

Lorec, 23 Oct 2024 22:29 UTC

12 points

0 comments, 1 min read, LW link

Self-prediction acts as an emergent regularizer

Cameron Berg, Judd Rosenblatt, Mike Vaiana, Diogo de Lucena, florin_pop and AE Studio

23 Oct 2024 22:27 UTC

84 points

4 comments, 4 min read, LW link

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb, 23 Oct 2024 20:41 UTC

2 points

0 comments, 1 min read, LW link

(encodejustice.org)

Appealing to the Public

jefftk, 23 Oct 2024 19:00 UTC

16 points

0 comments, 5 min read, LW link

(www.jefftk.com)

Introducing Transluce — A Letter from the Founders

jsteinhardt, 23 Oct 2024 18:10 UTC

74 points

2 comments, 3 min read, LW link

(bounded-regret.ghost.io)

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël, 23 Oct 2024 17:48 UTC

41 points

17 comments, 6 min read, LW link

A bird’s eye view of ARC’s research

Jacob_Hilton, 23 Oct 2024 15:50 UTC

119 points

12 comments, 7 min read, LW link

(www.alignment.org)

[Question] Artificial V/S Organoid Intelligence

10xyz, 23 Oct 2024 14:31 UTC

5 points

0 comments, 1 min read, LW link

AI safety tax dynamics

owencb, 23 Oct 2024 12:18 UTC

22 points

0 comments, 6 min read, LW link

(strangecities.substack.com)

What is malevolence? On the nature, measurement, and distribution of dark traits

David Althaus, Chi Nguyen and Clare

23 Oct 2024 8:41 UTC

76 points

15 comments, 1 min read, LW link

Join a LessWrong Team for the Unaging System Challenge

Crissman, 23 Oct 2024 6:01 UTC

15 points

5 comments, 1 min read, LW link

Word Spaghetti

Gordon Seidoh Worley, 23 Oct 2024 5:39 UTC

18 points

9 comments, 3 min read, LW link

Monosemanticity & Quantization

Rahul Chand, 22 Oct 2024 22:57 UTC

1 point

0 comments, 9 min read, LW link

[Question] What is the alpha in one bit of evidence?

J Bostock, 22 Oct 2024 21:57 UTC

20 points

13 comments, 1 min read, LW link

Catastrophic sabotage as a major threat model for human-level AI systems

evhub, 22 Oct 2024 20:57 UTC

91 points

11 comments, 15 min read, LW link

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Elizabeth, 22 Oct 2024 18:20 UTC

75 points

79 comments, 1 min read, LW link

(acesounderglass.com)

Decision-Making Under Uncertainty: Lessons From AI

Jonasb, 22 Oct 2024 17:54 UTC

−1 points

0 comments, 5 min read, LW link

(www.denominations.io)

Testing Genetic Engineering Detection with Spike-Ins

jefftk, 22 Oct 2024 17:20 UTC

9 points

0 comments, 1 min read, LW link

(naobservatory.org)

Predictions as Public Works Project — What Metaculus Is Building Next

ChristianWilliams, 22 Oct 2024 16:35 UTC

4 points

0 comments, 1 min read, LW link

(www.metaculus.com)

Gorges of gender on a terrain of traits

dkl9, 22 Oct 2024 16:18 UTC

−7 points

1 comment, 3 min read, LW link

(dkl9.net)

A Defense of Peer Review

Niko_McCarty and delton137

22 Oct 2024 16:16 UTC

23 points

1 comment, 22 min read, LW link

(www.asimov.press)

BIG-Bench Canary Contamination in GPT-4

Jozdien, 22 Oct 2024 15:40 UTC

123 points

13 comments, 4 min read, LW link

[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF

Leon Lang, 22 Oct 2024 13:57 UTC

50 points

1 comment, 18 min read, LW link

(arxiv.org)

[Intuitive self-models] 6. Awakening / Enlightenment / PNSE

Steven Byrnes, 22 Oct 2024 13:23 UTC

62 points

8 comments, 21 min read, LW link

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplav, 22 Oct 2024 11:45 UTC

31 points

5 comments, 58 min read, LW link

Lenses of Control

WillPetillo, 22 Oct 2024 7:51 UTC

14 points

0 comments, 9 min read, LW link

A Brief Explanation of AI Control

Aaron_Scher, 22 Oct 2024 7:00 UTC

7 points

1 comment, 6 min read, LW link

Longevity, AI, and Cognitive Research Hackathon @ MIT

ekkolápto, 22 Oct 2024 6:19 UTC

1 point

0 comments, 1 min read, LW link

Conversational Signposts—An Antidote to Dull Social Interactions

Declan Molony, 22 Oct 2024 5:37 UTC

11 points

6 comments, 2 min read, LW link

I got dysentery so you don’t have to

eukaryote, 22 Oct 2024 4:55 UTC

315 points

4 comments, 17 min read, LW link

(eukaryotewritesblog.com)

Transformers Explained (Again)

RohanS, 22 Oct 2024 4:06 UTC

3 points

0 comments, 18 min read, LW link

Sleeping on Stage

jefftk, 22 Oct 2024 0:50 UTC

26 points

3 comments, 1 min read, LW link

(www.jefftk.com)

The Mask Comes Off: At What Price?

Zvi, 21 Oct 2024 23:50 UTC

71 points

16 comments, 8 min read, LW link

(thezvi.wordpress.com)

Distinguishing ways AI can be “concentrated”

Matthew Barnett, 21 Oct 2024 22:21 UTC

28 points

2 comments, 1 min read, LW link

Jailbreaking ChatGPT and Claude using Web API Context Injection

Jaehyuk Lim, 21 Oct 2024 21:34 UTC

4 points

0 comments, 3 min read, LW link

How to Teach Your Brain to Hate Procrastination

10xyz, 21 Oct 2024 20:12 UTC

3 points

0 comments, 2 min read, LW link

Pausing for what?

MountainPath, 21 Oct 2024 20:12 UTC

0 points

1 comment, 1 min read, LW link

What is autonomy? Why boundaries are necessary.

Chipmonk, 21 Oct 2024 17:56 UTC

8 points

1 comment, 1 min read, LW link

(chrislakin.blog)

Could randomly choosing people to serve as representatives lead to better government?

John Huang, 21 Oct 2024 17:10 UTC

75 points

13 comments, 10 min read, LW link

There aren’t enough smart people in biology doing something boring

Abhishaike Mahajan, 21 Oct 2024 15:52 UTC

27 points

13 comments, 10 min read, LW link

Automation collapse

Geoffrey Irving, Tomek Korbak and Benjamin Hilton

21 Oct 2024 14:50 UTC

70 points

9 comments, 7 min read, LW link

What AI companies should do: Some rough ideas

Zach Stein-Perlman, 21 Oct 2024 14:00 UTC

33 points

10 comments, 5 min read, LW link

[Question] What should OpenAI do that it hasn’t already done, to stop their vacancies from being advertised on the 80k Job Board?

WitheringWeights, 21 Oct 2024 13:57 UTC

21 points

0 comments, 1 min read, LW link

A Rocket–Interpretability Analogy

plex, 21 Oct 2024 13:55 UTC

149 points

31 comments, 1 min read, LW link

Tokyo AI Safety 2025: Call For Papers

Blaine, 21 Oct 2024 8:43 UTC

24 points

0 comments, 3 min read, LW link

(www.tais2025.cc)

OpenAI defected, but we can take honest actions

Remmelt, 21 Oct 2024 8:41 UTC

17 points

16 comments, 1 min read, LW link

Slightly More Than You Wanted To Know: Pregnancy Length Effects

JustisMills, 21 Oct 2024 1:26 UTC

62 points

4 comments, 5 min read, LW link

(justismills.substack.com)

Information vs Assurance

johnswentworth, 20 Oct 2024 23:16 UTC

185 points

17 comments, 2 min read, LW link