My AI Model Delta Compared To Christiano

johnswentworth · Jun 12, 2024, 6:19 PM
191 points
73 comments · 4 min read · LW link

What's Going on With OpenAI's Messaging?

ozziegooen · May 21, 2024, 2:22 AM
191 points
13 comments · LW link

Two easy things that maybe Just Work to improve AI discourse

Bird Concept · Jun 8, 2024, 3:51 PM
190 points
35 comments · 2 min read · LW link

OMMC Announces RIP

Apr 1, 2024, 11:20 PM
189 points
5 comments · 2 min read · LW link

A basic systems architecture for AI agents that do autonomous research

Buck · Sep 23, 2024, 1:58 PM
189 points
16 comments · 8 min read · LW link

My Interview With Cade Metz on His Reporting About Slate Star Codex

Zack_M_Davis · Mar 26, 2024, 5:18 PM
189 points
187 comments · 6 min read · LW link

Shallow review of technical AI safety, 2024

Dec 29, 2024, 12:01 PM
189 points
34 comments · 41 min read · LW link

On Not Pulling The Ladder Up Behind You

Screwtape · Apr 26, 2024, 9:58 PM
188 points
21 comments · 9 min read · LW link

Skills from a year of Purposeful Rationality Practice

Raemon · Sep 18, 2024, 2:05 AM
187 points
18 comments · 7 min read · LW link

Information vs Assurance

johnswentworth · Oct 20, 2024, 11:16 PM
187 points
17 comments · 2 min read · LW link

Daniel Kahneman has died

DanielFilan · Mar 27, 2024, 3:59 PM
186 points
11 comments · 1 min read · LW link
(www.washingtonpost.com)

Contra Ngo et al. "Every 'Every Bay Area House Party' Bay Area House Party"

Ricki Heicklen · Feb 22, 2024, 11:56 PM
186 points
5 comments · 4 min read · LW link
(bayesshammai.substack.com)

This is already your second chance

Malmesbury · Jul 28, 2024, 5:13 PM
185 points
13 comments · 8 min read · LW link

Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer

Apr 18, 2024, 12:27 AM
185 points
21 comments · 7 min read · LW link

Humming is not a free $100 bill

Elizabeth · Jun 6, 2024, 8:10 PM
185 points
6 comments · 3 min read · LW link
(acesounderglass.com)

Struggling like a Shadowmoth

Raemon · Sep 24, 2024, 12:47 AM
183 points
38 comments · 7 min read · LW link

Introducing Alignment Stress-Testing at Anthropic

evhub · Jan 12, 2024, 11:51 PM
182 points
23 comments · 2 min read · LW link

Contra papers claiming superhuman AI forecasting

Sep 12, 2024, 6:10 PM
182 points
16 comments · 7 min read · LW link

Every “Every Bay Area House Party” Bay Area House Party

Richard_Ngo · Feb 16, 2024, 6:53 PM
181 points
6 comments · 4 min read · LW link

Safety consultations for AI lab employees

Zach Stein-Perlman · Jul 27, 2024, 3:00 PM
181 points
4 comments · 1 min read · LW link

[Question] Why is o1 so deceptive?

abramdemski · Sep 27, 2024, 5:27 PM
180 points
24 comments · 3 min read · LW link

My motivation and theory of change for working in AI healthtech

Andrew_Critch · Oct 12, 2024, 12:36 AM
178 points
37 comments · 14 min read · LW link

Toward a Broader Conception of Adverse Selection

Ricki Heicklen · Mar 14, 2024, 10:40 PM
177 points
61 comments · 13 min read · LW link
(bayesshammai.substack.com)

FHI (Future of Humanity Institute) has shut down (2005–2024)

gwern · Apr 17, 2024, 1:54 PM
176 points
22 comments · 1 min read · LW link
(www.futureofhumanityinstitute.org)

WTH is Cerebrolysin, actually?

Aug 6, 2024, 8:40 PM
175 points
23 comments · 17 min read · LW link

When Is Insurance Worth It?

kqr · Dec 19, 2024, 7:07 PM
173 points
71 comments · 4 min read · LW link
(entropicthoughts.com)

Timaeus's First Four Months

Feb 28, 2024, 5:01 PM
173 points
6 comments · 6 min read · LW link

Three Subtle Examples of Data Leakage

abstractapplic · Oct 1, 2024, 8:45 PM
172 points
16 comments · 4 min read · LW link

Did Christopher Hitchens change his mind about waterboarding?

Isaac King · Sep 15, 2024, 8:28 AM
171 points
22 comments · 7 min read · LW link

'Empiricism!' as Anti-Epistemology

Eliezer Yudkowsky · Mar 14, 2024, 2:02 AM
171 points
92 comments · 25 min read · LW link

Reconsider the anti-cavity bacteria if you are Asian

Lao Mein · Apr 15, 2024, 7:02 AM
170 points
43 comments · 4 min read · LW link

o1: A Technical Primer

Jesse Hoogland · Dec 9, 2024, 7:09 PM
170 points
19 comments · 9 min read · LW link
(www.youtube.com)

And All the Shoggoths Merely Players

Zack_M_Davis · Feb 10, 2024, 7:56 PM
170 points
57 comments · 12 min read · LW link

Overcoming Bias Anthology

Arjun Panickssery · Oct 20, 2024, 2:01 AM
169 points
14 comments · 2 min read · LW link
(overcoming-bias-anthology.com)

Recommendation: reports on the search for missing hiker Bill Ewasko

eukaryote · Jul 31, 2024, 10:15 PM
169 points
28 comments · 14 min read · LW link
(eukaryotewritesblog.com)

Masterpiece

Richard_Ngo · Feb 13, 2024, 11:10 PM
166 points
21 comments · 4 min read · LW link
(www.narrativeark.xyz)

Gradient Routing: Masking Gradients to Localize Computation in Neural Networks

Dec 6, 2024, 10:19 PM UTC
165 points
12 comments · 11 min read · LW link
(arxiv.org)

You can remove GPT2's LayerNorm by fine-tuning for an hour

StefanHex · Aug 8, 2024, 6:33 PM UTC
165 points
11 comments · 8 min read · LW link

Boycott OpenAI

PeterMcCluskey · Jun 18, 2024, 7:52 PM UTC
164 points
26 comments · 1 min read · LW link
(bayesianinvestor.com)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!

abstractapplic · Oct 26, 2024, 12:34 PM UTC
164 points
16 comments · 7 min read · LW link

Tips for Empirical Alignment Research

Ethan Perez · Feb 29, 2024, 6:04 AM UTC
163 points
4 comments · 23 min read · LW link

Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data

Jun 21, 2024, 3:54 PM UTC
163 points
13 comments · 8 min read · LW link
(arxiv.org)

Announcing ILIAD — Theoretical AI Alignment Conference

Jun 5, 2024, 9:37 AM UTC
163 points
18 comments · 2 min read · LW link

Many arguments for AI x-risk are wrong

TurnTrout · Mar 5, 2024, 2:31 AM UTC
162 points
87 comments · 12 min read · LW link

The Median Researcher Problem

johnswentworth · Nov 2, 2024, 8:16 PM UTC
161 points
70 comments · 1 min read · LW link

o1 is a bad idea

abramdemski · Nov 11, 2024, 9:20 PM UTC
161 points
39 comments · 2 min read · LW link

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI

Jan 26, 2024, 7:22 AM UTC
161 points
60 comments · 57 min read · LW link

Sycophancy to subterfuge: Investigating reward tampering in large language models

Jun 17, 2024, 6:41 PM UTC
161 points
22 comments · 8 min read · LW link
(arxiv.org)

Making every researcher seek grants is a broken model

jasoncrawford · Jan 26, 2024, 4:06 PM UTC
159 points
41 comments · 4 min read · LW link
(rootsofprogress.org)

DeepMind's "Frontier Safety Framework" is weak and unambitious

Zach Stein-Perlman · May 18, 2024, 3:00 AM UTC
159 points
14 comments · 4 min read · LW link