Singular Learning Theory for Dummies

Rahul Chand

15 Oct 2024 21:13 UTC

2 points

0 comments · 8 min read · LW link

Distillation Of DeepSeek-Prover V1.5

IvanLin

15 Oct 2024 18:53 UTC

4 points

1 comment · 3 min read · LW link

Improving Model-Written Evals for AI Safety Benchmarking

Sunishchal Dev and Marius Hobbhahn

15 Oct 2024 18:25 UTC

27 points

0 comments · 18 min read · LW link

Taking nonlogical concepts seriously

Kris Brown

15 Oct 2024 18:16 UTC

7 points

5 comments · 18 min read · LW link

(topos.site)

Rashomon—A newsbetting site

ideasthete

15 Oct 2024 18:15 UTC

23 points

8 comments · 1 min read · LW link

On the Practical Applications of Interpretability

Nick Jiang

15 Oct 2024 17:18 UTC

3 points

0 comments · 7 min read · LW link

Anthropic’s updated Responsible Scaling Policy

Zac Hatfield-Dodds

15 Oct 2024 16:46 UTC

51 points

3 comments · 3 min read · LW link

(www.anthropic.com)

[Question] When is reward ever the optimization target?

Noosphere89

15 Oct 2024 15:09 UTC

35 points

12 comments · 1 min read · LW link

An Opinionated Evals Reading List

Marius Hobbhahn and Jérémy Scheurer

15 Oct 2024 14:38 UTC

65 points

0 comments · 13 min read · LW link

(www.apolloresearch.ai)

Anthropic rewrote its RSP

Zach Stein-Perlman

15 Oct 2024 14:25 UTC

46 points

19 comments · 6 min read · LW link

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder

Steven Byrnes

15 Oct 2024 13:31 UTC

58 points

7 comments · 11 min read · LW link

Economics Roundup #4

Zvi

15 Oct 2024 13:20 UTC

19 points

4 comments · 25 min read · LW link

(thezvi.wordpress.com)

[Question] Is School of Thought related to the Rationality Community?

Shoshannah Tekofsky

15 Oct 2024 12:41 UTC

7 points

11 comments · 1 min read · LW link

Inverse Problems In Everyday Life

silentbob

15 Oct 2024 11:42 UTC

14 points

2 comments · 8 min read · LW link

Thinking LLMs: General Instruction Following with Thought Generation

Bogdan Ionut Cirstea

15 Oct 2024 9:21 UTC

7 points

0 comments · 1 min read · LW link

(arxiv.org)

Thoughts On the Nature of Capability Elicitation via Fine-tuning

Theodore Chapman

15 Oct 2024 8:39 UTC

8 points

0 comments · 8 min read · LW link

Minimal Motivation of Natural Latents

johnswentworth and David Lorell

14 Oct 2024 22:51 UTC

44 points

14 comments · 3 min read · LW link

How long should political (and other) terms be?

ohmurphy

14 Oct 2024 21:38 UTC

5 points

0 comments · 1 min read · LW link

(ohmurphy.substack.com)

Examples of How I Use LLMs

jefftk

14 Oct 2024 17:10 UTC

29 points

2 comments · 2 min read · LW link

(www.jefftk.com)

It’s important to know when to stop: Mechanistic Exploration of Gemma 2 List Generation

Gerard Boxo

14 Oct 2024 17:04 UTC

8 points

0 comments · 6 min read · LW link

(gboxo.github.io)

[Question] LW resources on childhood experiences?

nahir91595

14 Oct 2024 17:04 UTC

10 points

7 comments · 1 min read · LW link

Free Will, Neurotypical Dominance, and the Path to ASI and Neuralinks: Evolving Beyond Scarcity

j_passeri

14 Oct 2024 16:54 UTC

−2 points

3 comments · 3 min read · LW link

Breakthroughs, Neurodivergence, and Working Outside the System

j_passeri

14 Oct 2024 16:54 UTC

1 point

3 comments · 2 min read · LW link

The case for unlearning that removes information from LLM weights

Fabien Roger

14 Oct 2024 14:08 UTC

96 points

15 comments · 6 min read · LW link

Circuits in Superposition: Compressing many small neural networks into one

Lucius Bushnaq and jake_mendel

14 Oct 2024 13:06 UTC

127 points

8 comments · 13 min read · LW link

Beyond Defensive Technology

ejk64

14 Oct 2024 11:34 UTC

11 points

1 comment · 10 min read · LW link

Why Stop AI is barricading OpenAI

Remmelt

14 Oct 2024 7:12 UTC

−16 points

32 comments · 1 min read · LW link

(docs.google.com)

The Explore vs. Exploit Dilemma

nathanjzhao

14 Oct 2024 6:20 UTC

1 point

0 comments · 1 min read · LW link

(nathanzhao.cc)

AI Alignment via Slow Substrates: Early Empirical Results With StarCraft II

Lester Leong

14 Oct 2024 4:05 UTC

60 points

9 comments · 12 min read · LW link

some questionable space launch guns

bhauth

13 Oct 2024 22:52 UTC

17 points

0 comments · 4 min read · LW link

(bhauth.com)

[Question] What are your favorite books or blogs that are out of print, or whose domains have expired (especially if they also aren’t on LibGen/Wayback/etc, or on Amazon)?

Arjun Panickssery

13 Oct 2024 20:21 UTC

13 points

4 comments · 1 min read · LW link

The Hopium Wars: the AGI Entente Delusion

Max Tegmark

13 Oct 2024 17:00 UTC

198 points

55 comments · 9 min read · LW link

Parental Writing Selection Bias

jefftk

13 Oct 2024 14:00 UTC

52 points

3 comments · 1 min read · LW link

(www.jefftk.com)

Personal Philosophy

Xor

13 Oct 2024 3:01 UTC

3 points

0 comments · 2 min read · LW link

Contagious Beliefs—Simulating Political Alignment

James Stephen Brown

13 Oct 2024 0:27 UTC

8 points

0 comments · 2 min read · LW link

(nonzerosum.games)

Binary encoding as a simple explicit construction for superposition

tailcalled

12 Oct 2024 21:18 UTC

12 points

0 comments · 1 min read · LW link

[Question] How Should We Use Limited Time to Maximize Long-Term Impact?

queelius

12 Oct 2024 20:02 UTC

10 points

3 comments · 1 min read · LW link

A Percentage Model of a Person

Sable

12 Oct 2024 17:55 UTC

37 points

3 comments · 9 min read · LW link

(affablyevil.substack.com)

AI Compute governance: Verifying AI chip location

Farhan

12 Oct 2024 17:36 UTC

5 points

0 comments · 6 min read · LW link

Geoffrey Hinton on the Past, Present, and Future of AI

Stephen McAleese

12 Oct 2024 16:41 UTC

22 points

5 comments · 18 min read · LW link

[Question] I = W/T?

HNX

12 Oct 2024 15:15 UTC

0 points

3 comments · 1 min read · LW link

AI research assistants competition 2024Q3: Tie between Elicit and You.com

Elizabeth

12 Oct 2024 15:10 UTC

64 points

4 comments · 3 min read · LW link

(acesounderglass.com)

SAE features for refusal and sycophancy steering vectors

neverix, Dmitrii Kharlapenko, Arthur Conmy and Neel Nanda

12 Oct 2024 14:54 UTC

26 points

4 comments · 7 min read · LW link

Prices are Bounties

Maxwell Tabarrok

12 Oct 2024 14:51 UTC

51 points

13 comments · 2 min read · LW link

(www.maximum-progress.com)

Differential knowledge interconnection

Roman Leventov

12 Oct 2024 12:52 UTC

5 points

0 comments · 7 min read · LW link

Most arguments for AI Doom are either bad or weak

Logan Zoellner

12 Oct 2024 11:57 UTC

2 points

97 comments · 3 min read · LW link

Kassel ACX/LW Meetup

Fernand0

12 Oct 2024 7:47 UTC

2 points

0 comments · 1 min read · LW link

Neural Network And Newton’s Second Law

Max Ma

12 Oct 2024 6:25 UTC

−10 points

0 comments · 1 min read · LW link

[Question] If the DoJ goes through with the Google breakup, where does Deepmind end up?

O O

12 Oct 2024 5:06 UTC

5 points

1 comment · 1 min read · LW link

My motivation and theory of change for working in AI healthtech

Andrew_Critch

12 Oct 2024 0:36 UTC

169 points

37 comments · 14 min read · LW link