Superposition is not “just” neuron polysemanticity

LawrenceC 26 Apr 2024 23:22 UTC

64 points

4 comments · 13 min read · LW link

D&D.Sci Long War: Defender of Data-mocracy

aphyer 26 Apr 2024 22:30 UTC

44 points

20 comments · 4 min read · LW link

On Not Pulling The Ladder Up Behind You

Screwtape 26 Apr 2024 21:58 UTC

188 points

21 comments · 9 min read · LW link

We are headed into an extreme compute overhang

devrandom 26 Apr 2024 21:38 UTC

53 points

33 comments · 2 min read · LW link

[Concept Dependency] Edge Regular Lattice Graph

Johannes C. Mayer 26 Apr 2024 21:14 UTC

9 points

1 comment · 1 min read · LW link

[Concept Dependency] Concept Dependency Posts

Johannes C. Mayer 26 Apr 2024 20:57 UTC

10 points

3 comments · 2 min read · LW link

[Question] Wouldn’t weak AI agents provide warning?

Mandatory Topic 26 Apr 2024 19:34 UTC

5 points

0 comments · 1 min read · LW link

World models

A* 26 Apr 2024 19:11 UTC

1 point

0 comments · 1 min read · LW link

Duct Tape security

Isaac King 26 Apr 2024 18:57 UTC

68 points

11 comments · 5 min read · LW link

Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?

Gordon Seidoh Worley 26 Apr 2024 18:10 UTC

11 points

2 comments · 32 min read · LW link

Scaling of AI training runs will slow down after GPT-5

Maxime Riché 26 Apr 2024 16:05 UTC

40 points

5 comments · 3 min read · LW link

Spatial attention as a “tell” for empathetic simulation?

Steven Byrnes 26 Apr 2024 15:10 UTC

55 points

12 comments · 8 min read · LW link

Arch-anarchy

Peter lawless 26 Apr 2024 15:05 UTC

−1 points

1 comment · 25 min read · LW link

Breadboarding a Whistle Synth

jefftk 26 Apr 2024 15:00 UTC

9 points

2 comments · 2 min read · LW link

(www.jefftk.com)

An Introduction to AI Sandbagging

Teun van der Weij, Felix Hofstätter and Francis Rhys Ward

26 Apr 2024 13:40 UTC

45 points

13 comments · 8 min read · LW link

LLMs seem (relatively) safe

JustisMills 25 Apr 2024 22:13 UTC

53 points

24 comments · 7 min read · LW link

(justismills.substack.com)

Losing Faith In Contrarianism

omnizoid 25 Apr 2024 20:53 UTC

38 points

44 comments · 5 min read · LW link

Why I stopped being into basin broadness

tailcalled 25 Apr 2024 20:47 UTC

16 points

3 comments · 2 min read · LW link

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan 25 Apr 2024 19:10 UTC

20 points

1 comment · 63 min read · LW link

Improving Dictionary Learning with Gated Sparse Autoencoders

Senthooran Rajamanoharan, Arthur Conmy, lewis smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah and Neel Nanda

25 Apr 2024 18:43 UTC

63 points

38 comments · 1 min read · LW link

(arxiv.org)

“Why I Write” by George Orwell (1946)

Arjun Panickssery 25 Apr 2024 16:02 UTC

58 points

2 comments · 9 min read · LW link

(www.orwellfoundation.com)

Knowledge Base 8: The truth as an attractor in the information space

iwis 25 Apr 2024 15:28 UTC

−8 points

0 comments · 2 min read · LW link

Cybersecurity of Frontier AI Models: A Regulatory Review

Deric Cheng and Elliot Mckernon

25 Apr 2024 14:51 UTC

8 points

0 comments · 8 min read · LW link

The first future and the best future

KatjaGrace 25 Apr 2024 6:40 UTC

106 points

12 comments · 1 min read · LW link

(worldspiritsockpuppet.com)

NIH Cancer Myths Myths

belkarx and henry

25 Apr 2024 5:43 UTC

15 points

1 comment · 2 min read · LW link

social lemon markets

bhauth 25 Apr 2024 2:18 UTC

22 points

6 comments · 3 min read · LW link

(www.bhauth.com)

Bayesian inference without priors

DanielFilan 24 Apr 2024 23:50 UTC

26 points

8 comments · 8 min read · LW link

(danielfilan.com)

The Inner Ring by C. S. Lewis

Saul Munn 24 Apr 2024 22:48 UTC

69 points

6 comments · 13 min read · LW link

(www.lewissociety.org)

This is Water by David Foster Wallace

Nathan Young 24 Apr 2024 21:21 UTC

58 points

16 comments · 13 min read · LW link

(fs.blog)

Is being a trans woman (or just low-T) +20 IQ?

lemonhope 24 Apr 2024 20:04 UTC

6 points

29 comments · 1 min read · LW link

Betadine oral rinses for covid and other viral infections

Elizabeth 24 Apr 2024 17:50 UTC

22 points

3 comments · 5 min read · LW link

(acesounderglass.com)

At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan”

Bill Benzon 24 Apr 2024 14:56 UTC

−3 points

0 comments · 4 min read · LW link

Magic by forgetting

avturchin 24 Apr 2024 14:32 UTC

18 points

39 comments · 4 min read · LW link

Changes in College Admissions

Zvi 24 Apr 2024 13:50 UTC

50 points

11 comments · 39 min read · LW link

(thezvi.wordpress.com)

1-page outline of Carlsmith’s otherness and control series

Nathan Young 24 Apr 2024 11:25 UTC

22 points

3 comments · 3 min read · LW link

How to use and interpret activation patching

StefanHex and Neel Nanda

24 Apr 2024 8:35 UTC

12 points

0 comments · 18 min read · LW link

AI Generated Music as a Method of Installing Essential Rationalist Skills

keltan 24 Apr 2024 7:48 UTC

13 points

3 comments · 1 min read · LW link

Electronic Harp Mandolin Prototype

jefftk 24 Apr 2024 2:20 UTC

9 points

0 comments · 1 min read · LW link

(www.jefftk.com)

[Question] Examples of Highly Counterfactual Discoveries?

johnswentworth 23 Apr 2024 22:19 UTC

194 points

101 comments · 1 min read · LW link

[Question] Is there software to practice reading expressions?

lsusr 23 Apr 2024 21:53 UTC

37 points

10 comments · 1 min read · LW link

Let’s Design A School, Part 1

Sable 23 Apr 2024 21:50 UTC

55 points

5 comments · 11 min read · LW link

(affablyevil.substack.com)

WSJ: Inside Amazon’s Secret Operation to Gather Intel on Rivals

trevor 23 Apr 2024 21:33 UTC

37 points

5 comments · 5 min read · LW link

(www.wsj.com)

On Minicircle

Metacelsus 23 Apr 2024 21:28 UTC

10 points

0 comments · 1 min read · LW link

(docs.google.com)

Simple probes can catch sleeper agents

Monte M, Carson Denison, Zac Hatfield-Dodds, David Duvenaud, Sam Bowman, Ethan Perez and evhub

23 Apr 2024 21:10 UTC

133 points

21 comments · 1 min read · LW link

(www.anthropic.com)

Manifold “exploring real cash prizes”

Rana Dexsin 23 Apr 2024 21:07 UTC

7 points

0 comments · 1 min read · LW link

(manifoldmarkets.notion.site)

[Question] (When) Should you work through the night when inspiration strikes you?

Chi Nguyen 23 Apr 2024 21:07 UTC

21 points

4 comments · 1 min read · LW link

Book review: Deep Utopia

PeterMcCluskey 23 Apr 2024 19:55 UTC

45 points

14 comments · 4 min read · LW link

(bayesianinvestor.com)

On what research policymakers actually need

MondSemmel 23 Apr 2024 19:50 UTC

38 points

0 comments · 3 min read · LW link

(www.slowboring.com)

Dequantifying first-order theories

jessicata 23 Apr 2024 19:04 UTC

40 points

9 comments · 8 min read · LW link

(unstableontology.com)

Vector Planning in a Lattice Graph

Johannes C. Mayer and Thomas Kehrenberg

23 Apr 2024 16:58 UTC

20 points

7 comments · 2 min read · LW link