Activation Pattern SVD: A proposal for SAE Interpretability

Daniel Tan

28 Jun 2024 22:12 UTC

15 points

2 comments, 2 min read, LW link

Podcast: Elizabeth & Austin on “What Manifold was allowed to do”

Austin Chen

28 Jun 2024 22:10 UTC

20 points

0 comments, 1 min read, LW link

(share.descript.com)

The Incredible Fentanyl-Detecting Machine

sarahconstantin

28 Jun 2024 22:10 UTC

154 points

26 comments, 7 min read, LW link

(sarahconstantin.substack.com)

Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game

James Stephen Brown

28 Jun 2024 19:29 UTC

6 points

0 comments, 5 min read, LW link

(nonzerosum.games)

Mentorship in AGI Safety: Applications for mentorship are open!

Valentin2026 and Joe Rogero

28 Jun 2024 14:49 UTC

5 points

0 comments, 1 min read, LW link

Contra Acemoglu on AI

Maxwell Tabarrok

28 Jun 2024 13:13 UTC

48 points

0 comments, 5 min read, LW link

(www.maximum-progress.com)

Five toy worlds to think about heritability

David Hugh-Jones

28 Jun 2024 13:11 UTC

13 points

0 comments, 9 min read, LW link

(wyclif.substack.com)

[Question] How do natural sciences prove causation?

Kongo Landwalker

28 Jun 2024 11:58 UTC

1 point

3 comments, 1 min read, LW link

LessWrong/ACX meetup Transilvanya tour—Sibiu

Marius Adrian Nicoară

28 Jun 2024 11:41 UTC

1 point

1 comment, 1 min read, LW link

Bayes’ Theorem: In Search of Gold (Lesson 1)

bayesyatina

28 Jun 2024 8:39 UTC

3 points

0 comments, 3 min read, LW link

How a chip is designed

YM

28 Jun 2024 8:04 UTC

65 points

4 comments, 5 min read, LW link

The Wisdom of Living for 200 Years

Martin Sustrik

28 Jun 2024 4:44 UTC

25 points

3 comments, 4 min read, LW link

A Generally Intelligent Game

snerx

28 Jun 2024 1:31 UTC

−1 points

1 comment, 4 min read, LW link

Corrigibility = Tool-ness?

johnswentworth and David Lorell

28 Jun 2024 1:19 UTC

78 points

8 comments, 9 min read, LW link

Situational Awareness

PeterMcCluskey

28 Jun 2024 1:08 UTC

11 points

0 comments, 12 min read, LW link

(bayesianinvestor.com)

Toward a taxonomy of cognitive benchmarks for agentic AGIs

Ben Smith

27 Jun 2024 23:50 UTC

15 points

0 comments, 5 min read, LW link

How Big a Deal are MatMul-Free Transformers?

JustisMills

27 Jun 2024 22:28 UTC

19 points

6 comments, 5 min read, LW link

(justismills.substack.com)

Secondary forces of debt

KatjaGrace

27 Jun 2024 21:10 UTC

77 points

18 comments, 2 min read, LW link

(worldspiritsockpuppet.com)

Distillation of ‘Do language models plan for future tokens’

TheManxLoiner

27 Jun 2024 20:57 UTC

26 points

2 comments, 6 min read, LW link

how birds sense magnetic fields

bhauth

27 Jun 2024 18:59 UTC

51 points

4 comments, 5 min read, LW link

(www.bhauth.com)

Representation Tuning

Christopher Ackerman

27 Jun 2024 17:44 UTC

35 points

9 comments, 13 min read, LW link

An issue with training schemers with supervised fine-tuning

Fabien Roger

27 Jun 2024 15:37 UTC

49 points

12 comments, 6 min read, LW link

AI #70: A Beautiful Sonnet

Zvi

27 Jun 2024 14:40 UTC

38 points

0 comments, 44 min read, LW link

(thezvi.wordpress.com)

Detecting Genetically Engineered Viruses With Metagenomic Sequencing

jefftk

27 Jun 2024 14:01 UTC

87 points

10 comments, 1 min read, LW link

(naobservatory.org)

Cross Robin

jefftk

27 Jun 2024 3:10 UTC

11 points

2 comments, 1 min read, LW link

(www.jefftk.com)

Live Theory Part 0: Taking Intelligence Seriously

Sahil

26 Jun 2024 21:37 UTC

94 points

3 comments, 8 min read, LW link

Instrumental vs Terminal Desiderata

Max Harms

26 Jun 2024 20:57 UTC

21 points

0 comments, 3 min read, LW link

Imbue (Generally Intelligent) continue to make progress

Nathan Helm-Burger

26 Jun 2024 20:41 UTC

18 points

0 comments, 1 min read, LW link

(imbue.com)

Tracing the steps

matimissona

26 Jun 2024 19:22 UTC

−8 points

2 comments, 4 min read, LW link

Countering AI disinformation and deep fakes with digital signatures

Dave Lindbergh

26 Jun 2024 18:09 UTC

13 points

5 comments, 1 min read, LW link

Progress Conference 2024: Toward Abundant Futures

jasoncrawford

26 Jun 2024 15:39 UTC

40 points

2 comments, 1 min read, LW link

(rootsofprogress.org)

Schelling points in the AGI policy space

mesaoptimizer

26 Jun 2024 13:19 UTC

52 points

2 comments, 6 min read, LW link

Bad lessons learned from the debate

bayesyatina

26 Jun 2024 11:54 UTC

8 points

5 comments, 6 min read, LW link

Childhood and Education Roundup #6: College Edition

Zvi

26 Jun 2024 11:40 UTC

28 points

8 comments, 23 min read, LW link

(thezvi.wordpress.com)

New fast transformer inference ASIC — Sohu by Etched

lemonhope

26 Jun 2024 9:56 UTC

8 points

9 comments, 1 min read, LW link

(www.etched.com)

Empirical vs. Mathematical Joints of Nature

Elizabeth and Alex_Altair

26 Jun 2024 1:55 UTC

35 points

1 comment, 5 min read, LW link

My Current Claims and Cruxes on LLM Forecasting & Epistemics

ozziegooen

26 Jun 2024 0:40 UTC

11 points

0 comments, 1 min read, LW link

In favour of exploring nagging doubts about x-risk

owencb

25 Jun 2024 23:52 UTC

105 points

2 comments, 1 min read, LW link

What is a Tool?

johnswentworth and David Lorell

25 Jun 2024 23:40 UTC

62 points

4 comments, 6 min read, LW link

[Question] When do alignment researchers retire?

Jordan Taylor

25 Jun 2024 23:30 UTC

4 points

2 comments, 1 min read, LW link

Compute Governance Literature Review

sijarvis

25 Jun 2024 22:41 UTC

10 points

0 comments, 13 min read, LW link

Computational Complexity as an Intuition Pump for LLM Generality

aribrill

25 Jun 2024 20:25 UTC

18 points

6 comments, 3 min read, LW link

Failure Modes of Teaching AI Safety

Eleni Angelou

25 Jun 2024 19:07 UTC

20 points

0 comments, 1 min read, LW link

Kingfisher Summer Tour 2024

jefftk

25 Jun 2024 18:50 UTC

9 points

0 comments, 1 min read, LW link

(www.jefftk.com)

Incentive Learning vs Dead Sea Salt Experiment

Steven Byrnes

25 Jun 2024 17:49 UTC

27 points

1 comment, 28 min read, LW link

An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs

Adam Karvonen

25 Jun 2024 15:57 UTC

25 points

0 comments, 9 min read, LW link

(adamkarvonen.github.io)

Formal verification, heuristic explanations and surprise accounting

Jacob_Hilton

25 Jun 2024 15:40 UTC

156 points

11 comments, 9 min read, LW link

(www.alignment.org)

Metastrategy get-started guide

Tahp

25 Jun 2024 15:04 UTC

5 points

1 comment, 8 min read, LW link

Labor Participation is an Alignment Risk

alex

25 Jun 2024 14:15 UTC

−5 points

2 comments, 17 min read, LW link

Monthly Roundup #19: June 2024

Zvi

25 Jun 2024 12:00 UTC

28 points

9 comments, 54 min read, LW link

(thezvi.wordpress.com)