Announcing the AI Forecasting Benchmark Series | July 8, $120k in Prizes

ChristianWilliams

2 Jul 2024 22:33 UTC

15 points

0 comments · 1 min read · LW link

(www.metaculus.com)

Open Sourcing Metaculus

ChristianWilliams

2 Jul 2024 22:30 UTC

44 points

0 comments · 1 min read · LW link

(www.metaculus.com)

[Question] Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?

MrThink

2 Jul 2024 20:13 UTC

4 points

23 comments · 1 min read · LW link

[Question] Why haven’t there been assassination attempts against high profile AI accelerationists like sam altman yet?

louisTrem

2 Jul 2024 18:16 UTC

−13 points

4 comments · 2 min read · LW link

How ARENA course material gets made

CallumMcDougall

2 Jul 2024 18:04 UTC

41 points

2 comments · 7 min read · LW link

An AI Race With China Can Be Better Than Not Racing

niplav

2 Jul 2024 17:57 UTC

69 points

33 comments · 11 min read · LW link

List of Collective Intelligence Projects

Chipmonk

2 Jul 2024 14:10 UTC

40 points

9 comments · 2 min read · LW link

(chrislakin.blog)

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning

keith_wynroe and Lee Sharkey

2 Jul 2024 13:17 UTC

81 points

7 comments · 12 min read · LW link

Economics Roundup #2

Zvi

2 Jul 2024 12:40 UTC

35 points

5 comments · 23 min read · LW link

(thezvi.wordpress.com)

How Congressional Offices Process Constituent Communication

Tristan Williams

2 Jul 2024 12:38 UTC

24 points

0 comments · 1 min read · LW link

OthelloGPT learned a bag of heuristics

jylin04, JackS, Adam Karvonen and Can

2 Jul 2024 9:12 UTC

108 points

10 comments · 9 min read · LW link

Blueprint for a Brighter Future

Alex Beyman

2 Jul 2024 6:15 UTC

−1 points

0 comments · 5 min read · LW link

Covert Malicious Finetuning

Tony Wang and dannyhalawi

2 Jul 2024 2:41 UTC

89 points

4 comments · 3 min read · LW link

Interpreting Preference Models w/ Sparse Autoencoders

Logan Riggs and Jannik Brinkmann

1 Jul 2024 21:35 UTC

74 points

12 comments · 9 min read · LW link

Honest science is spirituality

pchvykov

1 Jul 2024 20:33 UTC

−1 points

10 comments · 4 min read · LW link

New Executive Team & Board — PIBBSS

Nora_Ammann

1 Jul 2024 19:30 UTC

43 points

1 comment · 1 min read · LW link

Uncursing Civilization

Lorec

1 Jul 2024 18:44 UTC

−6 points

2 comments · 5 min read · LW link

[Question] Self-censoring on AI x-risk discussions?

Decaeneus

1 Jul 2024 18:24 UTC

17 points

2 comments · 1 min read · LW link

Rationalists As People Who Build Piles Of Rocks

Sable

1 Jul 2024 10:32 UTC

9 points

0 comments · 5 min read · LW link

(affablyevil.substack.com)

How good are LLMs at doing ML on an unknown dataset?

Håvard Tveit Ihle

1 Jul 2024 9:04 UTC

33 points

4 comments · 13 min read · LW link

Whirlwind Tour of Chain of Thought Literature Relevant to Automating Alignment Research.

sevdeawesome

1 Jul 2024 5:50 UTC

25 points

0 comments · 17 min read · LW link

Probabilistic Logic ⇔ Oracles?

Yudhister Kumar

1 Jul 2024 5:36 UTC

15 points

0 comments · 4 min read · LW link

Important open problems in voting

Closed Limelike Curves

1 Jul 2024 2:53 UTC

33 points

1 comment · 1 min read · LW link

Anti-Circumcision Essay 3 of 3: Now That I Think About It, Is There Actually a Space Between “Info” and “Hazard”? Isn’t It Just One Word?

Harry Stevenage

1 Jul 2024 2:21 UTC

12 points

0 comments · 7 min read · LW link

In Defense of Lawyers Playing Their Part

Isaac King

1 Jul 2024 1:32 UTC

32 points

9 comments · 9 min read · LW link

Anti-circumcision Essay 2 of 3: Physical and Psychological Realities

Harry Stevenage

30 Jun 2024 22:13 UTC

12 points

5 comments · 9 min read · LW link

Review of METR’s public evaluation protocol

nahoj and JaimeRV

30 Jun 2024 22:03 UTC

10 points

0 comments · 5 min read · LW link

Superposition, Self-Modeling, and the Path to AGI: A New Perspective

Peterpiper

30 Jun 2024 17:20 UTC

−13 points

0 comments · 2 min read · LW link

Anti-Circumcision Essay 1 of 3: According To Their Critics, Intactivists Are The Best-Behaved Protest Movement In History

Harry Stevenage

30 Jun 2024 17:17 UTC

12 points

6 comments · 5 min read · LW link

The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement

hamishtodd1

30 Jun 2024 15:34 UTC

6 points

3 comments · 1 min read · LW link

LLMs Universally Learn a Feature Representing Token Frequency / Rarity

Sean Osier

30 Jun 2024 2:48 UTC

12 points

5 comments · 6 min read · LW link

(github.com)

My 5-step program for losing weight

Nikita Sokolsky

30 Jun 2024 1:05 UTC

22 points

20 comments · 5 min read · LW link

(nsokolsky.substack.com)

Datasets that change the odds you exist

dynomight

29 Jun 2024 18:45 UTC

56 points

4 comments · 6 min read · LW link

(dynomight.net)

A “Scaling Monosemanticity” Explainer

latterframe and Fedor Ryzhenkov

29 Jun 2024 17:50 UTC

10 points

0 comments · 3 min read · LW link

Analysis of key AI analogies

Kevin Kohler

29 Jun 2024 10:55 UTC

10 points

2 comments · 15 min read · LW link

Georgism Crash Course

Zero Contradictions

29 Jun 2024 6:18 UTC

9 points

5 comments · 1 min read · LW link

(zerocontradictions.net)

Activation Pattern SVD: A proposal for SAE Interpretability

Daniel Tan

28 Jun 2024 22:12 UTC

15 points

2 comments · 2 min read · LW link

Podcast: Elizabeth & Austin on “What Manifold was allowed to do”

Austin Chen

28 Jun 2024 22:10 UTC

20 points

0 comments · 1 min read · LW link

(share.descript.com)

The Incredible Fentanyl-Detecting Machine

sarahconstantin

28 Jun 2024 22:10 UTC

154 points

26 comments · 7 min read · LW link

(sarahconstantin.substack.com)

Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game

James Stephen Brown

28 Jun 2024 19:29 UTC

6 points

0 comments · 5 min read · LW link

(nonzerosum.games)

Mentorship in AGI Safety: Applications for mentorship are open!

Valentin2026 and Joe Rogero

28 Jun 2024 14:49 UTC

5 points

0 comments · 1 min read · LW link

Contra Acemoglu on AI

Maxwell Tabarrok

28 Jun 2024 13:13 UTC

48 points

0 comments · 5 min read · LW link

(www.maximum-progress.com)

Five toy worlds to think about heritability

David Hugh-Jones

28 Jun 2024 13:11 UTC

13 points

0 comments · 9 min read · LW link

(wyclif.substack.com)

[Question] How do natural sciences prove causation?

Kongo Landwalker

28 Jun 2024 11:58 UTC

1 point

3 comments · 1 min read · LW link

LessWrong/ACX meetup Transilvanya tour—Sibiu

Marius Adrian Nicoară

28 Jun 2024 11:41 UTC

1 point

1 comment · 1 min read · LW link

Bayes’ Theorem: In Search of Gold (Lesson 1)

bayesyatina

28 Jun 2024 8:39 UTC

3 points

0 comments · 3 min read · LW link

How a chip is designed

YM

28 Jun 2024 8:04 UTC

65 points

4 comments · 5 min read · LW link

The Wisdom of Living for 200 Years

Martin Sustrik

28 Jun 2024 4:44 UTC

25 points

3 comments · 4 min read · LW link

A Generally Intelligent Game

snerx

28 Jun 2024 1:31 UTC

−1 points

1 comment · 4 min read · LW link

Corrigibility = Tool-ness?

johnswentworth and David Lorell

28 Jun 2024 1:19 UTC

78 points

8 comments · 9 min read · LW link