LessWrong Archive: November 2022, Page 3
[Question] Has anyone increased their AGI timelines? · Darren McKee · Nov 6, 2022, 12:03 AM · 39 points · 12 comments · 1 min read · LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques · Vika, Vikrant Varma, Ramana Kumar and Rohin Shah · Nov 25, 2022, 2:36 PM · 39 points · 9 comments · 6 min read · LW link (vkrakovna.wordpress.com)

A caveat to the Orthogonality Thesis · Wuschel Schulz · Nov 9, 2022, 3:06 PM · 38 points · 10 comments · 2 min read · LW link

Internal communication framework · rosehadshar and Nora_Ammann · Nov 15, 2022, 12:41 PM · 38 points · 13 comments · 12 min read · LW link

Choosing the right dish · Adam Zerner · Nov 19, 2022, 1:38 AM · 38 points · 7 comments · 8 min read · LW link

[Question] Is there any discussion on avoiding being Dutch-booked or otherwise taken advantage of one's bounded rationality by refusing to engage? · Shmi · Nov 7, 2022, 2:36 AM · 38 points · 29 comments · 1 min read · LW link

How do I start a programming career in the West? · Lao Mein · Nov 25, 2022, 6:37 AM · 38 points · 7 comments · 2 min read · LW link

Feeling Old: Leaving your 20s in the 2020s · squidious · Nov 22, 2022, 10:50 PM · 37 points · 3 comments · 1 min read · LW link (opalsandbonobos.blogspot.com)

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas · Orpheus16 · Nov 25, 2022, 8:47 PM · 37 points · 2 comments · 9 min read · LW link

Discussing how to align Transformative AI if it's developed very soon · elifland · Nov 28, 2022, 4:17 PM · 37 points · 2 comments · 28 min read · LW link

If Professional Investors Missed This... · jefftk · Nov 16, 2022, 3:00 PM · 37 points · 18 comments · 3 min read · LW link (www.jefftk.com)

Simulators, constraints, and goal agnosticism: porbynotes vol. 1 · porby · Nov 23, 2022, 4:22 AM · 37 points · 2 comments · 35 min read · LW link

User-Controlled Algorithmic Feeds · jefftk · Nov 12, 2022, 3:20 PM · 35 points · 7 comments · 2 min read · LW link (www.jefftk.com)

Some research ideas in forecasting · Jsevillamol · Nov 15, 2022, 7:47 PM · 35 points · 2 comments · LW link

Housing and Transit Thoughts #1 · Zvi · Nov 2, 2022, 12:10 PM · 35 points · 5 comments · 16 min read · LW link (thezvi.wordpress.com)

[Hebbian Natural Abstractions] Introduction · Samuel Nellessen and Jan · Nov 21, 2022, 8:34 PM · 34 points · 3 comments · 4 min read · LW link (www.snellessen.com)

Value Formation: An Overarching Model · Thane Ruthenis · Nov 15, 2022, 5:16 PM · 34 points · 20 comments · 34 min read · LW link

Solstice 2022 Roundup · dspeyer · Nov 12, 2022, 9:26 PM · 34 points · 12 comments · 1 min read · LW link

Ways to buy time · Orpheus16, OliviaJ and Thomas Larsen · Nov 12, 2022, 7:31 PM · 34 points · 23 comments · 12 min read · LW link

Weekly Roundup #5 · Zvi · Nov 11, 2022, 4:20 PM · 33 points · 0 comments · 6 min read · LW link (thezvi.wordpress.com)

Thinking About Mastodon · jefftk · Nov 7, 2022, 7:40 PM · 33 points · 17 comments · 1 min read · LW link (www.jefftk.com)

People care about each other even though they have imperfect motivational pointers? · TurnTrout · Nov 8, 2022, 6:15 PM · 33 points · 25 comments · 7 min read · LW link

Covid 11/17/22: Slow Recovery · Zvi · Nov 17, 2022, 2:50 PM · 33 points · 3 comments · 4 min read · LW link (thezvi.wordpress.com)

Auditing games for high-level interpretability · Paul Colognese · Nov 1, 2022, 10:44 AM · 33 points · 1 comment · 7 min read · LW link

Make the Drought Evaporate! · AnthonyRepetto · Nov 19, 2022, 11:41 PM · 32 points · 25 comments · 3 min read · LW link

Charging for the Dharma · jchan · Nov 11, 2022, 2:02 PM · 32 points · 18 comments · 5 min read · LW link

Why bet Kelly? · AlexMennen · Nov 15, 2022, 6:12 PM · 32 points · 14 comments · 5 min read · LW link

Review: LOVE in a simbox · PeterMcCluskey · Nov 27, 2022, 5:41 PM · 32 points · 4 comments · 9 min read · LW link (bayesianinvestor.com)

When should we be surprised that an invention took "so long"? · jasoncrawford · Nov 16, 2022, 8:04 PM · 32 points · 11 comments · 4 min read · LW link (rootsofprogress.org)

Covid 11/10/22: Into the Background · Zvi · Nov 10, 2022, 1:40 PM · 31 points · 5 comments · 4 min read · LW link (thezvi.wordpress.com)

Adversarial Policies Beat Professional-Level Go AIs · sanxiyn · Nov 3, 2022, 1:27 PM · 31 points · 35 comments · 1 min read · LW link (goattack.alignmentfund.org)

Unpacking "Shard Theory" as Hunch, Question, Theory, and Insight · Jacy Reese Anthis · Nov 16, 2022, 1:54 PM · 31 points · 9 comments · 2 min read · LW link

Gliders in Language Models · Alexandre Variengien · Nov 25, 2022, 12:38 AM · 30 points · 11 comments · 10 min read · LW link

A Walkthrough of Interpretability in the Wild (w/ authors Kevin Wang, Arthur Conmy & Alexandre Variengien) · Neel Nanda · Nov 7, 2022, 10:39 PM · 30 points · 15 comments · 3 min read · LW link (youtu.be)

What videos should Rational Animations make? · Writer · Nov 26, 2022, 8:28 PM · 30 points · 24 comments · LW link

The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter · mako yass · Nov 3, 2022, 6:47 AM · 30 points · 13 comments · 10 min read · LW link

ML Safety Scholars Summer 2022 Retrospective · TW123 · Nov 1, 2022, 3:09 AM · 29 points · 0 comments · LW link

You won't solve alignment without agent foundations · Mikhail Samin · Nov 6, 2022, 8:07 AM · 29 points · 3 comments · 8 min read · LW link

Response · Jarred Filmer · Nov 6, 2022, 1:03 AM · 29 points · 2 comments · 12 min read · LW link

Good Futures Initiative: Winter Project Internship · Aris · Nov 27, 2022, 11:41 PM · 28 points · 4 comments · 4 min read · LW link

The economy as an analogy for advanced AI systems · rosehadshar and particlemania · Nov 15, 2022, 11:16 AM · 28 points · 0 comments · 5 min read · LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to "cars and elephants") · David Scott Krueger (formerly: capybaralet) · Nov 3, 2022, 11:19 PM · 28 points · 3 comments · 1 min read · LW link

A short critique of Vanessa Kosoy's PreDCA · Martín Soto · Nov 13, 2022, 4:00 PM · 28 points · 8 comments · 4 min read · LW link

Semi-conductor/AI Stock Discussion. · sapphire · Nov 25, 2022, 11:35 PM · 28 points · 25 comments · 1 min read · LW link

Estimating the probability that FTX Future Fund grant money gets clawed back · spencerg · Nov 14, 2022, 3:33 AM · 28 points · 6 comments · LW link

Toy Models and Tegum Products · Adam Jermyn · Nov 4, 2022, 6:51 PM · 28 points · 7 comments · 5 min read · LW link

LLMs may capture key components of human agency · catubc · Nov 17, 2022, 8:14 PM · 27 points · 0 comments · 4 min read · LW link

Why I'm Working On Model Agnostic Interpretability · Jessica Rumbelow · Nov 11, 2022, 9:24 AM · 27 points · 9 comments · 2 min read · LW link

The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) · Jessica Rumbelow · Nov 17, 2022, 11:06 AM · 27 points · 2 comments · 2 min read · LW link

Inverse scaling can become U-shaped · Edouard Harris · Nov 8, 2022, 7:04 PM · 27 points · 15 comments · 1 min read · LW link (arxiv.org)