Current themes in mechanistic interpretability research

Lee Sharkey, Sid Black and beren

Nov 16, 2022, 2:14 PM

89 points

2 comments 12 min read LW link

Unpacking “Shard Theory” as Hunch, Question, Theory, and Insight

Jacy Reese Anthis

Nov 16, 2022, 1:54 PM

31 points

9 comments 2 min read LW link

Miracles and why not to believe them

mruwnik

Nov 16, 2022, 12:07 PM

4 points

0 comments 2 min read LW link

[Question] How do people do remote research collaborations effectively?

Krieger

Nov 16, 2022, 11:51 AM

8 points

0 comments 1 min read LW link

Method of statements: an alternative to taboo

Q Home

Nov 16, 2022, 10:57 AM

7 points

0 comments 41 min read LW link

The two conceptions of Active Inference: an intelligence architecture and a theory of agency

Roman Leventov

Nov 16, 2022, 9:30 AM

18 points

0 comments 4 min read LW link

Developer experience for the motivation

Adam Zerner

Nov 16, 2022, 7:12 AM

49 points

7 comments 4 min read LW link

Progress links and tweets, 2022-11-15

jasoncrawford

Nov 16, 2022, 3:21 AM

9 points

0 comments 2 min read LW link

(rootsofprogress.org)

EA & LW Forums Weekly Summary (7th Nov − 13th Nov 22′)

Zoe Williams

Nov 16, 2022, 3:04 AM

19 points

0 comments LW link

The FTX Saga—Simplified

Annapurna

Nov 16, 2022, 2:42 AM

44 points

10 comments 7 min read LW link

(jorgevelez.substack.com)

Utilitarianism and the idea of a “rational agent” are fundamentally inconsistent with reality

banev

Nov 16, 2022, 12:19 AM

−4 points

1 comment 1 min read LW link

[Question] Is the speed of training large models going to increase significantly in the near future due to Cerebras Andromeda?

Amal

Nov 15, 2022, 10:50 PM

13 points

11 comments 1 min read LW link

[Question] What is our current best infohazard policy for AGI (safety) research?

Roman Leventov

Nov 15, 2022, 10:33 PM

12 points

2 comments 1 min read LW link

ACX/SSC Meetup 1 pm Sunday Nov 20

svfritz

Nov 15, 2022, 8:39 PM

2 points

0 comments 1 min read LW link

SBF x LoL

Nicholas / Heather Kross

Nov 15, 2022, 8:24 PM

17 points

6 comments LW link

Some research ideas in forecasting

Jsevillamol

Nov 15, 2022, 7:47 PM

35 points

2 comments LW link

Strategy of Inner Conflict

Jonathan Moregård

Nov 15, 2022, 7:38 PM

9 points

4 comments 6 min read LW link

(honestliving.substack.com)

The limited upside of interpretability

Peter S. Park

Nov 15, 2022, 6:46 PM

13 points

11 comments LW link

Why bet Kelly?

AlexMennen

Nov 15, 2022, 6:12 PM

32 points

14 comments 5 min read LW link

Entropy Scaling And Intrinsic Memory

Alexander Gietelink Oldenziel and Adam Shai

Nov 15, 2022, 6:11 PM

20 points

5 comments 5 min read LW link

[Question] Will nanotech/biotech be what leads to AI doom?

tailcalled

Nov 15, 2022, 5:38 PM

4 points

9 comments 2 min read LW link

Value Formation: An Overarching Model

Thane Ruthenis

Nov 15, 2022, 5:16 PM

34 points

20 comments 34 min read LW link

Internal communication framework

rosehadshar and Nora_Ammann

Nov 15, 2022, 12:41 PM

38 points

13 comments 12 min read LW link

Better Mastodon Aliases

jefftk

Nov 15, 2022, 12:10 PM

14 points

3 comments 1 min read LW link

(www.jefftk.com)

The economy as an analogy for advanced AI systems

rosehadshar and particlemania

Nov 15, 2022, 11:16 AM

28 points

0 comments 5 min read LW link

We need better prediction markets

eigen

Nov 15, 2022, 4:54 AM

9 points

8 comments 1 min read LW link

Preventing, reversing, and addressing data leakage: some thoughts

VipulNaik

Nov 15, 2022, 2:09 AM

14 points

4 comments 25 min read LW link

Winners of the AI Safety Nudge Competition

Marc Carauleanu

Nov 15, 2022, 1:06 AM

4 points

0 comments LW link

Lying to Save Humanity

cebsuvx

Nov 14, 2022, 11:04 PM

−1 points

4 comments 1 min read LW link

Moral contagion heuristic

Mvolz

Nov 14, 2022, 9:17 PM

14 points

3 comments 2 min read LW link

Will we run out of ML data? Evidence from projecting dataset size trends

Pablo Villalobos

Nov 14, 2022, 4:42 PM

75 points

12 comments 2 min read LW link

(epochai.org)

I (with the help of a few more people) am planning to create an introduction to AI Safety that a smart teenager can understand. What am I missing?

Tapatakt

Nov 14, 2022, 4:12 PM

3 points

5 comments 1 min read LW link

Two New Newcomb Variants

eva_

Nov 14, 2022, 2:01 PM

26 points

24 comments 3 min read LW link

Improving Emergency Vehicle Utilization

jefftk

Nov 14, 2022, 2:00 PM

15 points

10 comments 1 min read LW link

(www.jefftk.com)

X-risk Mitigation Does Actually Require Longtermism

DragonGod

Nov 14, 2022, 12:54 PM

6 points

1 comment LW link

[Question] Why don’t we have self driving cars yet?

Linda Linsefors

Nov 14, 2022, 12:19 PM

22 points

16 comments 1 min read LW link

Eigenvalues for Distance from The Buddhist Precepts And The Ten Commandments

benjamin.j.campbell

Nov 14, 2022, 5:50 AM

−3 points

2 comments 1 min read LW link

AI Safety Microgrant Round

Chris_Leong

Nov 14, 2022, 4:25 AM

22 points

1 comment LW link

Estimating the probability that FTX Future Fund grant money gets clawed back

spencerg

Nov 14, 2022, 3:33 AM

28 points

6 comments LW link

Rational overconfidence in the tens of billions: recent example

banev

Nov 13, 2022, 10:48 PM

−20 points

3 comments 2 min read LW link

In Defence of Temporal Discounting in Longtermist Ethics

DragonGod

Nov 13, 2022, 9:54 PM

25 points

4 comments LW link

Announcing Nonlinear Emergency Funding

KatWoods

Nov 13, 2022, 7:02 PM

54 points

0 comments LW link

The Alignment Community Is Culturally Broken

sudo

Nov 13, 2022, 6:53 PM

136 points

68 comments 2 min read LW link

The Futility of Status and Signalling

Ape in the coat

Nov 13, 2022, 5:14 PM

19 points

4 comments 3 min read LW link

A short critique of Vanessa Kosoy’s PreDCA

Martín Soto

Nov 13, 2022, 4:00 PM

28 points

8 comments 4 min read LW link

What’s the Alternative to Independence?

jefftk

Nov 13, 2022, 3:30 PM

50 points

3 comments 1 min read LW link

(www.jefftk.com)

Decision making under model ambiguity, moral uncertainty, and other agents with free will?

Jobst Heitzig

Nov 13, 2022, 12:50 PM

4 points

0 comments 1 min read LW link

(forum.effectivealtruism.org)

The sky is not blue (pardon the obviousness)

banev

Nov 13, 2022, 10:49 AM

−13 points

6 comments 1 min read LW link

Characterizing Intrinsic Compositionality in Transformers with Tree Projections

Ulisse Mini

Nov 13, 2022, 9:46 AM

12 points

2 comments 1 min read LW link

(arxiv.org)

Noting an unsubstantiated communal belief about the FTX disaster

Yitz

Nov 13, 2022, 5:37 AM

50 points

52 comments LW link