[Question] If alignment problem was unsolvable, would that avoid doom?

Kinrany · 7 May 2023 22:13 UTC
3 points
3 comments · 1 min read · LW link

An artificially structured argument for expecting AGI ruin

Rob Bensinger · 7 May 2023 21:52 UTC
91 points
26 comments · 19 min read · LW link

Where “the Sequences” Are Wrong

Thoth Hermes · 7 May 2023 20:21 UTC
−15 points
5 comments · 14 min read · LW link
(thothhermes.substack.com)

What’s wrong with being dumb?

Adam Zerner · 7 May 2023 18:31 UTC
14 points
17 comments · 2 min read · LW link

Categories of Arguing Style: Why being good among rationalists isn’t enough to argue with everyone

Camille Berger · 7 May 2023 17:45 UTC
16 points
0 comments · 23 min read · LW link

Self-Administered Gell-Mann Amnesia

krs · 7 May 2023 17:44 UTC
1 point
1 comment · 1 min read · LW link

Understanding mesa-optimization using toy models

7 May 2023 17:00 UTC
43 points
2 comments · 10 min read · LW link

How to have Polygenically Screened Children

GeneSmith · 7 May 2023 16:01 UTC
354 points
127 comments · 27 min read · LW link

Statistical models & the irrelevance of rare exceptions

patrissimo · 7 May 2023 15:59 UTC
37 points
6 comments · 2 min read · LW link

Let’s look for coherence theorems

Valdes · 7 May 2023 14:45 UTC
25 points
18 comments · 6 min read · LW link

Graphical Representations of Paul Christiano’s Doom Model

Nathan Young · 7 May 2023 13:03 UTC
7 points
0 comments · 1 min read · LW link

An anthropomorphic AI dilemma

TsviBT · 7 May 2023 12:44 UTC
26 points
0 comments · 7 min read · LW link

Violin Supports

jefftk · 7 May 2023 12:10 UTC
12 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Properties of Good Textbooks

niplav · 7 May 2023 8:38 UTC
50 points
11 comments · 1 min read · LW link

Against sacrificing AI transparency for generality gains

Ape in the coat · 7 May 2023 6:52 UTC
4 points
0 comments · 2 min read · LW link

TED talk by Eliezer Yudkowsky: Unleashing the Power of Artificial Intelligence

bayesed · 7 May 2023 5:45 UTC
49 points
36 comments · 1 min read · LW link
(www.youtube.com)

Thinking of Convenience as an Economic Term

ozziegooen · 7 May 2023 1:21 UTC
6 points
0 comments · 12 min read · LW link
(forum.effectivealtruism.org)

Corrigibility, Much more detail than anyone wants to Read

Logan Zoellner · 7 May 2023 1:02 UTC
26 points
2 comments · 7 min read · LW link

Residual stream norms grow exponentially over the forward pass

7 May 2023 0:46 UTC
76 points
24 comments · 11 min read · LW link

On the Loebner Silver Prize (a Turing test)

hold_my_fish · 7 May 2023 0:39 UTC
18 points
2 comments · 2 min read · LW link

Time and Energy Costs to Erase a Bit

DaemonicSigil · 6 May 2023 23:29 UTC
24 points
32 comments · 7 min read · LW link

How much do you believe your results?

Eric Neyman · 6 May 2023 20:31 UTC
476 points
17 comments · 15 min read · LW link · 3 reviews
(ericneyman.wordpress.com)

Long Covid Risks: 2023 Update

Elizabeth · 6 May 2023 18:20 UTC
70 points
9 comments · 4 min read · LW link
(acesounderglass.com)

Is “red” for GPT-4 the same as “red” for you?

Yusuke Hayashi · 6 May 2023 17:55 UTC
9 points
6 comments · 2 min read · LW link

The Broader Fossil Fuel Community

Jeffrey Heninger · 6 May 2023 14:49 UTC
16 points
1 comment · 3 min read · LW link

Estimating Norovirus Prevalence

jefftk · 6 May 2023 11:40 UTC
16 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Alignment as Function Fitting

A.H. · 6 May 2023 11:38 UTC
7 points
0 comments · 12 min read · LW link

My preferred framings for reward misspecification and goal misgeneralisation

Yi-Yang · 6 May 2023 4:48 UTC
27 points
1 comment · 8 min read · LW link

You don’t need to be a genius to be in AI safety research

Claire Short · 6 May 2023 2:32 UTC
14 points
1 comment · 6 min read · LW link

Naturalist Collection

LoganStrohl · 6 May 2023 0:37 UTC
66 points
7 comments · 15 min read · LW link

Do you work at an AI lab? Please quit

Nik Samoylov · 5 May 2023 23:41 UTC
−29 points
9 comments · 1 min read · LW link

Explaining “Hell is Game Theory Folk Theorems”

electroswing · 5 May 2023 23:33 UTC
57 points
21 comments · 5 min read · LW link

Sleeping Beauty – the Death Hypothesis

Guillaume Charrier · 5 May 2023 23:32 UTC
6 points
8 comments · 5 min read · LW link

Orthogonal’s Formal-Goal Alignment theory of change

Tamsin Leake · 5 May 2023 22:36 UTC
68 points
13 comments · 4 min read · LW link
(carado.moe)

A smart enough LLM might be deadly simply if you run it for long enough

Mikhail Samin · 5 May 2023 20:49 UTC
19 points
16 comments · 8 min read · LW link

What Jason has been reading, May 2023: “Protopia,” complex systems, Daedalus vs. Icarus, and more

jasoncrawford · 5 May 2023 19:54 UTC
25 points
2 comments · 11 min read · LW link
(rootsofprogress.org)

CHAT Diplomacy: LLMs and National Security

JohnBuridan · 5 May 2023 19:45 UTC
25 points
6 comments · 7 min read · LW link

Linkpost for Accursed Farms Discussion / debate with AI expert Eliezer Yudkowsky

gilch · 5 May 2023 18:20 UTC
14 points
2 comments · 1 min read · LW link
(www.youtube.com)

Regulate or Compete? The China Factor in U.S. AI Policy (NAIR #2)

charles_m · 5 May 2023 17:43 UTC
2 points
1 comment · 7 min read · LW link
(navigatingairisks.substack.com)

Kingfisher Live CD Process

jefftk · 5 May 2023 17:00 UTC
13 points
0 comments · 3 min read · LW link
(www.jefftk.com)

What can we learn from Bayes about reasoning?

jasoncrawford · 5 May 2023 15:52 UTC
21 points
11 comments · 1 min read · LW link

[Question] Why not use active SETI to prevent AI Doom?

RomanS · 5 May 2023 14:41 UTC
13 points
13 comments · 1 min read · LW link

Investigating Emergent Goal-Like Behavior in Large Language Models using Experimental Economics

phelps-sg · 5 May 2023 11:15 UTC
6 points
1 comment · 4 min read · LW link

Monthly Shorts 4/23

Celer · 5 May 2023 7:20 UTC
8 points
1 comment · 3 min read · LW link
(keller.substack.com)

[Question] What is it like to be a compatibilist?

tslarm · 5 May 2023 2:56 UTC
8 points
72 comments · 1 min read · LW link

Transcript of a presentation on catastrophic risks from AI

RobertM · 5 May 2023 1:38 UTC
6 points
0 comments · 8 min read · LW link

How to get good at programming

Ulisse Mini · 5 May 2023 1:14 UTC
39 points
3 comments · 2 min read · LW link

An Update On The Campaign For AI Safety Dot Org

yanni kyriacos · 5 May 2023 0:21 UTC
−13 points
2 comments · 1 min read · LW link

A brief collection of Hinton’s recent comments on AGI risk

Kaj_Sotala · 4 May 2023 23:31 UTC
143 points
9 comments · 11 min read · LW link

Robin Hanson and I talk about AI risk

KatjaGrace · 4 May 2023 22:20 UTC
39 points
8 comments · 1 min read · LW link
(worldspiritsockpuppet.com)