The Army of Jakoths (a parable)

MikkW 21 May 2023 22:48 UTC

−6 points

0 comments 1 min read LW link

A&I (Rihanna ‘S&M’ parody lyrics)

nahoj 21 May 2023 22:34 UTC

−2 points

0 comments 2 min read LW link

Four Battlegrounds: Power in the Age of Artificial Intelligence (Book review)

PeterMcCluskey 21 May 2023 21:19 UTC

25 points

0 comments 4 min read LW link

(bayesianinvestor.com)

Gender Vectors in ROME’s Latent Space

Xodarap 21 May 2023 18:46 UTC

14 points

2 comments 3 min read LW link

Weight by Impact

Vaniver 21 May 2023 14:37 UTC

29 points

1 comment 3 min read LW link

[outdated] My current theory of change to mitigate existential risk by misaligned ASI

mesaoptimizer 21 May 2023 13:46 UTC

32 points

8 comments 6 min read LW link

(mesaoptimizer.com)

Babble on growing trust

qbolec 21 May 2023 13:19 UTC

13 points

1 comment 5 min read LW link

Elevator Positioning

jefftk 21 May 2023 11:30 UTC

15 points

1 comment 1 min read LW link

(www.jefftk.com)

Transformer Architecture Choice for Resisting Prompt Injection and Jail-Breaking Attacks

RogerDearnaley 21 May 2023 8:29 UTC

9 points

1 comment 4 min read LW link

Jeff Clune advertising a postdoc on twitter...and asking where he should target his posts

Joyee Chen 21 May 2023 1:02 UTC

4 points

0 comments 1 min read LW link

Running Sound for Yourself

jefftk 20 May 2023 22:10 UTC

11 points

0 comments 2 min read LW link

(www.jefftk.com)

Job Opening: SWE to help build signature vetting system for AI-related petitions

Ethan Ashkie and Andrew_Critch

20 May 2023 19:02 UTC

52 points

0 comments 1 min read LW link

My Kind of Pragmatism

Nora Belrose 20 May 2023 18:58 UTC

37 points

11 comments 3 min read LW link

Colors Appear To Have Almost-Universal Symbolic Associations

Thoth Hermes 20 May 2023 18:40 UTC

−33 points

4 comments 7 min read LW link

(thothhermes.substack.com)

Twiblings, four-parent babies and other reproductive technology

GeneSmith 20 May 2023 17:11 UTC

191 points

33 comments 6 min read LW link

P-zombies, Compression and the Simulation Hypothesis

RussellThor 20 May 2023 11:36 UTC

5 points

0 comments 5 min read LW link

The possible shared Craft of deliberate Lexicogenesis

TsviBT 20 May 2023 5:56 UTC

49 points

5 comments 5 min read LW link

Buying Tall-Poppy-Cutting Offsets

trevor 20 May 2023 3:59 UTC

23 points

4 comments 2 min read LW link

(www.overcomingbias.com)

Seeing Ghosts by GPT-4

Christopher King 20 May 2023 0:11 UTC

−13 points

0 comments 1 min read LW link

[Question] What’s the best way to streamline two-party sale negotiations between real humans?

Isaac King 19 May 2023 23:30 UTC

15 points

21 comments 1 min read LW link

Trust develops gradually via making bids and setting boundaries

Richard_Ngo 19 May 2023 22:16 UTC

134 points

12 comments 4 min read LW link

Confusions and updates on STEM AI

Eleni Angelou 19 May 2023 21:34 UTC

23 points

0 comments 3 min read LW link

GPT as an “Intelligence Forklift.”

boazbarak 19 May 2023 21:15 UTC

48 points

27 comments 3 min read LW link

Idea: medical hypotheses app for mysterious chronic illnesses

riceissa 19 May 2023 20:49 UTC

64 points

8 comments 3 min read LW link

A flaw in the A.G.I. Ruin Argument

Cole Wyeth 19 May 2023 19:40 UTC

1 point

7 comments 3 min read LW link

(colewyeth.com)

We are misaligned: the saddening idea that most of humanity doesn’t intrinsically care about x-risk, even on a personal level

Christopher King 19 May 2023 16:12 UTC

3 points

5 comments 2 min read LW link

Do Deadlines Make Us Less Creative?

lynettebye 19 May 2023 15:41 UTC

44 points

6 comments 4 min read LW link

Two Axes of Contra Bands

jefftk 19 May 2023 14:20 UTC

2 points

0 comments 1 min read LW link

(www.jefftk.com)

Is Effective Volunteering Possible?

David Bravo 19 May 2023 12:41 UTC

13 points

2 comments 9 min read LW link

Mr. Meeseeks as an AI capability tripwire

Eric Zhang 19 May 2023 11:33 UTC

37 points

17 comments 2 min read LW link

The Compleat Cybornaut

ukc10014, Jozdien and NicholasKees

19 May 2023 8:44 UTC

65 points

2 comments 16 min read LW link

[Question] What if we’re not the first AI-capable civilization on Earth?

RomanS 19 May 2023 7:50 UTC

−14 points

8 comments 1 min read LW link

Resolving internal conflicts requires listening to what parts want

Richard_Ngo 19 May 2023 0:04 UTC

64 points

0 comments 4 min read LW link

[Question] How could I measure the nootropic benefits testosterone injections may have?

shapeshifter 18 May 2023 21:40 UTC

10 points

3 comments 1 min read LW link

Investigating Fabrication

LoganStrohl 18 May 2023 17:46 UTC

112 points

14 comments 16 min read LW link

Microsoft and Google using LLMs for Cybersecurity

Phosphorous 18 May 2023 17:42 UTC

6 points

0 comments 5 min read LW link

The Benevolent Billionaire (a plagiarized problem)

Ivan Ordonez 18 May 2023 17:39 UTC

8 points

11 comments 4 min read LW link

Notes from the LSE Talk by Raghuram Rajan on Central Bank Balance Sheet Expansions

PixelatedPenguin 18 May 2023 17:34 UTC

1 point

0 comments 2 min read LW link

We Shouldn’t Expect AI to Ever be Fully Rational

OneManyNone 18 May 2023 17:09 UTC

19 points

31 comments 6 min read LW link

Relative Value Functions: A Flexible New Format for Value Estimation

ozziegooen 18 May 2023 16:39 UTC

20 points

0 comments 1 min read LW link

Some background for reasoning about dual-use alignment research

Charlie Steiner 18 May 2023 14:50 UTC

126 points

22 comments 9 min read LW link 1 review

The Unexpected Clanging

Chris_Leong 18 May 2023 14:47 UTC

14 points

22 comments 2 min read LW link

AI #12: The Quest for Sane Regulations

Zvi 18 May 2023 13:20 UTC

77 points

12 comments 64 min read LW link

(thezvi.wordpress.com)

[Crosspost] A recent write-up of the case for AI (existential) risk

Timsey 18 May 2023 13:13 UTC

6 points

0 comments 19 min read LW link

Deontological Norms are Unimportant

omnizoid 18 May 2023 9:33 UTC

−15 points

8 comments 10 min read LW link

Collective Identity

NicholasKees, ukc10014 and Garrett Baker

18 May 2023 9:00 UTC

59 points

12 comments 8 min read LW link

Activation additions in a simple MNIST network

Garrett Baker 18 May 2023 2:49 UTC

26 points

0 comments 2 min read LW link

[Question] What are the limits of the weak man?

ymeskhout 18 May 2023 0:50 UTC

9 points

2 comments 4 min read LW link

What Yann LeCun gets wrong about aligning AI (video)

blake8086 18 May 2023 0:02 UTC

0 points

0 comments 1 min read LW link

(www.youtube.com)

Let’s use AI to harden human defenses against AI manipulation

Tom Davidson 17 May 2023 23:33 UTC

35 points

7 comments 24 min read LW link