[Question] What vegan food resources have you found useful?

Elizabeth · 25 May 2023 22:46 UTC
29 points
6 comments · 1 min read

Mob and Bailey

Screwtape · 25 May 2023 22:14 UTC
78 points
16 comments · 7 min read

Look At What’s In Front Of You (Conclusion to The Nuts and Bolts of Naturalism)

LoganStrohl · 25 May 2023 19:00 UTC
50 points
1 comment · 2 min read

[Market] Will AI xrisk seem to be handled seriously by the end of 2026?

tailcalled · 25 May 2023 18:51 UTC
15 points
2 comments · 1 min read
(manifold.markets)

[Question] What should my college major be if I want to do AI alignment research?

metachirality · 25 May 2023 18:23 UTC
8 points
7 comments · 1 min read

Is behavioral safety “solved” in non-adversarial conditions?

Robert_AIZI · 25 May 2023 17:56 UTC
26 points
8 comments · 2 min read
(aizi.substack.com)

Book Review: How Minds Change

bc4026bd4aaa5b7fe · 25 May 2023 17:55 UTC
310 points
52 comments · 15 min read

Self-administered EMDR without a therapist is very useful for a lot of things!

EternallyBlissful · 25 May 2023 17:54 UTC
49 points
12 comments · 11 min read

RecurrentGPT: a loom-type tool with a twist

mishka · 25 May 2023 17:09 UTC
10 points
0 comments · 3 min read
(arxiv.org)

The Genie in the Bottle: An Introduction to AI Alignment and Risk

Snorkelfarsan · 25 May 2023 16:30 UTC
5 points
1 comment · 25 min read

AI #13: Potential Algorithmic Improvements

Zvi · 25 May 2023 15:40 UTC
45 points
4 comments · 67 min read
(thezvi.wordpress.com)

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2

25 May 2023 15:37 UTC
71 points
1 comment · 13 min read

Malthusian Competition (not as bad as it seems)

Logan Zoellner · 25 May 2023 15:30 UTC
6 points
11 comments · 2 min read

You Don’t Always Need Indexes

jefftk · 25 May 2023 14:20 UTC
22 points
6 comments · 1 min read
(www.jefftk.com)

Theories of Biological Inspiration

Eric Zhang · 25 May 2023 13:07 UTC
7 points
3 comments · 1 min read

Evaluating strategic reasoning in GPT models

phelps-sg · 25 May 2023 11:51 UTC
4 points
1 comment · 8 min read

Requirements for a STEM-capable AGI Value Learner (my Case for Less Doom)

RogerDearnaley · 25 May 2023 9:26 UTC
33 points
4 comments · 15 min read

Alignment solutions for weak AI don’t (necessarily) scale to strong AI

Michael Tontchev · 25 May 2023 8:26 UTC
6 points
0 comments · 5 min read

[Question] What features would you like to see in a personal forecasting / prediction tracking app?

regnarg · 25 May 2023 8:18 UTC
9 points
0 comments · 1 min read

Announcing the Confido app: bringing forecasting to everyone

regnarg · 25 May 2023 8:18 UTC
6 points
2 comments · 10 min read
(forum.effectivealtruism.org)

But What If We Actually Want To Maximize Paperclips?

snerx · 25 May 2023 7:13 UTC
−17 points
6 comments · 7 min read

Exploiting Newcomb’s Game Show

carterallen · 25 May 2023 4:01 UTC
8 points
2 comments · 2 min read

DeepMind: Model evaluation for extreme risks

Zach Stein-Perlman · 25 May 2023 3:00 UTC
94 points
12 comments · 1 min read · 1 review
(arxiv.org)

Why I’m Not (Yet) A Full-Time Technical Alignment Researcher

Nicholas / Heather Kross · 25 May 2023 1:26 UTC
39 points
21 comments · 4 min read
(www.thinkingmuchbetter.com)

Two ideas for alignment, perpetual mutual distrust and induction

APaleBlueDot · 25 May 2023 0:56 UTC
1 point
2 comments · 4 min read

Evaluating Evidence Reconstructions of Mock Crimes - Submission 2

Alan E Dunne · 24 May 2023 22:17 UTC
−1 points
1 comment · 3 min read

[Linkpost] Interpretability Dreams

DanielFilan · 24 May 2023 21:08 UTC
39 points
2 comments · 2 min read
(transformer-circuits.pub)

Rishi Sunak mentions “existential threats” in talk with OpenAI, DeepMind, Anthropic CEOs

24 May 2023 21:06 UTC
34 points
1 comment · 1 min read
(www.gov.uk)

If you’re not a morning person, consider quitting allergy pills

Brendan Long · 24 May 2023 20:11 UTC
8 points
3 comments · 1 min read

Adumbrations on AGI from an outsider

nicholashalden · 24 May 2023 17:41 UTC
57 points
44 comments · 8 min read
(nicholashalden.home.blog)

Open Thread With Experimental Feature: Reactions

jimrandomh · 24 May 2023 16:46 UTC
101 points
189 comments · 3 min read

A rejection of the Orthogonality Thesis

ArisC · 24 May 2023 16:37 UTC
−2 points
11 comments · 2 min read
(medium.com)

Aligned AI via monitoring objectives in AutoGPT-like systems

Paul Colognese · 24 May 2023 15:59 UTC
27 points
4 comments · 4 min read

The Office of Science and Technology Policy put out a request for information on A.I.

HiroSakuraba · 24 May 2023 13:33 UTC
59 points
4 comments · 1 min read
(www.whitehouse.gov)

ChatGPT (May 2023) on Designing Friendly Superintelligence

Mitchell_Porter · 24 May 2023 10:47 UTC
5 points
0 comments · 1 min read
(singularitypolitics.wordpress.com)

No—AI is just as energy-efficient as your brain.

Maxwell Clarke · 24 May 2023 2:30 UTC
11 points
7 comments · 1 min read

[Question] What projects and efforts are there to promote AI safety research?

Christopher King · 24 May 2023 0:33 UTC
4 points
0 comments · 1 min read

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch · 24 May 2023 0:02 UTC
268 points
39 comments · 8 min read

AI Safety Newsletter #7: Disinformation, Governance Recommendations for AI labs, and Senate Hearings on AI

23 May 2023 21:47 UTC
25 points
0 comments · 6 min read
(newsletter.safe.ai)

The Polarity Problem [Draft]

23 May 2023 21:05 UTC
24 points
3 comments · 44 min read

Progress links and tweets, 2023-05-23

jasoncrawford · 23 May 2023 20:15 UTC
16 points
0 comments · 1 min read
(rootsofprogress.org)

Yoshua Bengio: How Rogue AIs may Arise

harfe · 23 May 2023 18:28 UTC
92 points
12 comments · 18 min read
(yoshuabengio.org)

‘Fundamental’ vs ‘applied’ mechanistic interpretability research

Lee Sharkey · 23 May 2023 18:26 UTC
65 points
6 comments · 3 min read

Coercion is an adaptation to scarcity; trust is an adaptation to abundance

Richard_Ngo · 23 May 2023 18:14 UTC
90 points
11 comments · 4 min read

[Question] Is “brittle alignment” good enough?

the8thbit · 23 May 2023 17:35 UTC
9 points
5 comments · 3 min read

Will Artificial Superintelligence Kill Us?

James_Miller · 23 May 2023 16:27 UTC
33 points
2 comments · 22 min read

Phone Number Jingle

jefftk · 23 May 2023 15:20 UTC
11 points
12 comments · 1 min read
(www.jefftk.com)

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanS · 23 May 2023 13:41 UTC
22 points
28 comments · 65 min read

[Question] Do humans still provide value in correspondence chess?

Jonathan Paulson · 23 May 2023 12:15 UTC
24 points
16 comments · 1 min read

[Linkpost] The AGI Show podcast

Soroush Pour · 23 May 2023 9:52 UTC
4 points
0 comments · 1 min read