AI Forecasting: Two Years In

jsteinhardt 19 Aug 2023 23:40 UTC

72 points

15 comments 11 min read LW link

(bounded-regret.ghost.io)

Four management/leadership book summaries

nikola 19 Aug 2023 23:38 UTC

25 points

2 comments 7 min read LW link

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name 19 Aug 2023 19:52 UTC

16 points

2 comments 5 min read LW link

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer 19 Aug 2023 17:29 UTC

58 points

8 comments 1 min read LW link

(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway 19 Aug 2023 16:34 UTC

22 points

34 comments 3 min read LW link

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s 19 Aug 2023 16:24 UTC

4 points

23 comments 1 min read LW link

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau 19 Aug 2023 15:11 UTC

6 points

1 comment 1 min read LW link

[Question] Clarifying how misalignment can arise from scaling LLMs

Util 19 Aug 2023 14:16 UTC

3 points

1 comment 1 min read LW link

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia 19 Aug 2023 6:35 UTC

47 points

32 comments 6 min read LW link

We can do better than DoWhatIMean (inextricably kind AI)

lemonhope 19 Aug 2023 5:41 UTC

25 points

8 comments 2 min read LW link

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

mic, dx26, adamk and Carolyn Qian

19 Aug 2023 2:27 UTC

20 points

2 comments 6 min read LW link

Could fabs own AI?

lemonhope 19 Aug 2023 0:16 UTC

15 points

0 comments 3 min read LW link

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil 18 Aug 2023 22:33 UTC

43 points

0 comments 26 min read LW link

Rationality-ish Meetups Showcase: 2019-2021

jenn 18 Aug 2023 22:22 UTC

10 points

0 comments 5 min read LW link

The U.S. is becoming less stable

lc 18 Aug 2023 21:13 UTC

146 points

68 comments 2 min read LW link

Meetup Tip: Board Games

Screwtape 18 Aug 2023 18:11 UTC

9 points

4 comments 7 min read LW link

[Question] AI labs’ requests for input

Zach Stein-Perlman 18 Aug 2023 17:00 UTC

29 points

0 comments 1 min read LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov 18 Aug 2023 15:46 UTC

145 points

24 comments 4 min read LW link

When discussing AI doom barriers propose specific plausible scenarios

anithite 18 Aug 2023 4:06 UTC

5 points

0 comments 3 min read LW link

Risks from AI Overview: Summary

Dan H, Mantas Mazeika and TW123

18 Aug 2023 1:21 UTC

25 points

1 comment 13 min read LW link

(www.safe.ai)

Managing risks of our own work

Beth Barnes 18 Aug 2023 0:41 UTC

66 points

0 comments 2 min read LW link

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya 18 Aug 2023 0:38 UTC

0 points

0 comments 9 min read LW link

Memetic Judo #1: On Doomsday Prophets v.3

Max TK 18 Aug 2023 0:14 UTC

25 points

17 comments 3 min read LW link

Looking for judges for critiques of Alignment Plans

Iknownothing 17 Aug 2023 22:35 UTC

6 points

0 comments 1 min read LW link

How is ChatGPT’s behavior changing over time?

Phib 17 Aug 2023 20:54 UTC

3 points

0 comments 1 min read LW link

(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford 17 Aug 2023 20:29 UTC

15 points

1 comment 4 min read LW link

(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes 17 Aug 2023 19:11 UTC

33 points

13 comments 4 min read LW link

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea 17 Aug 2023 19:10 UTC

6 points

2 comments 1 min read LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël 17 Aug 2023 18:44 UTC

325 points

87 comments 26 min read LW link

Goldilocks and the Three Optimisers

dkl9 17 Aug 2023 18:15 UTC

−10 points

0 comments 5 min read LW link

(dkl9.net)

Announcing Foresight Institute’s AI Safety Grants Program

Allison Duettmann 17 Aug 2023 17:34 UTC

35 points

2 comments 1 min read LW link

The Negentropy Cliff

mephistopheles 17 Aug 2023 17:08 UTC

6 points

10 comments 1 min read LW link

“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness

FlorianH 17 Aug 2023 15:47 UTC

10 points

6 comments 7 min read LW link

AI #25: Inflection Point

Zvi 17 Aug 2023 14:40 UTC

59 points

9 comments 36 min read LW link

(thezvi.wordpress.com)

[Question] Why might General Intelligences have long term goals?

yrimon 17 Aug 2023 14:10 UTC

3 points

17 comments 1 min read LW link

Understanding Counterbalanced Subtractions for Better Activation Additions

ojorgensen 17 Aug 2023 13:53 UTC

21 points

0 comments 14 min read LW link

Reflections on “Making the Atomic Bomb”

boazbarak 17 Aug 2023 2:48 UTC

51 points

7 comments 8 min read LW link

Autonomous replication and adaptation: an attempt at a concrete danger threshold

Hjalmar_Wijk 17 Aug 2023 1:31 UTC

44 points

0 comments 13 min read LW link

[Question] (Thought experiment) If you had to choose, which would you prefer?

kuira 17 Aug 2023 0:57 UTC

9 points

2 comments 1 min read LW link

Some rules for life (v.0,0)

Neil 17 Aug 2023 0:43 UTC

38 points

13 comments 12 min read LW link

(neilwarren.substack.com)

When AI critique works even with misaligned models

Fabien Roger 17 Aug 2023 0:12 UTC

23 points

0 comments 2 min read LW link

Book Launch: “The Carving of Reality,” Best of LessWrong vol. III

Raemon 16 Aug 2023 23:52 UTC

131 points

22 comments 5 min read LW link

One example of how LLM propaganda attacks can hack the brain

trevor 16 Aug 2023 21:41 UTC

24 points

8 comments 4 min read LW link

If we had known the atmosphere would ignite

Jeffs 16 Aug 2023 20:28 UTC

56 points

63 comments 2 min read LW link

Stampy’s AI Safety Info—New Distillations #4 [July 2023]

markov 16 Aug 2023 19:03 UTC

22 points

10 comments 1 min read LW link

(aisafety.info)

A Proof of Löb’s Theorem using Computability Theory

jessicata 16 Aug 2023 18:57 UTC

71 points

0 comments 17 min read LW link

(unstableontology.com)

Summary of and Thoughts on the Hotz/Yudkowsky Debate

Zvi 16 Aug 2023 16:50 UTC

105 points

47 comments 9 min read LW link

(thezvi.wordpress.com)

Red Pill vs Blue Pill, Bayes style

ErickBall 16 Aug 2023 15:23 UTC

28 points

33 comments 1 min read LW link

What does it mean to “trust science”?

jasoncrawford 16 Aug 2023 14:56 UTC

34 points

9 comments 1 min read LW link

(rootsofprogress.org)

Jason Crawford / The Roots of Progress in Bangalore, August 21 to September 8

jasoncrawford 16 Aug 2023 13:36 UTC

13 points

1 comment 1 min read LW link

(rootsofprogress.org)