Causality and a Cost Semantics for Neural Networks

scottviteri 21 Aug 2023 21:02 UTC

22 points

1 comment 1 min read LW link

Ideas for improving epistemics in AI safety outreach

mic 21 Aug 2023 19:55 UTC

64 points

6 comments 3 min read LW link

Rice’s Theorem says that AIs can’t determine much from studying AI source code

Michael Weiss-Malik 21 Aug 2023 19:05 UTC

−12 points

4 comments 1 min read LW link

Large Language Models will be Great for Censorship

Ethan Edwards 21 Aug 2023 19:03 UTC

183 points

14 comments 8 min read LW link

(ethanedwards.substack.com)

“Throwing Exceptions” Is A Strange Programming Pattern

Thoth Hermes 21 Aug 2023 18:50 UTC

−2 points

13 comments 6 min read LW link

(thothhermes.substack.com)

[Question] Which possible AI systems are relatively safe?

Zach Stein-Perlman 21 Aug 2023 17:00 UTC

42 points

20 comments 1 min read LW link

Self-shutdown AI

jan betley 21 Aug 2023 16:48 UTC

13 points

2 comments 2 min read LW link

Contextual Translations—Attempt 1

Varshul Gupta 21 Aug 2023 14:30 UTC

−1 points

0 comments 2 min read LW link

(dubverseblack.substack.com)

DIY Deliberate Practice

lynettebye 21 Aug 2023 12:22 UTC

62 points

4 comments 5 min read LW link

(lynettebye.com)

Downstairs Opening: 2br Apartment

jefftk 21 Aug 2023 0:50 UTC

8 points

2 comments 3 min read LW link

(www.jefftk.com)

Efficiency and resource use scaling parity

Ege Erdil 21 Aug 2023 0:18 UTC

51 points

1 comment 20 min read LW link 1 review

Ruining an expected-log-money maximizer

philh 20 Aug 2023 21:20 UTC

31 points

33 comments 1 min read LW link 1 review

(reasonableapproximation.net)

Steven Wolfram on AI Alignment

Bill Benzon 20 Aug 2023 19:49 UTC

66 points

15 comments 4 min read LW link

[Question] What value does personal prediction tracking have?

fx 20 Aug 2023 18:43 UTC

7 points

3 comments 1 min read LW link

Jan Kulveit’s Corrigibility Thoughts Distilled

brook 20 Aug 2023 17:52 UTC

20 points

1 comment 5 min read LW link

Memetic Judo #3: The Intelligence of Stochastic Parrots v.2

Max TK 20 Aug 2023 15:18 UTC

8 points

33 comments 6 min read LW link

ACX/SSC Boulder meetup- September 23

Josh Sacks 20 Aug 2023 14:16 UTC

1 point

4 comments 1 min read LW link

“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them

Nora_Ammann and peckzy

20 Aug 2023 9:13 UTC

65 points

4 comments 3 min read LW link

Call for Papers on Global AI Governance from the UN

Chris_Leong 20 Aug 2023 8:56 UTC

19 points

0 comments 1 min read LW link

(www.linkedin.com)

How do I read things on the internet

Vlad Sitalo 20 Aug 2023 5:43 UTC

16 points

2 comments 8 min read LW link

(vlad.roam.garden)

AI Forecasting: Two Years In

jsteinhardt 19 Aug 2023 23:40 UTC

72 points

15 comments 11 min read LW link

(bounded-regret.ghost.io)

Four management/leadership book summaries

nikola 19 Aug 2023 23:38 UTC

25 points

2 comments 7 min read LW link

Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices

Joseph Van Name 19 Aug 2023 19:52 UTC

16 points

2 comments 5 min read LW link

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer 19 Aug 2023 17:29 UTC

58 points

8 comments 1 min read LW link

(youtu.be)

Ten variations on red-pill-blue-pill

Richard_Kennaway 19 Aug 2023 16:34 UTC

22 points

34 comments 3 min read LW link

Are we running out of new music/movies/art from a metaphysical perspective? (updated)

stephen_s 19 Aug 2023 16:24 UTC

4 points

23 comments 1 min read LW link

[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”?

Ppau 19 Aug 2023 15:11 UTC

6 points

1 comment 1 min read LW link

[Question] Clarifying how misalignment can arise from scaling LLMs

Util 19 Aug 2023 14:16 UTC

3 points

1 comment 1 min read LW link

Chess as a case study in hidden capabilities in ChatGPT

AdamYedidia 19 Aug 2023 6:35 UTC

47 points

32 comments 6 min read LW link

We can do better than DoWhatIMean (inextricably kind AI)

lemonhope 19 Aug 2023 5:41 UTC

25 points

8 comments 2 min read LW link

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

mic, dx26, adamk and Carolyn Qian

19 Aug 2023 2:27 UTC

20 points

2 comments 6 min read LW link

Could fabs own AI?

lemonhope 19 Aug 2023 0:16 UTC

15 points

0 comments 3 min read LW link

Is Chinese total factor productivity lower today than it was in 1956?

Ege Erdil 18 Aug 2023 22:33 UTC

43 points

0 comments 26 min read LW link

Rationality-ish Meetups Showcase: 2019-2021

jenn 18 Aug 2023 22:22 UTC

10 points

0 comments 5 min read LW link

The U.S. is becoming less stable

lc 18 Aug 2023 21:13 UTC

146 points

68 comments 2 min read LW link

Meetup Tip: Board Games

Screwtape 18 Aug 2023 18:11 UTC

9 points

4 comments 7 min read LW link

[Question] AI labs’ requests for input

Zach Stein-Perlman 18 Aug 2023 17:00 UTC

29 points

0 comments 1 min read LW link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov 18 Aug 2023 15:46 UTC

145 points

24 comments 4 min read LW link

When discussing AI doom barriers propose specific plausible scenarios

anithite 18 Aug 2023 4:06 UTC

5 points

0 comments 3 min read LW link

Risks from AI Overview: Summary

Dan H, Mantas Mazeika and TW123

18 Aug 2023 1:21 UTC

25 points

1 comment 13 min read LW link

(www.safe.ai)

Managing risks of our own work

Beth Barnes 18 Aug 2023 0:41 UTC

66 points

0 comments 2 min read LW link

ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems

Akira Pyinya 18 Aug 2023 0:38 UTC

0 points

0 comments 9 min read LW link

Memetic Judo #1: On Doomsday Prophets v.3

Max TK 18 Aug 2023 0:14 UTC

25 points

17 comments 3 min read LW link

Looking for judges for critiques of Alignment Plans

Iknownothing 17 Aug 2023 22:35 UTC

6 points

0 comments 1 min read LW link

How is ChatGPT’s behavior changing over time?

Phib 17 Aug 2023 20:54 UTC

3 points

0 comments 1 min read LW link

(arxiv.org)

Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets

jasoncrawford 17 Aug 2023 20:29 UTC

15 points

1 comment 4 min read LW link

(rootsofprogress.org)

Model of psychosis, take 2

Steven Byrnes 17 Aug 2023 19:11 UTC

33 points

13 comments 4 min read LW link

[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts

Bogdan Ionut Cirstea 17 Aug 2023 19:10 UTC

6 points

2 comments 1 min read LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël 17 Aug 2023 18:44 UTC

325 points

87 comments 26 min read LW link

Goldilocks and the Three Optimisers

dkl9 17 Aug 2023 18:15 UTC

−10 points

0 comments 5 min read LW link

(dkl9.net)