D/acc AI Security Salon

Allison Duettmann

19 Oct 2024 22:17 UTC

19 points

0 comments, 1 min read, LW link

Who Should Have Been Killed, and Contains Neato? Who Else Could It Be, but that Villain Magneto!

Ace Delgado

19 Oct 2024 20:39 UTC

−16 points

0 comments, 1 min read, LW link

If far-UV is so great, why isn’t it everywhere?

Austin Chen

19 Oct 2024 18:56 UTC

70 points

23 comments, 1 min read, LW link

(strainhardening.substack.com)

What if AGI was already accidentally created in 2019? [Fictional story]

Alice Wanderland

19 Oct 2024 9:17 UTC

−3 points

2 comments, 15 min read, LW link

(aliceandbobinwanderland.substack.com)

[Question] What actual bad outcome has “ethics-based” RLHF AI Alignment already prevented?

Roko

19 Oct 2024 6:11 UTC

7 points

16 comments, 1 min read, LW link

[Question] What’s a good book for a technically-minded 11-year old?

Martin Sustrik

19 Oct 2024 6:05 UTC

10 points

32 comments, 1 min read, LW link

Methodology: Contagious Beliefs

James Stephen Brown

19 Oct 2024 3:58 UTC

3 points

0 comments, 7 min read, LW link

AI Prejudices: Practical Implications

PeterMcCluskey

19 Oct 2024 2:19 UTC

12 points

0 comments, 5 min read, LW link

(bayesianinvestor.com)

Start an Upper-Room UV Installation Company?

jefftk

19 Oct 2024 2:00 UTC

44 points

9 comments, 1 min read, LW link

(www.jefftk.com)

How I’d like alignment to get done (as of 2024-10-18)

TristanTrim

18 Oct 2024 23:39 UTC

11 points

4 comments, 4 min read, LW link

Sabotage Evaluations for Frontier Models

David Duvenaud, Joe Benton, Sam Bowman, evhub, mishajw, Eric Christiansen, HoldenKarnofsky, Ethan Perez and Buck

18 Oct 2024 22:33 UTC

93 points

55 comments, 6 min read, LW link

(assets.anthropic.com)

D&D Sci Coliseum: Arena of Data

aphyer

18 Oct 2024 22:02 UTC

41 points

23 comments, 4 min read, LW link

the Daydication technique

chaosmage

18 Oct 2024 21:47 UTC

27 points

0 comments, 2 min read, LW link

[Linkpost] Hawkish nationalism vs international AI power and benefit sharing

jakub_krys and Naci Cankaya

18 Oct 2024 18:13 UTC

7 points

5 comments, 1 min read, LW link

(nacicankaya.substack.com)

LLM Psychometrics and Prompt-Induced Psychopathy

Korbinian K.

18 Oct 2024 18:11 UTC

12 points

2 comments, 10 min read, LW link

A short project on Mamba: grokking & interpretability

Alejandro Tlaie

18 Oct 2024 16:59 UTC

21 points

0 comments, 6 min read, LW link

LLMs can learn about themselves by introspection

Felix J Binder and Owain_Evans

18 Oct 2024 16:12 UTC

102 points

38 comments, 9 min read, LW link

[Question] Are there more than 12 paths to Superintelligence?

p4rziv4l

18 Oct 2024 16:05 UTC

−3 points

0 comments, 1 min read, LW link

Low Probability Estimation in Language Models

Gabriel Wu

18 Oct 2024 15:50 UTC

50 points

0 comments, 10 min read, LW link

(www.alignment.org)

The Mysterious Trump Buyers on Polymarket

Annapurna

18 Oct 2024 13:26 UTC

52 points

10 comments, 2 min read, LW link

(jorgevelez.substack.com)

On Intentionality, or: Towards a More Inclusive Concept of Lying

Cornelius Dybdahl

18 Oct 2024 10:37 UTC

8 points

0 comments, 4 min read, LW link

Species as Canonical Referents of Super-Organisms

Yudhister Kumar

18 Oct 2024 7:49 UTC

9 points

8 comments, 2 min read, LW link

(www.yudhister.me)

NAO Updates, Fall 2024

jefftk

18 Oct 2024 0:00 UTC

32 points

2 comments, 1 min read, LW link

(naobservatory.org)

You’re Playing a Rough Game

jefftk

17 Oct 2024 19:20 UTC

25 points

2 comments, 2 min read, LW link

(www.jefftk.com)

P=NP

OnePolynomial

17 Oct 2024 17:56 UTC

−25 points

0 comments, 8 min read, LW link

Factoring P(doom) into a bayesian network

Joseph Gardi

17 Oct 2024 17:55 UTC

1 point

0 comments, 1 min read, LW link

understanding bureaucracy

dhruvmethi

17 Oct 2024 17:55 UTC

1 point

2 comments, 8 min read, LW link

AI #86: Just Think of the Potential

Zvi

17 Oct 2024 15:10 UTC

58 points

8 comments, 57 min read, LW link

(thezvi.wordpress.com)

Concrete benefits of making predictions

Jonny Spicer and Sage Future

17 Oct 2024 14:23 UTC

32 points

5 comments, 6 min read, LW link

(fatebook.io)

Arithmetic is an underrated world-modeling technology

dynomight

17 Oct 2024 14:00 UTC

147 points

32 comments, 6 min read, LW link

(dynomight.net)

The Computational Complexity of Circuit Discovery for Inner Interpretability

Bogdan Ionut Cirstea

17 Oct 2024 13:18 UTC

11 points

2 comments, 1 min read, LW link

(arxiv.org)

[Question] is there a big dictionary somewhere with all your jargon and acronyms and whatnot?

KvmanThinking

17 Oct 2024 11:30 UTC

4 points

7 comments, 1 min read, LW link

[Question] Is there a known method to find others who came across the same potential infohazard without spoiling it to the public?

hive

17 Oct 2024 10:47 UTC

4 points

6 comments, 1 min read, LW link

It is time to start war gaming for AGI

yanni kyriacos

17 Oct 2024 5:14 UTC

4 points

1 comment, 1 min read, LW link

[Question] Reinforcement Learning: Essential Step Towards AGI or Irrelevant?

Double

17 Oct 2024 3:37 UTC

1 point

0 comments, 1 min read, LW link

[Question] EndeavorOTC legit?

FinalFormal2

17 Oct 2024 1:33 UTC

3 points

0 comments, 1 min read, LW link

The Cognitive Bootcamp Agreement

Raemon

16 Oct 2024 23:24 UTC

34 points

0 comments, 8 min read, LW link

Bitter lessons about lucid dreaming

avturchin

16 Oct 2024 21:27 UTC

77 points

62 comments, 2 min read, LW link

Towards Quantitative AI Risk Management

Henry Papadatos and simeon_c

16 Oct 2024 19:26 UTC

28 points

1 comment, 6 min read, LW link

Why Academia is Mostly Not Truth-Seeking

Zero Contradictions

16 Oct 2024 19:14 UTC

−7 points

6 comments, 1 min read, LW link

(thewaywardaxolotl.blogspot.com)

Launching Adjacent News

Lucas Kohorst

16 Oct 2024 17:58 UTC

23 points

0 comments, 4 min read, LW link

[Question] Interest in Leetcode, but for Rationality?

Gregory

16 Oct 2024 17:54 UTC

74 points

20 comments, 2 min read, LW link

Request for advice: Research for Conversational Game Theory for LLMs

Rome Viharo

16 Oct 2024 17:53 UTC

10 points

0 comments, 1 min read, LW link

Why humans won’t control superhuman AIs.

Spiritus Dei

16 Oct 2024 16:48 UTC

−11 points

1 comment, 6 min read, LW link

Against empathy-by-default

Steven Byrnes

16 Oct 2024 16:38 UTC

60 points

24 comments, 7 min read, LW link

cancer rates after gene therapy

bhauth

16 Oct 2024 15:32 UTC

49 points

0 comments, 3 min read, LW link

(bhauth.com)

Monthly Roundup #23: October 2024

Zvi

16 Oct 2024 13:50 UTC

39 points

13 comments, 50 min read, LW link

(thezvi.wordpress.com)

[Question] Change My Mind: Thirders in “Sleeping Beauty” are Just Doing Epistemology Wrong

DragonGod

16 Oct 2024 10:20 UTC

8 points

67 comments, 6 min read, LW link

[Question] After uploading your consciousness...

Jinge Wang

16 Oct 2024 3:52 UTC

−2 points

0 comments, 1 min read, LW link

The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind

Roko

16 Oct 2024 1:24 UTC

10 points

18 comments, 1 min read, LW link

(transhumanaxiology.substack.com)