18 Dec 2023 20:35 UTC

168 points

21 comments · 12 min read · LW link

The Shortest Path Between Scylla and Charybdis

Thane Ruthenis

18 Dec 2023 20:08 UTC

50 points

8 comments · 5 min read · LW link

OpenAI: Preparedness framework

Zach Stein-Perlman

18 Dec 2023 18:30 UTC

70 points

23 comments · 4 min read · LW link

(openai.com)

[Valence series] 5. “Valence Disorders” in Mental Health & Personality

Steven Byrnes

18 Dec 2023 15:26 UTC

43 points

12 comments · 13 min read · LW link

Discussion: Challenges with Unsupervised LLM Knowledge Discovery

Seb Farquhar, Vikrant Varma, zac_kenton, gasteigerjo, Vlad Mikulik and Rohin Shah

18 Dec 2023 11:58 UTC

147 points

21 comments · 10 min read · LW link

Interpreting the Learning of Deceit

RogerDearnaley

18 Dec 2023 8:12 UTC

30 points

14 comments · 9 min read · LW link

Talk: “AI Would Be A Lot Less Alarming If We Understood Agents”

johnswentworth

17 Dec 2023 23:46 UTC

58 points

3 comments · 1 min read · LW link

(www.youtube.com)

∀: a story

Richard_Ngo

17 Dec 2023 22:42 UTC

37 points

1 comment · 8 min read · LW link

(www.narrativeark.xyz)

Reviving a 2015 MacBook

jefftk

17 Dec 2023 21:00 UTC

11 points

0 comments · 1 min read · LW link

(www.jefftk.com)

A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans

Thane Ruthenis

17 Dec 2023 20:28 UTC

29 points

7 comments · 11 min read · LW link

The Limits of Artificial Consciousness: A Biology-Based Critique of Chalmers’ Fading Qualia Argument

Štěpán Los

17 Dec 2023 19:11 UTC

−6 points

9 comments · 17 min read · LW link

What makes teaching math special

Viliam

17 Dec 2023 14:15 UTC

41 points

27 comments · 11 min read · LW link

The predictive power of dissipative adaptation

dr_s

17 Dec 2023 14:01 UTC

56 points

14 comments · 19 min read · LW link

Linkpost: Francesca v Harvard

Linch

17 Dec 2023 6:18 UTC

5 points

5 comments · 2 min read · LW link

(www.francesca-v-harvard.org)

Lessons from massaging myself, others, dogs, and cats

Chipmonk

17 Dec 2023 4:28 UTC

2 points

27 comments · 5 min read · LW link

(chipmonk.blog)

The Serendipity of Density

jefftk

17 Dec 2023 3:50 UTC

40 points

4 comments · 1 min read · LW link

(www.jefftk.com)

Bounty: Diverse hard tasks for LLM agents

Beth Barnes and Megan Kinniment

17 Dec 2023 1:04 UTC

49 points

31 comments · 16 min read · LW link

2022 (and All Time) Posts by Pingback Count

Raemon

16 Dec 2023 21:17 UTC

53 points

14 comments · 6 min read · LW link

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis

16 Dec 2023 20:08 UTC

189 points

34 comments · 5 min read · LW link

A visual analogy for text generation by LLMs?

Bill Benzon

16 Dec 2023 17:58 UTC

3 points

0 comments · 1 min read · LW link

Upgrading the AI Safety Community

trevor and Nicholas / Heather Kross

16 Dec 2023 15:34 UTC

42 points

9 comments · 42 min read · LW link

cold aluminum for medicine

bhauth

16 Dec 2023 14:38 UTC

42 points

4 comments · 4 min read · LW link

(www.bhauth.com)

Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem

Ansh Radhakrishnan, Buck, ryan_greenblatt and Fabien Roger

16 Dec 2023 5:49 UTC

73 points

3 comments · 6 min read · LW link

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

leogao

16 Dec 2023 5:39 UTC

55 points

5 comments · 1 min read · LW link

Pope Francis shares thoughts on responsible AI development

corruptedCatapillar

16 Dec 2023 3:49 UTC

15 points

4 comments · 1 min read · LW link

(www.vatican.va)

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Thane Ruthenis

15 Dec 2023 20:16 UTC

124 points

157 comments · 8 min read · LW link · 1 review

Agglomeration of ‘Ought’

DavidAndresBloom

15 Dec 2023 19:07 UTC

1 point

1 comment · 11 min read · LW link

Predicting the future with the power of the Internet (and pissing off Rob Miles)

Writer

15 Dec 2023 17:37 UTC

23 points

9 comments · 4 min read · LW link

(youtu.be)

Progress links digest, 2023-12-15: Vitalik on d/acc, $100M+ in prizes, and more

jasoncrawford

15 Dec 2023 15:52 UTC

20 points

0 comments · 12 min read · LW link

(rootsofprogress.org)

“AI Alignment” is a Dangerously Overloaded Term

Roko

15 Dec 2023 14:34 UTC

108 points

100 comments · 3 min read · LW link

[Valence series] 4. Valence & Social Status (deprecated)

Steven Byrnes

15 Dec 2023 14:24 UTC

35 points

19 comments · 11 min read · LW link

Contra Scott on Abolishing the FDA

Maxwell Tabarrok

15 Dec 2023 14:00 UTC

46 points

3 comments · 6 min read · LW link

(maximumprogress.substack.com)

[Paper] Trajectories through semantic spaces in schizophrenia and the relationship to ripple bursts

bvbvbvbvbvbvbvbvbvbvbv

15 Dec 2023 13:37 UTC

3 points

0 comments · 1 min read · LW link

(www.pnas.org)

Takeaways from a Mechanistic Interpretability project on “Forbidden Facts”

Tony Wang, Miles Wang and kaivu

15 Dec 2023 11:05 UTC

33 points

8 comments · 10 min read · LW link

Refinement of Active Inference agency ontology

Roman Leventov

15 Dec 2023 9:31 UTC

16 points

0 comments · 5 min read · LW link

(arxiv.org)

EU policymakers reach an agreement on the AI Act

tlevin

15 Dec 2023 6:02 UTC

78 points

7 comments · 7 min read · LW link

Where Does Adversarial Pressure Come From?

quetzal_rainbow

14 Dec 2023 22:31 UTC

16 points

1 comment · 2 min read · LW link

Epoch wise critical periods, and singular learning theory

Garrett Baker

14 Dec 2023 20:55 UTC

9 points

1 comment · 5 min read · LW link

OpenAI Superalignment: Weak-to-strong generalization

Dalmert

14 Dec 2023 19:47 UTC

25 points

3 comments · 1 min read · LW link

(openai.com)

Applications for EA Global are still open!

Eli_Nathan

14 Dec 2023 19:10 UTC

1 point

0 comments · 1 min read · LW link

Personal Development System: Winning Repeatedly and Growing Effectively With The BIG4

Paul Rohde

14 Dec 2023 18:49 UTC

13 points

0 comments · 33 min read · LW link

(blog.paul-rohde.com)

Introducing The ‘From Big Ideas To Real-World Results’: A Series for Effective Personal Development

Paul Rohde

14 Dec 2023 18:49 UTC

13 points

1 comment · 8 min read · LW link

(blog.paul-rohde.com)

Talking With People Who Speak to Congressional Staffers about AI risk

Eneasz

14 Dec 2023 17:55 UTC

32 points

0 comments · 1 min read · LW link

(www.thebayesianconspiracy.com)

Bayesian Injustice

Kevin Dorst

14 Dec 2023 15:44 UTC

124 points

10 comments · 6 min read · LW link

(kevindorst.substack.com)

AI #42: The Wrong Answer

Zvi

14 Dec 2023 14:50 UTC

67 points

6 comments · 54 min read · LW link

(thezvi.wordpress.com)

Some for-profit AI alignment org ideas

Eric Ho

14 Dec 2023 14:23 UTC

86 points

19 comments · 9 min read · LW link

Mapping the semantic void: Strange goings-on in GPT embedding spaces

mwatkins

14 Dec 2023 13:10 UTC

114 points

31 comments · 14 min read · LW link

Categorical Organization in Memory: ChatGPT Organizes the 665 Topic Tags from My New Savanna Blog

Bill Benzon

14 Dec 2023 13:02 UTC

0 points

6 comments · 2 min read · LW link

Moral Mountains

Adam Zerner

14 Dec 2023 10:40 UTC

8 points

10 comments · 2 min read · LW link

Update on Chinese IQ-related gene panels

Lao Mein

14 Dec 2023 10:12 UTC

70 points

7 comments · 1 min read · LW link