Third-party testing as a key ingredient of AI policy

Zac Hatfield-Dodds

25 Mar 2024 22:40 UTC

11 points

1 comment, 12 min read, LW link

(www.anthropic.com)

Idea: Safe Fallback Regulations for Widely Deployed AI Systems

Aaron_Scher

25 Mar 2024 21:27 UTC

4 points

0 comments, 6 min read, LW link

Announcing Neuronpedia: Platform for accelerating research into Sparse Autoencoders

Johnny Lin and Joseph Bloom

25 Mar 2024 21:17 UTC

92 points

7 comments, 7 min read, LW link

Testing ChatGPT for cell type recognition

Metacelsus

25 Mar 2024 19:59 UTC

7 points

2 comments, 3 min read, LW link

(denovo.substack.com)

Should rationalists be spiritual / Spirituality as overcoming delusion

Kaj_Sotala and romeostevensit

25 Mar 2024 16:48 UTC

49 points

57 comments, 29 min read, LW link

Photo Curation Approach

jefftk

25 Mar 2024 15:10 UTC

9 points

3 comments, 2 min read, LW link

(www.jefftk.com)

On attunement

Joe Carlsmith

25 Mar 2024 12:47 UTC

98 points

8 comments, 22 min read, LW link

On Lex Fridman’s Second Podcast with Altman

Zvi

25 Mar 2024 12:20 UTC

51 points

10 comments, 10 min read, LW link

(thezvi.wordpress.com)

On the Confusion between Inner and Outer Misalignment

Chris_Leong

25 Mar 2024 11:59 UTC

17 points

10 comments, 1 min read, LW link

A Bit For You

Ronak_Mehta

24 Mar 2024 22:18 UTC

0 points

0 comments, 2 min read, LW link

(ronakrm.github.io)

All About Concave and Convex Agents

mako yass

24 Mar 2024 21:37 UTC

63 points

23 comments, 8 min read, LW link

Do not delete your misaligned AGI.

mako yass

24 Mar 2024 21:37 UTC

62 points

13 comments, 3 min read, LW link

[Question] Define “Agent” (Embedded)

Apollonia

24 Mar 2024 20:14 UTC

10 points

1 comment, 1 min read, LW link

[Question] Could LLMs Help Generate New Concepts in Human Language?

Pekka Lampelto

24 Mar 2024 20:13 UTC

10 points

4 comments, 2 min read, LW link

Wittgenstein and the Private Language Argument

TMFOW

24 Mar 2024 20:06 UTC

4 points

0 comments, 14 min read, LW link

(tmfow.substack.com)

Self-Play By Analogy

Amica Terra

24 Mar 2024 20:06 UTC

−2 points

2 comments, 7 min read, LW link

Can quantised autoencoders find and interpret circuits in language models?

charlieoneill

24 Mar 2024 20:05 UTC

28 points

4 comments, 24 min read, LW link

Mandolin Harp Sensor Placement

jefftk

24 Mar 2024 18:40 UTC

11 points

0 comments, 1 min read, LW link

(www.jefftk.com)

AI Alignment and the Classical Humanist Tradition

PeteJ

24 Mar 2024 13:37 UTC

−1 points

4 comments, 2 min read, LW link

UNGA Resolution on AI: 5 Key Takeaways Looking to Future Policy

Heramb

24 Mar 2024 12:23 UTC

3 points

0 comments, 3 min read, LW link

(forum.effectivealtruism.org)

[Question] Are (Motor)sports like F1 a good thing to calibrate estimates against?

CstineSublime

24 Mar 2024 9:07 UTC

4 points

2 comments, 1 min read, LW link

Nuclear Quantum Immortality Hacking

Nezek

23 Mar 2024 22:08 UTC

−3 points

2 comments, 2 min read, LW link

As Many Ideas

Screwtape

23 Mar 2024 18:55 UTC

7 points

0 comments, 1 min read, LW link

My Detailed Notes & Commentary from Secular Solstice

Jeffrey Heninger

23 Mar 2024 18:48 UTC

35 points

16 comments, 13 min read, LW link

General Thoughts on Secular Solstice

Jeffrey Heninger

23 Mar 2024 18:48 UTC

100 points

60 comments, 8 min read, LW link

How to make food/water testing cheaper/more scalable? [eg for purity/toxin testing]

Alex K. Chen (parrot)

23 Mar 2024 5:28 UTC

9 points

2 comments, 1 min read, LW link

Prototyping Pluck Sensors

jefftk

23 Mar 2024 1:30 UTC

9 points

0 comments, 2 min read, LW link

(www.jefftk.com)

Dangers of Closed-Loop AI

Gordon Seidoh Worley

22 Mar 2024 23:52 UTC

35 points

9 comments, 2 min read, LW link

Why The Insects Scream

omnizoid

22 Mar 2024 19:47 UTC

4 points

11 comments, 9 min read, LW link

What does “autodidact” mean?

bhauth

22 Mar 2024 18:37 UTC

22 points

19 comments, 1 min read, LW link

[Linkpost] Vague Verbiage in Forecasting

trevor

22 Mar 2024 18:05 UTC

11 points

9 comments, 3 min read, LW link

(goodjudgment.com)

Wolf and Rabbit

Richard Henage

22 Mar 2024 17:20 UTC

14 points

4 comments, 1 min read, LW link

AI Model Registries: A Regulatory Review

Deric Cheng and Elliot Mckernon

22 Mar 2024 16:04 UTC

9 points

0 comments, 6 min read, LW link

Video and transcript of presentation on Scheming AIs

Joe Carlsmith

22 Mar 2024 15:52 UTC

32 points

1 comment, 32 min read, LW link

Benchmarking LLM Agents on Kaggle Competitions

aogara

22 Mar 2024 13:09 UTC

15 points

4 comments, 5 min read, LW link

American Acceleration vs Development

Maxwell Tabarrok

22 Mar 2024 13:01 UTC

1 point

0 comments, 4 min read, LW link

(www.maximum-progress.com)

Transformative AI and Scenario Planning for AI X-risk

Elliot Mckernon and Justin Bullock

22 Mar 2024 9:38 UTC

15 points

0 comments, 8 min read, LW link

The Pyromaniacs

Ted Sanders

22 Mar 2024 6:55 UTC

−3 points

1 comment, 2 min read, LW link

Vernor Vinge, who coined the term “Technological Singularity”, dies at 79

Kaj_Sotala

21 Mar 2024 22:14 UTC

149 points

25 comments, 1 min read, LW link

(arstechnica.com)

ChatGPT can learn indirect control

Raymond D

21 Mar 2024 21:11 UTC

213 points

27 comments, 1 min read, LW link

“Deep Learning” Is Function Approximation

Zack_M_Davis

21 Mar 2024 17:50 UTC

98 points

28 comments, 10 min read, LW link

(zackmdavis.net)

A Teacher vs. Everyone Else

ronak69

21 Mar 2024 17:45 UTC

41 points

8 comments, 2 min read, LW link

Static vs Dynamic Alignment

Gracie Green

21 Mar 2024 17:44 UTC

5 points

0 comments, 29 min read, LW link

On green

Joe Carlsmith

21 Mar 2024 17:38 UTC

266 points

35 comments, 31 min read, LW link

Comparing Alignment to other AGI interventions: Extensions and analysis

Martín Soto

21 Mar 2024 17:30 UTC

7 points

0 comments, 4 min read, LW link

The Comcast Problem

RamblinDash

21 Mar 2024 16:46 UTC

1 point

15 comments, 1 min read, LW link

Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation

Benjamin Sturgeon

21 Mar 2024 12:32 UTC

50 points

8 comments, 19 min read, LW link

AI #56: Blackwell That Ends Well

Zvi

21 Mar 2024 12:10 UTC

34 points

16 comments, 68 min read, LW link

(thezvi.wordpress.com)

An Affordable CO2 Monitor

Pretentious Penguin

21 Mar 2024 3:06 UTC

28 points

1 comment, 1 min read, LW link

DeepMind: Evaluating Frontier Models for Dangerous Capabilities

Zach Stein-Perlman

21 Mar 2024 3:00 UTC

61 points

8 comments, 1 min read, LW link

(arxiv.org)