Celiefs

TheLemmaLlama

16 Mar 2024 23:56 UTC

3 points

8 comments, 1 min read, LW link

My PhD thesis: Algorithmic Bayesian Epistemology

Eric Neyman

16 Mar 2024 22:56 UTC

259 points

14 comments, 7 min read, LW link

(arxiv.org)

How people stopped dying from diarrhea so much (& other life-saving decisions)

Writer

16 Mar 2024 16:00 UTC

45 points

0 comments, 1 min read, LW link

(youtu.be)

Transformative trustbuilding via advancements in decentralized lie detection

trevor

16 Mar 2024 5:56 UTC

17 points

7 comments, 38 min read, LW link

(www.ncbi.nlm.nih.gov)

Enter the WorldsEnd

Akram Choudhary

16 Mar 2024 1:34 UTC

−25 points

8 comments, 1 min read, LW link

Strong-Misalignment: Does Yudkowsky (or Christiano, or TurnTrout, or Wolfram, or…etc.) Have an Elevator Speech I’m Missing?

Benjamin Bourlier

15 Mar 2024 23:17 UTC

−4 points

3 comments, 16 min read, LW link

Introducing METR’s Autonomy Evaluation Resources

Megan Kinniment and Beth Barnes

15 Mar 2024 23:16 UTC

90 points

0 comments, 1 min read, LW link

(metr.github.io)

Are AIs conscious? It might depend

Logan Zoellner

15 Mar 2024 23:09 UTC

6 points

6 comments, 3 min read, LW link

Beyond Maxipok — good reflective governance as a target for action

owencb

15 Mar 2024 22:22 UTC

20 points

0 comments, 1 min read, LW link

Middle Child Phenomenon

PhilosophicalSoul

15 Mar 2024 20:47 UTC

3 points

3 comments, 2 min read, LW link

Capability or Alignment? Respect the LLM Base Model’s Capability During Alignment

Jingfeng Yang

15 Mar 2024 17:56 UTC

7 points

0 comments, 24 min read, LW link

Rational Animations offers animation production and writing services!

Writer

15 Mar 2024 17:26 UTC

33 points

0 comments, 1 min read, LW link

Improving SAE’s by Sqrt()-ing L1 & Removing Lowest Activating Features

Logan Riggs and Jannik Brinkmann

15 Mar 2024 16:30 UTC

26 points

5 comments, 4 min read, LW link

Stuttgart, Germany—ACX Spring Meetups Everywhere 2024

Benjamin R

15 Mar 2024 14:59 UTC

2 points

1 comment, 1 min read, LW link

Controlling AGI Risk

TeaSea

15 Mar 2024 4:56 UTC

6 points

8 comments, 4 min read, LW link

Ulm, Germany—ACX Spring Meetups Everywhere 2024

Benjamin R

15 Mar 2024 1:32 UTC

2 points

1 comment, 1 min read, LW link

Newport News/ Virginia ACX Meetup

Daniel

14 Mar 2024 23:46 UTC

1 point

0 comments, 1 min read, LW link

Constructive Cauchy sequences vs. Dedekind cuts

jessicata

14 Mar 2024 23:04 UTC

47 points

23 comments, 4 min read, LW link

(unstableontology.com)

A Nail in the Coffin of Exceptionalism

Yeshua God

14 Mar 2024 22:41 UTC

−17 points

0 comments, 3 min read, LW link

Toward a Broader Conception of Adverse Selection

Ricki Heicklen

14 Mar 2024 22:40 UTC

177 points

61 comments, 13 min read, LW link

(bayesshammai.substack.com)

More people getting into AI safety should do a PhD

AdamGleave

14 Mar 2024 22:14 UTC

60 points

24 comments, 12 min read, LW link

(gleave.me)

Collection (Part 6 of “The Sense Of Physical Necessity”)

LoganStrohl

14 Mar 2024 21:37 UTC

28 points

0 comments, 8 min read, LW link

Fixed point or oscillate or noise

lemonhope

14 Mar 2024 18:37 UTC

3 points

10 comments, 1 min read, LW link

How useful is “AI Control” as a framing on AI X-Risk?

habryka and ryan_greenblatt

14 Mar 2024 18:06 UTC

70 points

4 comments, 34 min read, LW link

Sparse autoencoders find composed features in small toy models

Evan Anders, Clement Neo, Jason Hoelscher-Obermaier and Jessica N. Howard

14 Mar 2024 18:00 UTC

33 points

12 comments, 15 min read, LW link

AI #55: Keep Clauding Along

Zvi

14 Mar 2024 15:40 UTC

62 points

16 comments, 70 min read, LW link

(thezvi.wordpress.com)

To the average human, controlled AI is just as lethal as ‘misaligned’ AI

YonatanK

14 Mar 2024 14:52 UTC

6 points

20 comments, 5 min read, LW link

Claude vs GPT

Maxwell Tabarrok

14 Mar 2024 12:41 UTC

12 points

2 comments, 2 min read, LW link

(www.maximum-progress.com)

A brief review of China’s AI industry and regulations

Elliot Mckernon

14 Mar 2024 12:19 UTC

24 points

0 comments, 16 min read, LW link

[Question] Can any LLM be represented as an Equation?

Valentin Baltadzhiev

14 Mar 2024 9:51 UTC

1 point

2 comments, 1 min read, LW link

‘Empiricism!’ as Anti-Epistemology

Eliezer Yudkowsky

14 Mar 2024 2:02 UTC

171 points

90 comments, 25 min read, LW link

How I turned doing therapy into object-level AI safety research

Chipmonk

14 Mar 2024 1:54 UTC

15 points

5 comments, 4 min read, LW link

Opportunistic Time-Management

Richard Henage

13 Mar 2024 21:38 UTC

13 points

2 comments, 1 min read, LW link

AI governance and strategy: a list of research agendas and work that could be done.

NathanBarnard and Erin Robertson

13 Mar 2024 21:23 UTC

7 points

0 comments, 17 min read, LW link

Highlights from Lex Fridman’s interview of Yann LeCun

Joel Burget

13 Mar 2024 20:58 UTC

48 points

15 comments, 41 min read, LW link

On the Latest TikTok Bill

Zvi

13 Mar 2024 18:50 UTC

58 points

7 comments, 29 min read, LW link

(thezvi.wordpress.com)

[Question] Recommended book for a balanced take and lessons learned from covid pandemic response

Martin Hare Robertson

13 Mar 2024 18:14 UTC

4 points

0 comments, 1 min read, LW link

ACX/LW Seattle spring meetup 2024

Nikita Sokolsky

13 Mar 2024 17:24 UTC

12 points

3 comments, 1 min read, LW link

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems

Sonia Joseph and Neel Nanda

13 Mar 2024 17:09 UTC

44 points

13 comments, 14 min read, LW link

I was raised by devout Mormons, AMA [&|] Soliciting Advice

ErioirE

13 Mar 2024 16:52 UTC

31 points

41 comments, 2 min read, LW link

Relational Agency: Consistently Reaching Out

Jonathan Moregård

13 Mar 2024 14:34 UTC

16 points

0 comments, 5 min read, LW link

(open.substack.com)

[Question] What could a policy banning AGI look like?

TsviBT

13 Mar 2024 14:19 UTC

76 points

23 comments, 3 min read, LW link

Clickbait Soapboxing

DaystarEld

13 Mar 2024 14:09 UTC

24 points

15 comments, 3 min read, LW link

(daystareld.com)

Virtual AI Safety Unconference 2024

Orpheus, Linda Linsefors, Joe Rogero, Arjun Yadav and Manuela García

13 Mar 2024 13:54 UTC

14 points

0 comments, 1 min read, LW link

Jobs, Relationships, and Other Cults

Ruby and Elizabeth

13 Mar 2024 5:58 UTC

40 points

9 comments, 35 min read, LW link

How do you improve the quality of your drinking water?

Alex K. Chen (parrot)

13 Mar 2024 0:37 UTC

11 points

2 comments, 1 min read, LW link

The Parable Of The Fallen Pendulum—Part 2

johnswentworth

12 Mar 2024 21:41 UTC

77 points

8 comments, 4 min read, LW link

Open consultancy: Letting untrusted AIs choose what answer to argue for

Fabien Roger

12 Mar 2024 20:38 UTC

35 points

5 comments, 5 min read, LW link

[Question] Is anyone working on formally verified AI toolchains?

metachirality

12 Mar 2024 19:36 UTC

17 points

4 comments, 1 min read, LW link

Transformer Debugger

Henk Tillman

12 Mar 2024 19:08 UTC

25 points

0 comments, 1 min read, LW link

(github.com)